📳 Introducing the new RM Compare Companion App
The RM Compare Companion App transforms professional judgment into a mobile-first experience. Quickly grab and add Items to your RM Compare sessions.
In a previous post, we explored the "Three Mirrors" of assessment - the Left, Right, and Centre views that together provide a complete picture of learner performance. Today, we want to look deeper into the glass. Specifically, we want to discuss why the "Left Mirror" (Holistic Assessment) works so differently from the "Right Mirror" (Absolute Assessment), and why the future of high-stakes evaluation is becoming more nondeterministic.
Welcome to a new series on the RM Compare blog. I’m Mark House. I was a PE teacher for over 25 years, which means I’ve seen every assessment fad, every "innovative" pedagogical shift, and more forgotten gym kits than I care to remember. I’ve decided to sit down with Declan Lynch - the man who invented Adaptive Comparative Judgement (ACJ) - to see if his high-tech world of algorithms and "professional judgment" can stand up to 25 years of common sense and a loud whistle.
Why a grade must be more than just a plausible statistic: it must be a justified belief rooted in human understanding.
Across hiring and education, AI is already deciding who gets seen, shortlisted, or passed over. Tools built on large language models can read thousands of CVs, mark essays, and sort candidates faster and cheaper than any human team. The efficiency story is compelling. The safety story is not.
Today, December 11, 2025, OpenAI launched GPT-5.2. If you've been watching the space, you might be wondering: is this it? Have we reached the point where foundation models can truly replace human judgment? The answer is still no. And understanding why is critical for anyone building assessment, hiring, or decision systems that matter.
The regulator has chosen a slow path for digital assessment. The economy can’t afford to wait. Here is why the future of skills depends on Transformation, not just Substitution.
The latest NFER report confirms that the skills of the future - Creativity, Collaboration, and Problem Solving - are the hardest to measure. Here is how we solve that.
We don't import your student rosters. We don't ingest names, IDs, or demographic data linked to assessment work. We use email addresses only where operationally necessary. This isn't a constraint we're managing around. It's a deliberate design choice. Here's why.
In a landscape where generative AI has eroded confidence in traditional written assessment signals, education leaders face an uncomfortable truth: the rubrics they've carefully crafted may no longer be fit for purpose on their own. Yet abandoning rubrics entirely isn't the answer.
We are currently living through an "Assessment Arms Race." On one side, students and job candidates are using Generative AI (like ChatGPT) to produce "perfect" essays and CVs in seconds. On the other side, institutions are rushing to buy AI marking tools to grade that work just as fast. It is a closed loop of machines grading machines. And in the middle of this loop, the human element - the actual understanding of quality - is quietly disappearing.
We assume that a blurry photo hides the quality of the work. We assume that if a drawing is photographed under yellow classroom lights, the examiner will think the drawing itself is yellow. We assume that Image Quality = Assessment Quality. But science suggests we are worrying about the wrong thing.
Look at the header image above. What do you see?