Defending Epistemic Integrity in the Age of AI Assessment
Why a grade must be more than just a plausible statistic. It must be a justified belief rooted in human understanding.
In the crowded landscape of educational technology, it is easy to be seduced by "smooth flow." Apps designed with slick interfaces, gamified rewards, and passive clicking mechanisms often feel engaging. They promise learning without the friction.
This week, Dennis Sherwood posed an uncomfortable question in a powerful SRHE blog post: when we become witnesses to systemic problems in assessment, when 1.6 million out of 6.5 million GCSE and A-level grades differ from what a senior examiner would have awarded, what do we do?
Across hiring and education, AI is already deciding who gets seen, shortlisted, or passed over. Tools built on large language models can read thousands of CVs, mark essays, and sort candidates faster and cheaper than any human team. The efficiency story is compelling. The safety story is not.
Today, December 11, 2025, OpenAI launched GPT-5.2. If you've been watching the space, you might be wondering: is this it? Have we reached the point where foundation models can truly replace human judgment? The answer is still no. And understanding why is critical for anyone building assessment, hiring, or decision systems that matter.
The regulator has chosen a slow path for digital assessment. The economy can’t afford to wait. Here is why the future of skills depends on Transformation, not just Substitution.
The latest NFER report confirms that the skills of the future - Creativity, Collaboration, and Problem Solving - are the hardest to measure. Here is how we solve that.
We don't import your student rosters. We don't ingest names, IDs, or demographic data linked to assessment work. We use email addresses only where operationally necessary. This isn't a constraint we're managing around. It's a deliberate design choice. Here's why.
In a landscape where generative AI has eroded confidence in traditional written assessment signals, education leaders face an uncomfortable truth: the rubrics they've carefully crafted may no longer be fit for purpose on their own. Yet abandoning rubrics entirely isn't the answer.