Opinion
Where's the beef? Learning, Performance, and the Slow AI Revolution

If you’re old enough to remember the classic Wendy’s commercial, you’ll recall the irreverent demand: “Where’s the beef?” In education technology today, with all the sizzle around AI-powered assessment and automation, the same question applies. Are we seeing deeper learning—or just more output with less substance?
A recent Social Market Foundation paper by Tom Richmond (5th Sept 2025) makes the case that schools, policymakers, and edtech providers urgently need to rethink how AI is used in education. The government’s latest guidance talks up generative AI’s potential to reduce teacher admin workloads and enhance student support, but it pays far less attention to a crucial distinction: learning versus performance.
Learning vs. Performance: The Real Stakes
Performance is tempting to measure. It’s visible, quick, and easy to record—number of essays completed, lesson plans generated, answers marked. But as Richmond’s paper points out, performance is not the same as learning. True learning means information is stored in long-term memory through sustained, effortful thinking. Output, however, is just what’s produced; it’s fleeting, surface-level, and can be inflated by shortcuts like cognitive offloading—letting tools do the hard thinking for us.
The research cited is troubling. Students who rely on GenAI tools like ChatGPT show improved short-term performance, but deeper analysis reveals a downside: less neural activity, weaker memory recall, and poorer reasoning skills once those tools are taken away. In essence, they ace the test as long as the AI ‘crutch’ is available, but stumble when genuinely independent thinking is required.
Worse, the allure of measuring outputs leads educators and technologists to prioritise what’s easiest to count, rather than what matters most. This risks encouraging a generation of learners who perform—but don’t deeply know.
'Slow AI': Thinking Hard and Embracing Difficulty
How can we escape the trap? Rory Sutherland, a leading voice in behavioural economics, poses the question “Are we too impatient to be intelligent?” and advocates for ‘Slow AI’: tools that encourage reflection, iteration, and the friction that makes for real thinking. Instead of instant answers, Slow AI fosters “desirable difficulty”, the pedagogical notion that struggling with new material creates deeper, more lasting learning.
The best learning happens when we “think hard”, when cognitive load is managed—not eliminated. This aligns with decades of educational theory: “Memory is the residue of thought,” as Dan Willingham famously put it. Over-reliance on AI shortcuts can hollow out the very processes that build critical thinking, creativity, and deep expertise.
RM Compare: Balancing Output with Outcome
At RM Compare, we’ve taken these issues to heart. Our Adaptive Comparative Judgement platform is built with the understanding that assessment isn’t always about rapid outputs—it’s about cultivating thoughtful engagement and promoting real learning outcomes.
How? By shifting the focus from replacing judgement with automation towards augmenting and supporting the deep, reflective process of comparative evaluation. Judges are brought into active dialogue with the evidence—reviewing and comparing real student responses, discussing reasoning, and iteratively refining judgements. This isn’t just a technical efficiency; it’s a pedagogical ethos: creating space for slow thinking, metacognition, and intellectual rigour.
Unlike standard outputs-based AI, RM Compare acts as a facilitator rather than a shortcut, nudging participants to make meaningful distinctions and reflect on their choices. Feedback is shaped collaboratively, supporting schema-building and deep understanding—not metacognitive laziness.
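For readers curious about the mechanics, here is a minimal, purely illustrative sketch of how comparative judgement approaches in general can turn many pairwise “this one is better” decisions into a rank order, using a simple Bradley-Terry style model. This is a sketch of the general technique, not RM Compare’s actual implementation; the item names and the bradley_terry helper are hypothetical.

```python
# Purely illustrative sketch: comparative judgement approaches commonly fit a
# Bradley-Terry style model to pairwise decisions to produce a rank order.
# This is NOT RM Compare's implementation; names and data are hypothetical.
from collections import defaultdict

def bradley_terry(items, judgements, iterations=100):
    """Estimate a relative quality score per item from pairwise judgements.

    judgements: list of (winner, loser) tuples, one per comparison.
    Returns a dict mapping item -> score (higher means judged better overall).
    """
    wins = defaultdict(int)           # total wins per item
    pair_counts = defaultdict(int)    # comparisons per unordered pair
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    scores = {item: 1.0 for item in items}
    for _ in range(iterations):
        new_scores = {}
        for i in items:
            # Classic iterative (MM-style) update for the Bradley-Terry model
            denom = sum(
                count / (scores[i] + scores[j])
                for pair, count in pair_counts.items()
                if i in pair
                for j in pair if j != i
            )
            new_scores[i] = wins[i] / denom if denom > 0 else scores[i]
        total = sum(new_scores.values())
        scores = {item: s / total for item, s in new_scores.items()}
    return scores

# Example: three pieces of student work, five pairwise judgements
judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("C", "B")]
print(bradley_terry(["A", "B", "C"], judgements))
```

The point of the sketch is simply that the technology’s job is to aggregate and organise human judgements into a reliable rank order; the hard, reflective work of comparing one response against another stays with the judges.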
Where Do We Go from Here?
The 'beef' in educational assessment comes from recognising that not all that glitters is gold. Fast, automated outputs are appealing, but when they skip the hard work of cognitive engagement, learning suffers. The evidence is clear: assessment that values thoughtful judging and slow deliberation leads to richer, more resilient outcomes. That’s why RM Compare integrates both human and technical strengths to promote deep learning while keeping performance honest and meaningful.
At RM Assessment we think deeply about issues around AI and assessment, working with some of the world's biggest and most influential assessment providers. On October 8th we will be hosting our AI in Action Summit in London; registration is currently open.
In a world of ever-faster AI, maybe it’s time to go slow. To ask, critically, “Where’s the beef?” And to build platforms and policies that put true learning back on the table.