ACJ as Agnostic Assessment Infrastructure - The Architecture of Curriculum Alignment (Part 3)
In the first two posts of this series, we set out the terrain. Kelly’s three curriculum models showed that different kinds of curriculum demand different kinds of assessment. The Camau i’r Dyfodol project then helped us see Curriculum for Wales (CfW) as a process curriculum – and showed what happens when teachers actually work with that model in practice.
Now we turn to Adaptive Comparative Judgement (ACJ) and RM Compare.
The core idea of this post is simple but easy to overlook: ACJ is an engine, not an ideology. It doesn’t come with “product” or “process” baked in. It will faithfully amplify whichever conception of quality and progression you plug into it. That makes it powerful – and dangerous – in systems where curriculum and assessment are not yet aligned.
ACJ in one sentence
ACJ replaces the attempt to score work directly with the question: “Which of these two pieces is better, and in what sense?” Repeating that simple comparative judgement many times, across many judges, allows us to build a stable scale of work quality without pretending we can pin everything down to a perfect mark scheme.
That basic mechanism can be used in very different ways:
- to create precise, product‑style rank orders and cut‑scores, or
- to surface and explore a rich space of diverse, process‑aligned responses, or
- to support more content‑focused comparisons of how well key ideas are understood.
The difference lies not in the algorithm, but in the assessment logic wrapped around it.
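To make that claim less abstract, here is a minimal sketch of the kind of pairwise-comparison model that typically sits underneath ACJ – a simple Bradley‑Terry estimator fitted to judge decisions. This is an illustration only, not RM Compare's actual implementation; the function names and data are invented.

```python
import math
from collections import defaultdict

def fit_bradley_terry(judgements, n_iters=200, lr=0.05):
    """Fit a simple Bradley-Terry scale from pairwise judgements.

    judgements: list of (winner_id, loser_id) tuples, one per comparison.
    Returns a dict mapping each item id to an estimated quality score
    on a logit scale; higher means judged better more often.
    """
    theta = defaultdict(float)
    for winner, loser in judgements:
        theta[winner], theta[loser]  # register every item at quality 0

    for _ in range(n_iters):
        grad = defaultdict(float)
        for winner, loser in judgements:
            # Probability the winner beats the loser under the current scale.
            p = 1.0 / (1.0 + math.exp(theta[loser] - theta[winner]))
            # Gradient ascent on the log-likelihood of the observed outcome.
            grad[winner] += 1.0 - p
            grad[loser] -= 1.0 - p
        for item, g in grad.items():
            theta[item] += lr * g
        # Centre the scale so scores are comparable across runs.
        mean = sum(theta.values()) / len(theta)
        for item in theta:
            theta[item] -= mean
    return dict(theta)

# Six comparisons of three portfolio pieces, pooled across judges.
judgements = [("A", "B"), ("A", "C"), ("B", "C"),
              ("A", "B"), ("C", "B"), ("A", "C")]
for item, score in sorted(fit_bradley_terry(judgements).items(),
                          key=lambda kv: -kv[1]):
    print(f"{item}: {score:+.2f}")
```

Notice what the model does and does not contain: it estimates relative quality from judge decisions, but it says nothing about what the tasks were, what "better" meant to the judges, or what happens to the scores afterwards. Those choices all live outside the algorithm.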
How ACJ behaves under different curriculum models
To make this concrete, it helps to revisit Kelly’s three models and ask: “If this is the curriculum, what would ‘aligned’ use of ACJ look like?”
1. In a content model: comparing understanding of key knowledge
In a content‑oriented curriculum, the core question is whether learners have encountered and internalised a specified body of knowledge. ACJ can still add value here, but the focus is on how well that content is understood and communicated, not on open‑ended creativity.
For example:
- Learners might write brief explanations of a historical event, a scientific concept, or a mathematical idea.
- Judges compare pairs of responses and ask: “Which demonstrates a more accurate, complete and coherent grasp of the required content?”
- Over many judgements, a scale emerges that distinguishes stronger and weaker understandings of the same material.
Here ACJ is helping to refine judgements of content mastery. The curriculum model (content) and the assessment use of ACJ sit in the same row of the table below: we are still primarily interested in accurate transmission, but we are using human judgement to capture subtleties that raw right/wrong scoring might miss.
2. In a product model: sharpening high‑stakes decisions
In a product curriculum, we start from clear objectives and standards, and we often need reliable, comparable results for grading and accountability. ACJ can be configured to behave in a very product‑like way:
- Tasks are tightly specified to elicit evidence of the defined outcomes.
- Judging is guided by clear criteria or standard descriptions of performance levels.
- The primary aim is to produce a defensible rank order or scale that supports decisions about grades, thresholds and targets.
Imagine a national writing assessment:
- Thousands of scripts are uploaded.
- Judges compare pairs, guided by a rubric that defines what counts as better in terms of the stated outcomes.
- The resulting scale is used to set grade boundaries, report performance, and potentially feed into performance tables.
Here ACJ is acting as a more valid and reliable scoring engine inside a classic product architecture. It makes the existing wall smoother, not thinner: the underlying logic remains one of fixed standards, deterministic outcomes and high‑stakes consequences.
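As a purely illustrative sketch, a product‑style configuration might convert the resulting scale into grades by applying agreed cut‑scores. The boundary values, grade labels and scores below are invented; in practice they would come from an awarding process.

```python
def assign_grades(scale, boundaries):
    """Map scale scores (e.g. from fit_bradley_terry above) to grades
    using fixed cut-scores.

    boundaries: list of (minimum_score, grade) pairs, highest first.
    Items below every boundary receive "U".
    """
    grades = {}
    for item, score in scale.items():
        for cut, grade in boundaries:
            if score >= cut:
                grades[item] = grade
                break
        else:
            grades[item] = "U"
    return grades

# Hypothetical cut-scores agreed at an awarding meeting.
boundaries = [(1.0, "A"), (0.0, "B"), (-1.0, "C")]
print(assign_grades({"A": 1.3, "B": 0.4, "C": -1.5}, boundaries))
# -> {'A': 'A', 'B': 'B', 'C': 'U'}
```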
3. In a process model: making progression visible in rich work
In a process curriculum like CfW, the priorities shift. The curriculum is not just a list of things pupils should be able to do; it is a framework for developing capabilities over time through rich experiences. Assessment needs to illuminate progression, not just certify endpoints.
In this context, ACJ can be used very differently:
- Tasks are open enough to allow diverse, authentic responses – projects, performances, multimodal artefacts.
- Judging is guided by process‑based descriptors (for example, principles of progression or locally agreed indicators of “more developed” reasoning, creativity, collaboration, ethical awareness, and so on).
- The primary aim is to understand the space of responses and how they cluster, rather than to drive everything to a single cut‑score.
A typical process‑aligned use might look like this:
- A group of schools designs a rich interdisciplinary project aligned with CfW.
- Pupils produce varied artefacts: writing, media, presentations, designs.
- Teachers use ACJ to compare pieces, asking questions like, “Which of these shows more sophisticated reasoning / deeper understanding / more independence?”
- The resulting scale is then used to identify exemplar clusters at different points in a progression, to write narrative descriptions of what “earlier” and “later” looks like, and to support professional dialogue about next steps.
In this mode, the “output” is not just a rank order. It is a shared, evolving picture of progression in complex learning – grounded in real work, refined through professional judgement, and aligned with the curriculum’s process model.
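One way to operationalise that clustering step – sketched below with invented names and data – is to band the scale scores and pick the piece nearest the centre of each band as an exemplar. This is a simple one‑dimensional k‑means illustration, not a claim about how RM Compare identifies exemplars.

```python
import statistics

def exemplar_clusters(scale, k=3, n_iters=50):
    """Band items into k clusters along the scale (1-D k-means) and
    pick the piece nearest each cluster centre as its exemplar.

    scale: dict of item id -> scale score.
    Returns (exemplar_id, [member ids]) pairs from lowest to highest band.
    """
    items = sorted(scale.items(), key=lambda kv: kv[1])
    scores = [s for _, s in items]
    lo, hi = scores[0], scores[-1]
    # Spread the initial centroids evenly across the score range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    for _ in range(n_iters):
        bands = [[] for _ in range(k)]
        for item, score in items:
            nearest = min(range(k), key=lambda c: abs(score - centroids[c]))
            bands[nearest].append((item, score))
        centroids = [statistics.mean(s for _, s in band) if band
                     else centroids[i] for i, band in enumerate(bands)]

    result = []
    for centroid, band in zip(centroids, bands):
        if band:
            exemplar = min(band, key=lambda kv: abs(kv[1] - centroid))[0]
            result.append((exemplar, [item for item, _ in band]))
    return result

# Five pieces spread along an (invented) progression scale.
scale = {"P1": -1.2, "P2": -0.9, "P3": 0.1, "P4": 0.3, "P5": 1.4}
for exemplar, members in exemplar_clusters(scale):
    print(exemplar, members)
```

The point of the sketch is that nothing here produces a grade: the output is a set of anchor pieces around which teachers can write developmental descriptions and hold professional dialogue.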
ACJ as an agnostic engine
Putting this together, the same core technology can live in very different architectures:
| Curriculum model | If ACJ is used in an aligned way… | What ACJ mostly produces |
|---|---|---|
| Content | Compares how well core knowledge is understood and communicated. | Better discrimination within “knows/doesn’t know”, richer insight into misconceptions and depth. |
| Product | Ranks work against clear standards to support high‑stakes decisions. | Stable scales and cut‑scores for grading, progression and accountability. |
| Process | Surfaces and clusters diverse high‑quality responses to illuminate progression. | Exemplar sets, developmental descriptions, and shared professional understanding. |
The engine is agnostic; the effects are not. They depend entirely on:
- how tasks are designed;
- what criteria or descriptors guide judgement;
- what decisions are attached to the outcomes;
- and how all of that relates to the curriculum’s underlying model.
This is why it’s not enough to say “we’re using ACJ, therefore we’re progressive” (or reliable, or fair). ACJ can happily power a very traditional product system or a very forward‑looking process system. The critical design decisions lie outside the algorithm.
Why this matters for Curriculum for Wales
For CfW, the stakes are clear:
- If CfW is treated as a process curriculum, but ACJ is configured purely as a product‑style grading engine (e.g. primarily to generate high‑stakes ranks and scores), the system will experience assessment drag. Teachers will quickly learn that, whatever the curriculum says, they are being judged on numbers that behave like the old model.
- If, instead, ACJ is explicitly tuned to CfW’s principles of progression, used to build shared exemplars and developmental descriptions, and coupled with low‑stakes, formative and evaluative uses, it can become one of the key pieces of infrastructure that makes a process curriculum workable at scale.
The design question for Wales is therefore not “Should we use ACJ?” but:
What conception of quality and progression are we asking ACJ to embody, and how does that line up with the curriculum we say we have?
In the next post, we’ll flip the lens and look at what happens when curriculum and assessment logics are not aligned – drawing on Scotland’s experience with Curriculum for Excellence – and why any assessment method, ACJ included, will fail if it is asked to serve the wrong master.