Opinion

Robots, Roberts and The Dignity of Human Assessment

By Mark House

4th May 2022

I've been thinking a lot lately about human dignity: the belief that all people hold a special value tied solely to their humanity. Unsurprisingly, I am particularly interested in what this means for educational assessment.

Dignity in assessment

The issues of noise and bias are a constant when judgements and assessments are made. Great effort is made to reduce both so candidates can be 'fairly' marked and graded, for example through rounds of moderation and standardisation. Indeed, one of the compelling values of RM Compare is that by involving multiple judges and an adaptive algorithm we can achieve more efficient, reliable, valid and fair assessments.

Automation of assessment continues to accelerate in all industries and contexts. Algorithmic hiring, for example, is now well established in the jobs market, helping busy hiring managers to assess the match between resumes and job descriptions. However, being excluded because you don't meet the rules driving the algorithm, despite having the best qualities and quantities for the job, can feel anything but fair.

During the pandemic, several governments resorted to algorithms to predict the summer grades of millions of students who were unable to sit traditional summer exams. As expected, most students received accurate results. For a small number of students, however, the results were wildly inaccurate. Again, this is to be expected. What seemed to come as a bit of a surprise to government ministers, however, was the public outcry. This was so vociferous that most assessment policy subsequently moved away from an algorithmic approach and toward teacher assessment.

As we know, teacher assessment is by its very nature ridden with noise and bias. If unchecked, it also produces grade inflation. However, for all stakeholders this human approach to assessment, with all its frailties and challenges, was seen as 'fairer'. In this case the dignity of human assessment clearly outweighed the accuracy produced by the algorithm.

We see this trade-off in all assessment environments. There is a certain point, particularly where the stakes of any judgement are high, where we want a Robert rather than a Robot to make the call. Perhaps the most obvious situation where this is the case is the criminal justice system. As the seriousness of the crime increases, so the human element of any judgement increases. This is despite startling evidence of the outrageous unfairness this creates, with wildly different sentences being handed down in similar cases by different judges (Austin and Williams 1977, a survey of 47 judges).

What might this mean for RM Compare?

So, can Adaptive Comparative Judgement pass the 'Dignity' test? We know that users understand the underlying principle and its strength in reducing noise and bias. But we also know that there are concerns around judge anonymity in a 'wisdom of the crowd' approach. Right now we are still very much at the Discover Stage on this question and, as always, we rely on our users to help us clarify our thoughts.
