Human Expertise Remains Crucial in Educational Assessment: Insights from Recent AI Research

In the rapidly evolving landscape of educational technology, recent research has shed light on the limitations of AI in processing complex information. At RM Compare, we see these findings not as a setback, but as an opportunity to reinforce our commitment to human-centric assessment solutions.

Want to learn more? Listen to the Deep Dive discussion.

Recent Research Findings - Australian Securities and Investments Commission (ASIC)

A recent trial conducted by the Australian Securities and Investments Commission (ASIC) found that AI performed significantly worse than humans in summarizing complex documents. Here are the key findings:

  • Human-written summaries scored 81% on the assessment rubric, while AI-generated summaries scored only 47%.
  • AI summaries often missed nuance, context, and emphasis.
  • AI frequently included incorrect or irrelevant information and struggled with identifying key references.
  • Reviewers felt that AI summaries might create additional work due to the need for extensive fact-checking.

These findings underscore what we at RM Compare have long believed: human expertise remains crucial in the assessment process.

"ASIC has taken a range of learnings from the PoC, including: the value of robust experimentation; the need for collaboration between subject matter experts and data science specialists; the necessity of carefully designed prompt engineering; and given the rapidly evolving AI landscape, the importance of providing a safe environment that allows for rapid experimentation to ensure ASIC has a continued understanding of the various uses for AI, including its shortcomings."

Implications for Educational Assessment

The Power of Human Judgment

The ASIC study reinforces the unparalleled ability of humans to parse and critically analyze information.

Our Approach: RM Compare's Adaptive Comparative Judgement methodology puts this human insight at the center of assessment: assessors make repeated pairwise judgements about which of two pieces of work is better, and the system combines those decisions into a reliable rank order (see the sketch below).
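To make the idea concrete, here is a minimal, illustrative sketch of how pairwise human judgements can be combined into a rank order using a simple Bradley-Terry-style model, a standard statistical basis for comparative judgement. This is not RM Compare's implementation; the function name, sample data, and fitting procedure are assumptions chosen purely to show the general principle.

```python
from collections import defaultdict

# Illustrative only: estimate item "quality" from pairwise human judgements
# with a simple Bradley-Terry model, fitted by the classic MM iteration.
# Judges never assign scores; they only decide which of two pieces of work
# is better, and the model turns many such decisions into a rank order.

def fit_bradley_terry(judgements, n_iters=100):
    """judgements: list of (winner, loser) pairs from human judges."""
    wins = defaultdict(int)        # how often each item was preferred
    pair_count = defaultdict(int)  # how often each pair was compared
    items = set()
    for winner, loser in judgements:
        wins[winner] += 1
        pair_count[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    strength = {item: 1.0 for item in items}
    for _ in range(n_iters):
        updated = {}
        for i in items:
            # MM update: strength_i = wins_i / sum_j( n_ij / (strength_i + strength_j) )
            denom = sum(
                pair_count[frozenset((i, j))] / (strength[i] + strength[j])
                for j in items if j != i
            )
            updated[i] = wins[i] / denom if denom else strength[i]
        # Rescale so strengths stay on a stable scale between iterations.
        total = sum(updated.values())
        strength = {k: v * len(items) / total for k, v in updated.items()}
    return strength

# Example: three pieces of work and a handful of judge decisions.
judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("C", "B")]
for item, score in sorted(fit_bradley_terry(judgements).items(),
                          key=lambda kv: -kv[1]):
    print(item, round(score, 3))
```

The "adaptive" part of Adaptive Comparative Judgement refers to how pairs are chosen for each judge so that reliable rank orders emerge from fewer judgements; the point of the sketch is simply that the statistics sit on top of human decisions rather than replacing them.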

Enhancing, Not Replacing

While AI offers potential in many areas, the research highlights its current limitations in tasks requiring deep contextual understanding.

Our Approach: At RM Compare, we focus on developing tools that enhance human capabilities rather than attempting to replace them.

Addressing Ethical Concerns

The study raised important questions about AI accuracy and reliability, particularly AI's tendency to introduce incorrect or irrelevant information that must then be fact-checked.

Our Promise: We prioritize transparency in our use of technology, ensuring that educators and institutions can make informed decisions about assessment methodologies.

Looking Ahead: A Hybrid Future

The ASIC report concluded that "GenAI should be positioned as a tool to augment and not replace human tasks." This aligns perfectly with our vision at RM Compare.

Our Strategy: Moving forward, we will continue to develop solutions that:

  1. Emphasize the critical role of human judgment in assessment
  2. Provide robust support for educators through advanced analytics
  3. Adapt to the evolving needs of educational institutions and learners

Conclusion

At RM Compare, we see this research as validation of our human-centric approach. We remain committed to developing assessment tools that respect the complexity of human learning while leveraging technology to enhance efficiency and insight.

As we continue our journey to improve educational outcomes worldwide, we invite educators, institutions, and thought leaders to join us in exploring the optimal balance between human expertise and technological innovation in assessment.

Together, we can create a future of education that is more insightful, fair, and aligned with the nuanced realities of human learning and achievement.

Sources