Assessing the spoken word - What we know.

PISA 2025 – What's new.

A few weeks ago, PISA announced the introduction of a Foreign Language Assessment (FLA) in 2025. The target language for PISA FLA 2025 will be English, with a focus on three skills – reading, listening and speaking. For non-native speakers in K-12 education, English language learning is a complex process with a multitude of influencing factors. In populations where curriculum, pedagogy and assessments differ, an assessment that standardises the measurement of foreign language learning clearly fills a need.

What we know about assessing the spoken word.

Collaborating with Voice21, the UK's oracy education charity, over the last year has made one thing very evident: oracy is a very difficult skill to assess objectively and reliably. Evidence from several studies indicates that the spoken word involves a combination of variables with numerous influencing factors, suggesting that distinct strands of competency may need to be assessed separately to avoid, or at least reduce, the subjectivity of the assessment.

What evidence from academic research says.

Prof. Neil Mercer (University of Cambridge) and Dr James Mannion (Rethinking Education), in their research review, list several challenges in the assessment of oracy:

1. The fact that spoken language is ephemeral.

2. The restriction on the number of pupils that can be assessed at a time.

3. The context specificity of speech.

4. Teachers lacking the skills to assess oral language.

5. The ability of the speaker to modify their speaking strategies appropriately in accordance with the demands of different tasks and different audiences.

Dr Ayesha Ahmed (University of Cambridge) adds in her research findings that, of all the challenges involved in assessing oracy, the hardest to tackle is the subjective nature of the judgements: assigning marks or grades to performances is notoriously unreliable, as it is for other complex responses to open-ended tasks such as art, music, drama and essays. She argues that oracy is excluded from high-stakes assessments mainly because of this complexity, the subjectivity it entails, and the unreliability that stems from that subjectivity.

How can we address these challenges?

Recent revisions to high-stakes speaking assessments in the UK indicate that awarding bodies are willing to consider different ways of assessing speaking and recognise the negative impact of the current system on teaching and learning. They also acknowledge the need for more reliable methods of assessment. Comparative Judgement (CJ) offers an alternative to traditional marking, addressing the issue of subjectivity and helping to deliver judgements with greater reliability. CJ originates in Thurstone's (1927) Law of Comparative Judgement: rather than marking each performance against a rubric, judges repeatedly compare pairs of performances and decide which is better, and the outcomes of many such comparisons are combined into a rank order.
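How pairwise judgements become a scale is not spelled out above, so here is a minimal sketch using the Bradley–Terry model, a close relative of Thurstone's formulation. This is not a description of any particular assessment platform's internals, and the judgement data (performances "A" to "D" and who beat whom) is invented for illustration.

```python
# Hypothetical pairwise judgements: each tuple is (winner, loser),
# meaning a judge preferred the first performance over the second.
judgements = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("A", "B"), ("C", "B"), ("A", "C"),
    ("B", "D"), ("C", "D"), ("A", "D"),
]

def bradley_terry(judgements, iterations=100):
    """Estimate a quality score per item from pairwise wins using the
    simple minorisation-maximisation (MM) update for Bradley-Terry."""
    items = sorted({x for pair in judgements for x in pair})
    wins = {i: 0 for i in items}
    pair_counts = {}  # comparisons per unordered pair
    for winner, loser in judgements:
        wins[winner] += 1
        key = tuple(sorted((winner, loser)))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    score = {i: 1.0 for i in items}
    for _ in range(iterations):
        new = {}
        for i in items:
            denom = 0.0
            for j in items:
                if i == j:
                    continue
                n = pair_counts.get(tuple(sorted((i, j))), 0)
                if n:
                    denom += n / (score[i] + score[j])
            new[i] = wins[i] / denom if denom else score[i]
        # Normalise so scores sum to the number of items.
        total = sum(new.values())
        score = {i: len(items) * s / total for i, s in new.items()}
    return score

scores = bradley_terry(judgements)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # "A" won every comparison, "D" lost every one
```

The point of the model is that each judge only makes a simple, relative decision (which of these two is better?), which is far more consistent across judges than assigning an absolute mark; the statistical fitting step then recovers a shared scale from many such decisions.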

Compelling evidence from a pilot study.

Voice21 chose RM Compare to trial the assessment of oracy, adopting the Oracy Skills Framework, and the evidence is compelling. With over 70% reliability across all cohorts of participants, the results clearly indicate that Comparative Judgement offers a viable route for oracy assessment. Read more about the work done by Voice21 here: Comparing Talk – The problem of assessing oracy: is comparative judgement the answer?


[1] Ulker, V. (2017). The Design and Use of Speaking Assessment Rubrics. [online] 8(32). Available at:

[2] Mercer, N. and Mannion, J. (2018). Oracy across the Welsh curriculum A research-based review: key principles and recommendations for teachers. [online] Available at:

[3] Ahmed, A. (2017). Should we assess oracy, and can comparative judgement help? [online] Oracy Cambridge. Available at:

[4] Consultation on the Removal of Speaking and Listening Assessment from GCSE English and GCSE English Language. (2013). [online] Available at: