Using Adaptive Comparative Judgement to assess Physical Education at scale across school groups

Building on the work previously completed using Art artefacts, this event took a look at Physical Education. The day was led by assessment consultant Victoria Merrick with assistance from domain expert Darrel Barsby.

Introduction

We have written many times about the challenges of assessing 'subjective' parts of the curriculum like Physical Education. As a former PE teacher myself, this is an area close to my heart. The 'washback' effect means that too often current assessment approaches negatively affect both curriculum and pedagogy.

We also know that being unable to assess appropriately can undermine the status of the subject within the curriculum: we 'treasure what we measure'.

This project built on our recent learnings in Art assessment, with the added complexity of using video artefacts.

Could it work? To test it out we were joined by a bunch of experienced PE teachers for another day of enquiry at the RM TTS HQ in Hucknall, Notts.

Assessing at scale

An important facet of this study was to consider the scaling challenge. The RM Compare Multi-Cohort Accelerator allows large ACJ sessions to be deployed across school groups.

This project used the principles of the multi-cohort design to build a session where artefacts from multiple schools were merged into a single session.

Method

The Judging pool of 16 PE Teachers was tasked with ranking the performance of 28 softball players. The judgement was a simple one, asking them to compare the merits of each performer's swing to 'hit for power'. They did this by comparing video Items in RM Compare.
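For readers curious how a set of pairwise decisions can become a rank order, here is a minimal sketch. It assumes a Bradley-Terry model, which is commonly used for comparative judgement; the scoring model RM Compare actually uses is not described in this post. The function name and the four-Item win counts are made up purely for illustration and are not data from the session.

```python
def bradley_terry(wins, n_items, iterations=200):
    """Estimate a strength per Item from pairwise wins.

    wins[(i, j)] is the number of times Item i was preferred over Item j.
    Assumes every Item has at least one win and one loss.
    """
    strength = [1.0] * n_items
    for _ in range(iterations):
        updated = []
        for i in range(n_items):
            others = [j for j in range(n_items) if j != i]
            total_wins = sum(wins.get((i, j), 0) for j in others)
            # Comparisons involving Item i, weighted by current strengths
            denom = sum(
                (wins.get((i, j), 0) + wins.get((j, i), 0))
                / (strength[i] + strength[j])
                for j in others
            )
            updated.append(total_wins / denom)
        # Rescale so the strengths average 1 (only ratios matter)
        scale = sum(updated)
        strength = [s * n_items / scale for s in updated]
    return strength


# Illustrative (invented) results: each pair of 4 Items compared 4 times
toy_wins = {
    (0, 1): 3, (1, 0): 1,
    (0, 2): 4,
    (0, 3): 4,
    (1, 2): 3, (2, 1): 1,
    (1, 3): 4,
    (2, 3): 3, (3, 2): 1,
}
strengths = bradley_terry(toy_wins, n_items=4)
ranking = sorted(range(4), key=lambda i: strengths[i], reverse=True)
print(ranking)
```

The Item that wins most often against the others ends up at the top of the ranking. In an adaptive session the pairs shown to Judges are also chosen as judging proceeds, which this sketch does not attempt to model.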

Each Item was seen 16 times by the Judging pool.
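As a rough sanity check on the workload, the figures above can be turned into a judgement count. This sketch assumes each judgement shows exactly two Items and that the work was shared evenly across Judges; neither detail is stated for this session, and the variable names are our own.

```python
items = 28            # softball players (video Items)
views_per_item = 16   # times each Item was seen
judges = 16           # PE teachers in the Judging pool
items_per_judgement = 2  # assumption: every judgement compares a pair

# Each judgement accounts for two Item views
total_judgements = items * views_per_item // items_per_judgement

# Assumption: judgements are spread evenly across the Judging pool
judgements_per_judge = total_judgements / judges

# Judging time per Judge if every decision took a full 30 seconds
minutes_at_30s = judgements_per_judge * 30 / 60

print(total_judgements, judgements_per_judge, minutes_at_30s)
```

Under these assumptions the session comes to 224 judgements, or 14 per Judge, which would take about 7 minutes at 30 seconds per decision.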

Results

In the image below we can see that the group were able to successfully rank the Items with an extremely high level of reliability. Further, we can see how the task helped us to split out the Items into 4 broad ability groups.

As we can see in this image, the time to judgement on this task was remarkably quick, with most decisions being made in less than 30 seconds.

This meant that all judging was completed in under 10 minutes!

The consensus of the Judges was good; however, there was one 'misfit'. Interestingly, this Judge came from a different organisation than the others and had a different job role, with less time in the classroom. As we can see above, this did not negatively impact the session reliability (it is extremely rare that it does).

Summary

It seems then, based on this initial study, that Adaptive Comparative Judgement can indeed be used in the Physical Education domain using video artefacts. There is still more to learn but this is an exciting start and we look forward to building further on this work.