Understanding time to judgement and workload.

One frequently asked question is about the duration of each judgement, particularly in relation to managing Judge workload. Studies indicate that the average judgement time in comparative judgement systems is around 38 seconds. However, this figure is an average, and actual times may vary due to several influencing factors. It's not a one-size-fits-all answer – context matters!

Can your AI assistant help?

Your AI assistant can really help here. By starting with a few simple prompts, you can begin to focus on your own particular scenario. For example:

  • Prompt 1: How long on average does a judge decision take in RM Compare?
  • Prompt 2: I am creating an RM Compare Session using 2 minute videos of students talking to camera. How long will each judgement take?
  • Prompt 3: Is it quicker to make judgements using text items than video items?

Try it now in Perplexity.

What does your own session data tell you?

RM Compare session reports contain a wealth of data. These can be viewed directly in the system reporting area or downloaded as a .CSV file.

To help you understand decision time, we provide a number of data fields, including:

  • Average (Mean) decision time
  • Average (Median) decision time

When choosing your average, consider the impact of user behaviour (a judge nipping off to make a cup of coffee while still logged into the system) and the system settings (the auto log-out time, for example). Long idle gaps like these inflate the mean far more than the median.
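
As a rough illustration of why the choice of average matters, the sketch below compares the mean and median for one judge. The decision times are invented for the example and are not real RM Compare data; the single 900-second value stands in for a coffee break taken while still logged in.

```python
from statistics import mean, median

# Hypothetical decision times in seconds for one judge (not real session data).
# The 900-second entry represents a judge who stepped away while logged in.
decision_times = [31, 35, 36, 38, 40, 42, 44, 900]

mean_time = mean(decision_times)      # pulled upwards by the single long gap
median_time = median(decision_times)  # barely affected by the long gap

print(f"Mean decision time:   {mean_time:.1f} seconds")
print(f"Median decision time: {median_time:.1f} seconds")
```

In this made-up case one long pause moves the mean well above two minutes, while the median stays close to the typical judgement time. When idle time is likely, the median is usually the safer guide.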

How have you set up your session?

As we've highlighted before, there are several ways to set up an RM Compare session. Some of the decisions you make during set-up may influence time to judgement, for example:

  • Items: Volume? Type? Content? Quality? Contribution Method?
  • Judges: Who? Experience? Expertise? Training? Motivation? Payment?
  • Workload: Number of judgements? Decision (holistic) statement clarity?
  • Reliability Target: High? Medium? Low?
  • Session length: Minutes? Hours? Days?

What have the RM Compare Team learnt?

Based on our extensive experience, we've gathered some insights into the dynamics of decision-making time during judgement sessions. However, these observations may not necessarily apply to all sessions. As judges become more acquainted with the system, the items, and their tasks, the time they take to make a judgement tends to decrease. This acceleration in decision-making is observed as the session progresses.

Interestingly, there doesn't seem to be a direct correlation between the speed of judgement and the likelihood of misfit. In other words, judges who make quick decisions are not necessarily more prone to misfit. This also holds true for items that are judged more rapidly.

Training has been found to be an effective way to decrease the time to judgement. This is likely because training enhances familiarity with the system and the task, thereby enabling quicker decisions. The type of item being judged can also affect the time to judgement. For instance, if a judge feels the need to review an entire item before making a decision, this could extend the decision time.

In most scenarios, including video and portfolio assessments, the time to judgement typically settles to less than a minute. This suggests that, regardless of the item type, judges are generally able to make decisions efficiently.

Next steps - have you completed a pilot?

Before embarking on a full session, it might be best to complete a small pilot. In the first instance, you could simply add yourself as a judge to a new session to get first-hand experience of judging. You may then want to invite a small number of Judges to complete a small session so that you can monitor time to judgement. What you learn here will give you the confidence to extend to the full session with a better understanding of the likely time to judgement (and the workload you are expecting from each Judge).
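
Once a pilot has given you a typical time to judgement, a back-of-the-envelope workload estimate is simple arithmetic. The sketch below is a minimal illustration, not an RM Compare feature. The function name and the figure of 50 judgements per Judge are assumptions made for the example, and the 38 seconds is the average quoted earlier, standing in for your own pilot result.

```python
def estimate_workload_minutes(judgements_per_judge: int,
                              seconds_per_judgement: float) -> float:
    """Return the estimated judging time for one Judge, in minutes."""
    return judgements_per_judge * seconds_per_judgement / 60


# Illustrative inputs only: replace these with figures from your own pilot.
pilot_seconds_per_judgement = 38  # stand-in for the pilot's typical decision time
planned_judgements = 50           # hypothetical allocation per Judge

minutes = estimate_workload_minutes(planned_judgements, pilot_seconds_per_judgement)
print(f"Estimated workload per Judge: about {minutes:.0f} minutes")
```

Because judges tend to speed up as a session progresses, a figure taken from a short pilot may overestimate the workload in a longer session.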

The RM Compare Free Trial provides enough functionality to complete a small pilot if required (see below).

Summary - Factors Influencing Decision Time

A well-designed RM Compare session will be highly effective and efficient. Competency in RM Compare session design improves with practice, making it easier to meet your workload objectives.

  1. Individual Approach of Judges: Judges bring their own experiences, expertise, and biases to the CJ process. Their individual approach to making judgements can affect the time they take to decide between two pieces of work (Source).
  2. Complexity and Type of Work: The nature of the work being judged can impact decision time. For example, judging a Year 3 story might take an average of 24 minutes and 30 seconds, while a Year 4 non-fiction piece might take around 22 minutes and 52 seconds (Source). The complexity, length, and clarity of the work can all contribute to longer or shorter decision times.
  3. Judge Familiarity with the System: As judges become more accustomed to the CJ system and the criteria for judging, they may make quicker decisions (Source).
  4. Number of Judges and Judgements: The workload, which includes the number of judgements each judge is expected to make, can influence the time they spend on each decision. Judges may take longer if they are overwhelmed by a large number of judgements or if they are new to the process (Source).
  5. Training and Experience: Judges with more training or experience in CJ or the subject matter may make decisions more quickly than those who are less experienced (Source).
  6. Quality of Submissions: The relative quality of the submissions being judged can also affect decision time. If the quality difference between two pieces is clear, judges may decide more quickly than when the quality is similar (Source).
  7. Session Length: The length of the CJ session itself can influence decision time. Judges may take longer at the beginning of a session and speed up as they progress and become more familiar with the task (Source).
  8. Reliability Target: The desired level of reliability for the CJ session can affect how much time judges are expected to take. A higher reliability target may require more careful and thus potentially slower judgements.


The RM Compare team are continuing to make enhancements to the system to optimise productivity and efficiency. By doing this we believe we can realise our ambition of providing a system that can be used at scale as an alternative to marking.