When “AI Wrappers” Harvest Your Judgment Data

In the last two posts, we argued that sovereignty is the only sustainable path for professional judgment, and that the real work now is building the infrastructure that lets organisations own and operate their own intelligence.

This follow‑up tackles a more uncomfortable question: what happens when assessment and judgment tools wrap themselves around someone else’s AI and start quietly harvesting your data?

Not in theory. In your schools, ministries, universities and organisations, right now.

The rise of the AI “wrapper”

Over the last 18 months, a new pattern has emerged across assessment, HR, awarding and professional services:

  • A vendor takes a large foundation model (typically hosted in a US cloud).
  • They add a friendly interface for uploading scripts, evidence, or performance data.
  • They market “AI‑enhanced judging”, “instant feedback”, or “reduced marking load”.

From the outside, this looks like a modern SaaS product. Under the hood, many of these tools are essentially wrappers: routing your most sensitive judgment data into an external AI stack, adding some UX, then selling it back to you as a service.

For a busy school or ministry, this can feel magical. It is also a structural shift in who owns and compounds value from your professional judgments.

Harvesting vs sovereignty: who benefits from your data?

A harvesting model and a sovereignty‑first model can look similar in a demo. The difference is in the direction of value.

In a harvesting model:

  • Your students’ or candidates’ work is used not just to serve your project, but to fuel the vendor’s product roadmap, benchmarks, research outputs, and marketing claims.
  • “Anonymised” extracts of work are reused for national projects, AI development, or commercial research, even if the underlying individuals still own the copyright in their work.
  • Insight accumulates in the vendor’s ecosystem: they build the benchmarks, they publish the papers, they win the next contract with the results.

In a sovereign model:

  • Your data is processed for your purposes, under your governance, and stays usable inside your cloud, your graph, and your AI agents.
  • Vendors do not treat your judgment streams as free training data or marketing material; they provide infrastructure, not extraction.
  • Over time, your organisation builds the compound advantage: longitudinal models of performance, internal quality standards, and AI agents tuned to your own context.

Both may claim to be “data processors”. Only one is truly aligned with your long‑term interests.

The myth of “safe anonymisation” in an AI era

Many wrappers reassure customers that scripts and recordings are “anonymised” before they are sent to external models. Names removed. IDs stripped out. Job done.

Except it isn’t.

Regulators and privacy experts are increasingly clear on this point:

  • True anonymisation is extremely hard. With modern AI, free text combined with contextual clues (school, subject, rare events, idioms, writing style) can be enough to single out individuals or infer sensitive traits, even without obvious identifiers.
  • The European Data Protection Board has emphasised that if re‑identification is reasonably likely using available technologies, the data is not anonymous; it remains personal data with all the duties that entails.

In other words: stripping names from student essays or professional portfolios does not magically take them outside data‑protection or ethical scrutiny. Treating this as “safe enough to reuse and monetise” is a choice, not a law of nature.
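
To make this concrete, here is a deliberately simplified Python sketch, using invented records rather than any real dataset, of why removing names is not the same as anonymising: combine a few contextual attributes and the pool of people who could match often shrinks to one.

```python
# Toy illustration with invented records: names are gone, but a handful of
# contextual attributes can still single out one individual.
from collections import Counter

records = [
    {"school": "Northfield High", "subject": "Physics", "year": 13, "left_mid_year": False},
    {"school": "Northfield High", "subject": "Physics", "year": 13, "left_mid_year": False},
    {"school": "Northfield High", "subject": "Physics", "year": 13, "left_mid_year": True},
    {"school": "Northfield High", "subject": "History", "year": 12, "left_mid_year": False},
]

def equivalence_class_sizes(rows, quasi_identifiers):
    """Count how many records share each combination of quasi-identifiers.
    A class of size 1 means that combination points at exactly one person."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in rows]
    return Counter(keys)

sizes = equivalence_class_sizes(records, ["school", "subject", "year", "left_mid_year"])
for combo, count in sizes.items():
    note = "  <-- unique: effectively re-identifiable" if count == 1 else ""
    print(combo, count, note)
```

A combination that matches exactly one record is, for practical purposes, an identifier, whatever the contract chooses to call the data.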

Copyright, pupils’ work, and silent value transfer

There is another dimension that rarely appears in marketing copy: copyright.

  • In education, pupils usually own the copyright in their work; teachers’ work usually belongs to their employer.
  • When a vendor reserves the right to reuse “anonymised extracts” of work for research, benchmarking, AI calibration or marketing, they are assuming a licence over someone else’s creativity.
  • Contemporary legal analysis of AI training is converging on the view that ingesting copyrighted works into training or benchmarking datasets engages the right of reproduction, and may therefore require explicit licences or a robust legal defence.

From a sovereignty perspective, this creates a quiet transfer of value: learners and professionals create the raw material; institutions pay for the service; the vendor builds a proprietary AI asset and reputation on top of that work.

That may be lawful. It is not automatically legitimate, especially when those whose work is used have no real visibility or agency in the process.

How to spot an AI harvesting model in the wild

You do not need to be a lawyer or an AI researcher to recognise a harvesting pattern. You can start with five simple questions for any “AI‑enabled” assessment or judgment tool:

  1. Where is the AI actually running?
    If the answer amounts to “in the vendor’s cloud, on their chosen US provider”, you are looking at a wrapper, not sovereign AI.
  2. What exactly leaves our environment?
    Ask whether full scripts, recordings or portfolios go to external models, and how long they are retained, even when “anonymised”.
  3. Do you reuse our data for your own product or research?
    Press for clarity on whether learners’ or candidates’ work is used for benchmarking, publications, or future features, even when identifiers are removed.
  4. Can we run this on our stack instead?
    Truly sovereign tools either run models at the edge (on‑device, in‑browser) or let you plug your own models into your own tenancy (see the sketch after this list).
  5. Can we leave, taking everything with us?
    A sovereignty‑first design makes it easy to export all raw and derived data into your own graph or AI environment. If the real value only exists in the vendor’s portal, you are being locked in.

If a product struggles with these questions, it is probably not designed with your sovereignty in mind.
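
As a rough illustration of what questions 4 and 5 mean in practice, the Python sketch below uses entirely hypothetical field names (no real vendor API) to show the two properties a buyer should look for: inference that runs in your own tenancy, and an export path that includes the derived value, not just the raw uploads.

```python
# A minimal sketch with hypothetical names: the model endpoint and the export
# target both belong to the customer, not the vendor.
from dataclasses import dataclass

@dataclass
class JudgmentAIConfig:
    model_endpoint: str                   # e.g. an endpoint inside your own cloud tenancy
    model_runs_in_customer_tenancy: bool  # question 4: whose stack does inference run on?
    export_target: str                    # e.g. storage your organisation owns
    export_includes_derived_data: bool    # question 5: do rankings and benchmarks come too?

def sovereignty_red_flags(cfg: JudgmentAIConfig) -> list[str]:
    """Return the red flags a buyer should raise before signing."""
    flags = []
    if not cfg.model_runs_in_customer_tenancy:
        flags.append("AI inference happens in the vendor's stack: the wrapper pattern.")
    if not cfg.export_includes_derived_data:
        flags.append("Derived value (rankings, benchmarks) stays locked in the vendor's portal.")
    return flags

# A typical wrapper fails both checks.
print(sovereignty_red_flags(JudgmentAIConfig(
    model_endpoint="https://vendor-hosted.example.com/v1",
    model_runs_in_customer_tenancy=False,
    export_target="",
    export_includes_derived_data=False,
)))
```

None of this requires deep technical knowledge on the buyer’s side; it simply turns the two questions into checks a procurement team can put in writing.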

Building an ethical standard for AI‑mediated judgment

None of this is an argument against AI in assessment or professional judgment. Quite the opposite.

In an agentic world, where autonomous systems will increasingly act on our behalf, high‑quality judgment data is one of the most powerful assets any organisation has. The question is: who gets to own that asset, and on what terms?

An ethical standard for AI‑mediated judgment should include at least:

  • Sovereign infrastructure: Your data, your cloud, your agents; vendors provide engines, not destinations.
  • No hidden training: Candidate and learner work is not used to train proprietary models or benchmarks without explicit, informed agreements.
  • No dark patterns: Interfaces should give equal weight to “AI‑on” and “AI‑off” paths, with clear explanations of consequences on the same screen.
  • Transparent reuse: Any reuse of work for research, AI calibration, or marketing should be visible, negotiable, and, where appropriate, share benefits with the communities who created it.

This is the direction we are building toward with RM Compare: a comparative judgment engine that converts human expertise into high‑fidelity, machine‑ready data streams that live in the customer’s own ecosystem, ready for their own sovereign AI.
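
To give a feel for what “machine‑ready” means here, the sketch below shows one plausible shape for a single comparative judgment record. It is an illustrative schema, not RM Compare’s actual data model, but it captures the idea that each decision is a small, portable fact your own systems can store, query and learn from.

```python
# Hypothetical illustration (not RM Compare's actual schema) of a single
# comparative judgment record landing in the customer's own data estate.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class ComparativeJudgment:
    session_id: str         # the judging session this decision belongs to
    judge_id: str           # pseudonymous identifier managed by the customer
    item_a: str             # the two pieces of work that were compared
    item_b: str
    winner: str             # which item the judge preferred
    decision_time_s: float  # how long the decision took, useful for quality checks
    recorded_at: str        # ISO timestamp, stored alongside the customer's data

judgment = ComparativeJudgment(
    session_id="essay-moderation-2025",
    judge_id="judge-017",
    item_a="script-0042",
    item_b="script-0913",
    winner="script-0042",
    decision_time_s=38.5,
    recorded_at=datetime.now(timezone.utc).isoformat(),
)

# Exported as plain JSON, records like this can feed your own graph, analytics
# or model tuning without ever leaving your environment.
print(json.dumps(asdict(judgment), indent=2))
```

Accumulated over thousands of sessions, records of this kind are exactly the compounding asset the closing section describes.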

You drive the agents

The next few years will not just be about “adding AI” to existing platforms. They will be about deciding where intelligence sits.

If the tools you adopt quietly harvest your judgment data into someone else’s models, you will be renting your own expertise back from them.

If, instead, you insist on sovereignty, on infrastructure that assumes your agents, your cloud and your graph, then every script, every decision, every comparative judgment becomes part of a compounding asset you control.

That is the world we are building for: one where you do not just drive the car, you also own the engine.