AI proctoring systems often produce false positives by misinterpreting human behavior, leading to unjust accusations of cheating. These systems should not make definitive judgments without human review; instead, assessments should focus on maintaining valid testing conditions and accommodating learners' needs.
AI-driven assessment systems don’t catch cheaters. They catch behavior. And too often, that’s not the same thing.
When we talk about AI in assessment and training, we tend to focus on what it can detect: eye movement, background noise, browser activity, face position. But we rarely talk about what it can’t: intent, context, or reality.
Over the past few years, I’ve seen dozens of cases where online proctoring systems raised red flags that had nothing to do with cheating — and everything to do with misunderstanding human behavior or environmental nuance. The result? False positives. And the consequences? Often career-threatening.
Let me give you a few real-world examples:
Each of these incidents was framed not as a maybe but as a definite breach, passed along to organizations or employers as if the AI had caught someone red-handed. But it hadn't. It had seen behavior, made an assumption, and left human interpretation out of the loop.
Under the EU AI Act, AI systems used in high-risk domains, including education and vocational training, are subject to strict obligations, among them meaningful human oversight. In practice, that means AI shouldn't be labeling a test session as "cheating" or "compromised" without a human in the loop.
Why? Because behavior is not evidence. And inference is not fact.
That’s the foundational problem: most AI proctoring doesn’t detect wrongdoing. It detects patterns and anomalies, then hands the judgment of guilt to the people reviewing the report, often with an implied conclusion already attached.
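To make that distinction concrete, here is a minimal sketch, assuming a hypothetical proctoring backend; the names AnomalySignal, ReviewRequest, and summarize_session are mine, not any vendor's API. The point is what the automated layer is allowed to output: observations bundled for a human reviewer, never a verdict.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnomalySignal:
    """What the model actually observed: behavior, not wrongdoing."""
    kind: str          # e.g. "gaze_offscreen", "second_voice_detected"
    confidence: float  # confidence in the observation, not in guilt
    timestamp_s: float


@dataclass
class ReviewRequest:
    """The strongest output the automated layer may produce."""
    session_id: str
    signals: List[AnomalySignal] = field(default_factory=list)
    note: str = "Automated observations only; no determination of misconduct."


def summarize_session(session_id: str,
                      signals: List[AnomalySignal]) -> Optional[ReviewRequest]:
    """Bundle anomalies for a human reviewer.

    There is deliberately no code path that returns a "cheating" or
    "compromised" label: the only possible outcomes are "nothing to review"
    or "a person should look at this".
    """
    if not signals:
        return None
    return ReviewRequest(session_id=session_id, signals=signals)
```

The design choice is the absence of a verdict type: a model score can decide whether a person looks at the evidence, but only that person decides what it means.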
If your goal is to prove someone cheated, you're playing a probability game you’ll rarely win cleanly. Instead, the better approach is to define the conditions under which the assessment is valid, and enforce those.
In other words: don’t try to catch cheaters after the fact. Instead, pause or remove individuals from assessments when the environment clearly fails to meet baseline conditions for trust.
In-person testing centers have understood this for decades. Their job isn’t to prove someone cheated; it’s to ensure a quiet, distraction-free space, monitor basic behavior, and interrupt the assessment if anything suspicious occurs. The standard isn’t proof of guilt; it’s a departure from acceptable conduct.
We should treat online testing the same way.
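Treating it the same way means encoding conditions rather than conclusions. Here is a rough sketch of that policy check, assuming illustrative baseline conditions (camera_active, single_person_in_frame, approved_apps_only) that stand in for whatever an organization actually defines.

```python
from dataclasses import dataclass


@dataclass
class EnvironmentCheck:
    """Baseline conditions the learner agreed to before starting."""
    camera_active: bool
    single_person_in_frame: bool
    approved_apps_only: bool


def enforce_conditions(check: EnvironmentCheck) -> str:
    """Decide what the session should do next.

    Mirrors an in-person proctor: if a baseline condition lapses, pause
    and say which one, so the learner can fix it or request an
    accommodation. There is no "mark_as_cheating" branch.
    """
    failed = [name for name, ok in vars(check).items() if not ok]
    if not failed:
        return "continue"
    return "pause_and_notify: " + ", ".join(failed)


# Example: a second person wanders into frame -> pause, don't accuse.
print(enforce_conditions(EnvironmentCheck(True, False, True)))
# pause_and_notify: single_person_in_frame
```

Whether a learner broke a rule is a question for a human; whether the room still meets the agreed conditions is something software can reasonably check and respond to by pausing.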
Imagine an in-person exam. A student keeps glancing at their palm. The instructor walks over and sees them rub their hand, smearing what looks like ink. When the instructor looks closer, the evidence is gone.
Did the student cheat? Maybe. Maybe not. But the behavior created reasonable suspicion, and that’s enough to intervene — not accuse.
In online assessments, that intervention is usually missing. The system flags. The learner is told they’ve “violated exam rules.” But no one asks the most important question: Did the learner actually do anything wrong — or just fail to meet unrealistic environmental conditions?
The lesson here is twofold: define up front the conditions under which an assessment is valid, and intervene when those conditions break rather than accuse after the fact.
This means placing the responsibility where it belongs: on the learner to create a distraction-free, private space, and on the organization to define and communicate those expectations clearly. If those expectations can’t be met due to disability, housing, or other constraints, the response should be accommodation, not penalization.
The goal of assessment isn’t to trick learners or catch them out. It’s to create conditions where demonstrating competence is possible and trustworthy.
If we design systems that default to suspicion, we don't protect integrity — we erode it.
And if we let AI make the call without context, we’re not enforcing standards — we’re outsourcing judgment.
False positives aren’t just a technical glitch. They’re a human cost. And unless we rethink how we use AI in assessments, that cost will keep falling on the very people the system was supposed to serve.
Cognisense is a team of specialized experts dedicated to helping organizations navigate regulatory, legal, and industry standards. We focus on identifying the right technology, applications, and processes to ensure compliance while maintaining effective risk mitigation.
Robert Day, our Managing Director, brings decades of experience in high-risk industries. With deep regulatory knowledge and investigative expertise, he is passionate about protecting lives and ensuring organizations adopt rigorous, technology-driven compliance strategies.