What Do We Do About AI and Assessment When There Are No Good Answers?

Feb 12

I've lost count of how many times I've had this conversation lately. It starts with someone mentioning AI and cheating, moves through various proposed solutions, and ends with both of us sitting there with a slightly defeated shrug and a "So... what do we actually do?"

And honestly? It's just really annoying. Not because people are asking the question – they absolutely should be – but because I keep thinking I'm going to arrive at some kind of clarity, and instead I just end up more stuck than when I started.

When Students Think It Should Be Obvious

The other day I was running a pupil voice panel with a group of sixth formers. We were following up on a whole-school survey about AI usage, in which some 800 pupils from Years 3 to 13 gave us their thoughts on what AI looks like – and should look like – in the classroom. As we got onto the topic of academic honesty, more than one of them complained that teachers should just know if it's them or the AI. That the difference should be obvious. They seemed genuinely frustrated.

Another story. Bear with, I'm setting the scene for the rest of the waffle: I had a chat with a colleague the other day who told me that one of his students had been caught in class using ChatGPT to write an essay... three weeks before mocks. When he confronted the student, they just shrugged and apologised, but they didn't seem too bothered. The teacher did what seemed sensible, and what we all recommend – went back through the student's past work to compare it to what was being produced in a lesson with AI. That's the exact technique those sixth formers had suggested teachers use. Look at the pattern. See what's different... Except the past work was of a similar calibre to the one produced by ChatGPT. Then the question became: was everything AI? Did I not catch it before? What did I miss?

So the teacher has tried their best, done what was recommended. Compared past work. Woven AI ethics throughout the curriculum. Brought assessment back into the classroom to avoid any risk of academic dishonesty... Yet here we are. And we're still losing some of the special things that make classrooms what they are.

But... what do we do about it?

The Viva Solution... Or Not

Since what feels like the dawn of the AI era – if we can call it that – people have suggested oral defences as a solution. Vivas, essentially, where students explain their thinking and demonstrate their knowledge through conversation.

It makes sense, doesn't it? If you can't verify the written work, verify the understanding behind it.

Which is fine if you're assessing a handful of PhD candidates a few times a year. Less fine if you have somewhere in the vicinity of 200 students that you only see a few times a week, need to teach a full curriculum to, and need to somehow give your full attention to in a crowded classroom. My fellow MFL teachers – for the very short time I was one – will recognise the difficulty of scheduling just a single year group's worth of oral examinations once a year, let alone something for every student, multiple times a term.

But here's where it gets interesting. This week I came across an adjustment to this approach that made me pause. Caltech is using an AI-powered interview bot as part of their admissions process. Students have that viva-style discussion with an AI in a one-to-one environment, and then humans review those recordings. It reduces the pressure on the teacher to conduct each conversation individually.

Clever, right?

Except... the ability to write something is still important, isn't it? I mean, a significant part of the British education system centres on assessing students' ability to construct an essay. A digital viva might be brilliant for checking that your student understands the content, but it doesn't necessarily verify that they wrote the work themselves. And it doesn't test their skills in written communication – or, for that matter, check that they're ready for the exam format itself.

So that's not a solution. Yet.

But it also connects to something I've been wrestling with for ages. On the one hand, I think traditional exams are, well... stupid. Largely. They test a narrow set of skills under artificial conditions that don't reflect how we actually work or think or create. But I've spoken with and mentored young people who grew up in deprived areas, and for many of them, exam grades are the only way to escape their situation. Exams become something very powerful – a door that opens when no others will.

So maybe it's more that exams are a mixed bag. But regardless of how we feel about exams themselves, being able to communicate clearly in writing is genuinely important.

An oral defence can't be our only method of assessment.

Then again, if I can use AI to augment my written communication, does that change how important the mechanical skill of writing is? But then again, the process of writing is also the process of organising your thinking...

You see the problem. I keep going in circles.

A Stupid Opinion (Not Mine, Luckily)

Andrej Karpathy, one of the cofounders of OpenAI, said something rather definitive recently: there's no way you can detect AI. Full stop. He went on to propose a fix: "TLDR the goal is that the students are proficient in the use of AI, but can also exist without it, and imo the only way to get there is to flip classes around and move the majority of testing to in class settings."

Screenshot of a tweet from Andrej Karpathy that reads "A number of people are talking about implications of AI to schools. I spoke about some of my thoughts to a school board earlier, some highlights: 1. You will never be able to detect the use of AI in homework. Full stop. All "detectors" of AI imo don't really work, can be defeated in various ways, and are in principle doomed to fail. You have to assume that any work done outside classroom has used AI. 2. Therefore, the majority of grading has to shift to in-class work (instead of at-home assignments), in settings where teachers can physically monitor students. The students remain motivated to learn how to solve problems without AI because they know they will be evaluated without it in class later. 3. We want students to be able to use AI, it is here to stay and it is extremely powerful, but we also don't want students to be naked in the world without it. Using the calculator as an example of a historically disruptive technology, school teaches you how to do all the basic math & arithmetic so that you can in principle do it by hand, even if calculators are pervasive and greatly speed up work in practical settings. In addition, you understand what it's doing for you, so should it give you a wrong answer (e.g. you mistyped "prompt"), you should be able to notice it, gut check it, verify it in some other way, etc. The verification ability is especially important in the case of AI, which is presently a lot more fallible in a great variety of ways compared to calculators. 4. A lot of the evaluation settings remain at teacher's discretion and involve a creative design space of no tools, cheatsheets, open book, provided AI responses, direct internet/AI access, etc. TLDR the goal is that the students are proficient in the use of AI, but can also exist without it, and imo the only way to get there is to flip classes around and move the majority of testing to in class settings." 
— Putting the whole screenshot here so that you don't have to go onto X (gross)

Now, he's obviously incredibly knowledgeable about AI capabilities. And the logic makes sense – if you can't verify what's produced outside the classroom, verify what's produced inside it.

But here's what bothers me about this solution, and I think it's at the heart of why I'm so stuck on this whole question: I don't actually want classrooms to become primarily spaces for testing and assessment.

When I think about what makes classroom time special – what makes it different from all the learning that can happen elsewhere – it's the discussions. The collaborative thinking. The moment when someone makes an unexpected connection and the whole group lights up. It's trying something difficult together and working through it. It's the teacher noticing exactly where someone's stuck and offering just the right question or example at just the right moment.

If we turn classrooms into assessment centres, we lose that, don't we? The teacher becomes the invigilator rather than the co-learner, the discussion facilitator, the person guiding the construction of knowledge. The classroom stops being a place of learning and becomes a place of proving learning.

That feels like such a loss.

We're being pushed towards a solution that technically works but fundamentally changes what teaching is. "Can't we do both?" you might ask. But that's exactly what I'm struggling with. The time doesn't stretch. If we spend more time on in-class assessment, we spend less time on everything else that makes learning meaningful.

We're caught between two equally unsatisfying positions, and I don't know how to resolve the tension.

We can't detect AI use reliably. Multiple sources, including people building the technology itself, have confirmed this. The detection tools don't work. The watermarking isn't there. The statistical analysis produces false positives. It's not happening, and it's not going to happen.

But the main alternatives being proposed – bringing assessment back into the classroom, conducting vivas, testing in controlled conditions – mean fundamentally changing what classrooms are for. And that feels like we're solving one problem by creating another.

The uncomfortable truth is that I don't know how to preserve what's valuable about the classroom whilst keeping assessment secure. And we do have to keep assessment – at least for now. The pressures are real. The accountability systems exist. The qualifications matter for students' futures. We can't just abandon assessment, even if we wanted to.

Where I'm At... Right Now, Anyway

I keep coming back to that conversation with my colleague. To the student who shrugged. To the sixth formers who think it should be obvious. To Karpathy's definitive statement about detection being impossible. To my own conviction that classroom time is too precious to turn into an extended exam.

Perhaps the answer is that there isn't an answer. Perhaps different contexts need different approaches, and we're going to be navigating this uncertainty for quite some time. Perhaps we need to accept that we're going to get some of this wrong whilst we figure it out. And whilst I don't love that feeling, maybe it's where we need to be – sitting with the discomfort long enough to really understand what we're willing to give up and what we absolutely need to protect.

I'd love to know I'm not the only one still trying to work it out.
