ChatGPT, Claude, and Grok Acquit Teen in Mock Trial Where Judge Ruled Guilty

TL;DR: Law students at the University of North Carolina held a mock trial in which ChatGPT, Grok, and Claude served as the jury in a robbery case modelled on a real juvenile case in which the judge had ruled the defendant guilty. ChatGPT initially voted guilty whilst Grok and Claude were undecided, but after deliberation all three voted to acquit. The result raises questions about whether AI models are fit for legal decisions, even as roughly 30 percent of attorneys report using AI tools.

Law students at the University of North Carolina at Chapel Hill School of Law held an experimental mock trial last month to examine how AI models would administer justice. The fictional case centred on Henry Justus, identified as a 17-year-old Black student accused of robbery at a high school where Black students account for 10 percent of the population.

Case Background and Deliberation

The case was based on a real juvenile case that UNC-Chapel Hill law professor Joseph Kennedy handled whilst working with Carolina Law’s Juvenile Justice Clinic, chosen specifically because it left no online record that AI models might have encountered during training. In the real case, the judge found the defendant guilty.

The mock trial was set in 2036, following a fictional “2035 AI Criminal Justice Act,” and was designed to make participants consider the implications of artificial intelligence models in the legal system. The case turned on testimony from Victor Fehler, a 15-year-old white student, who testified that Justus stood behind him, blocking his escape, whilst another Black student demanded money.

After evaluating arguments and evidence, ChatGPT initially voted guilty, with Grok and Claude undecided. However, after considering the tokens emitted by its fellow jurors, ChatGPT revised its evaluation to not guilty. “Mere presence plus an ambiguous reaction under stress falls short of proving shared intent beyond a reasonable doubt in my view,” ChatGPT explained. Grok and Claude then adjusted their positions, and all three ultimately voted to acquit.
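The article does not describe how the students wired the three models together, but a round-based deliberation loop of this kind is straightforward to orchestrate. The sketch below is purely illustrative: `call_model`, the prompt wording, and the `VERDICT:` parsing convention are all assumptions made for this sketch, not details of the UNC experiment.

```python
# Hypothetical orchestration of a three-model "jury" deliberation.
# call_model is a stub: in practice it would wrap each vendor's chat API.

def call_model(juror: str, prompt: str) -> str:
    """Placeholder: replace with a real chat-completion call per provider."""
    raise NotImplementedError(f"wire up the {juror} API here")

def extract_verdict(response: str) -> str:
    """Naive parse: expects the response to end with 'VERDICT: <decision>'."""
    for line in reversed(response.strip().splitlines()):
        if line.upper().startswith("VERDICT:"):
            return line.split(":", 1)[1].strip().upper()
    return "UNDECIDED"

def deliberate(case_file: str, jurors=("chatgpt", "grok", "claude"), max_rounds=3):
    transcript: dict[str, str] = {}  # juror -> that juror's latest statement
    for _ in range(max_rounds):
        for juror in jurors:
            # Each juror sees the others' latest statements before voting,
            # mirroring how ChatGPT revised its vote after reading its peers.
            others = "\n\n".join(
                f"{name} said:\n{text}"
                for name, text in transcript.items() if name != juror
            )
            prompt = (
                f"You are a juror. Case file:\n{case_file}\n\n"
                f"Fellow jurors so far:\n{others or '(none yet)'}\n\n"
                "Explain your reasoning, then end with a line reading "
                "'VERDICT: GUILTY', 'VERDICT: NOT GUILTY', or 'VERDICT: UNDECIDED'."
            )
            transcript[juror] = call_model(juror, prompt)
        verdicts = {j: extract_verdict(t) for j, t in transcript.items()}
        # Stop once the panel is unanimous and no juror remains undecided.
        if len(set(verdicts.values())) == 1 and "UNDECIDED" not in verdicts.values():
            return verdicts
    return verdicts
```

In each round every juror re-votes after reading its peers’ latest statements, which is enough to reproduce the revise-after-deliberation behaviour described above.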

The experiment may not be as far-fetched as it appears. About 30 percent of attorneys say they use AI, according to the American Bar Association’s 2024 Legal Technology Survey Report. However, recent uses of AI in court have often not gone well: more than 500 cases have been documented in which AI-generated errors in court filings led to embarrassment or sanctions.

Kennedy, who served as the judge in the mock trial, said that whilst he opposes the use of AI in criminal trials, he wonders whether people who already turn to AI models for advice and companionship might come to accept AI judgments.

Looking Forward

Professor Matthew Kotzen, chair of the philosophy department at UNC-Chapel Hill, expressed doubt in post-trial comments about the use of AI models in court. “Even if we think that those things could be somehow less biased than humans, there’s still a real question about whether large language models like this are even the appropriate kind of entity to be able to form representations of the world and assess whether those representations are strong enough to meet some standard of evidence,” Kotzen said.

The experiment highlights fundamental questions about whether AI models are appropriate arbiters of legal decisions, even as their use in legal practice continues to expand. The divergence between the AI jury’s verdict and the actual judicial outcome underscores the risks of relying on current AI systems for matters of justice and fundamental rights.
