In brief
- Colombia’s Supreme Court rejected a cassation appeal after AI detectors flagged it as machine-generated.
- Lawyers ran the ruling through the same tools and found it also appeared AI-written.
- Experts and studies have shown that AI-detection software produces unreliable and inconsistent results.
The Supreme Court of Colombia denied a cassation appeal, arguing that it was generated by AI. But the same tool the court used to determine the appeal’s purported AI origins said that the court’s own ruling also received generative assistance.
Is it a double standard by the court, or faulty tools at play?
“Faced with a well-founded suspicion that the brief submitted by the attorney had not been drafted by the legal professional himself, the court submitted the text to the Winston AI tool,” the court argued. “Its analysis indicated that the document contained only 7% human content, evidencing a marked influence of automated writing and leading to the conclusion that it had been produced using artificial intelligence.”
After running the analysis with other tools that produced similar results, the court ruled that “since the filing cannot be regarded as a duly submitted pleading, its dismissal as inadmissible is required.”
But when the court’s ruling faced the same scrutiny from legal experts, it showed similar results.
“I submitted the text of Auto AP760/2026 from the Supreme Court to the same Winston AI software cited in the ruling,” attorney Emmanuel Alessio Velasquez wrote on X on Tuesday. “The result: The document contains 93% AI-generated text.”
“If the very ruling that condemns the use of artificial intelligence scores that percentage, the methodological fragility of using these detectors as argumentative support becomes self-evident,” he argued in a follow-up tweet.
Within hours of the court posting a thread about the decision on X, lawyers began running their own tests. Velasquez’s post went viral in legal circles, accumulating tens of thousands of views.
We ran the test on the court’s verdict as well, and things initially didn’t look great. When GPTZero scanned only the opening words of the court text, it returned a 100% AI result.

When the same tool processed a longer version that included the factual background section, it reversed course entirely: 100% human.
The tool is simply not reliable enough to be trusted in court, or in any situation requiring a high degree of certainty.

Colombian lawyers reacted quickly with their own experiments. Criminal defense lawyer and lecturer Andres F. Arango G submitted a court filing from 2019, years before the large language models these tools were trained to detect even existed, and it came back claiming 95% AI generation.
“These tools then invite you to ‘humanize’ the article through their paid services,” he wrote on X, noting an obvious commercial incentive baked into the detection business model.
Nicolas Buelvas ran his 2020 undergraduate thesis on the principle of trust in criminal law through a detector. The result? 100% AI.
Dario Cabrera Montealegre, another Colombian attorney, pointed out the contradiction of relying on the technology to try to combat it.
“The court is using AI to determine if there was AI,” he said. “Something contradictory from my practical standpoint.”
The Court uses AI to determine if there was AI…!? Something contradictory from my practical point of view… If it is rejected, it should be because we, as humans, detected it
— Darío Cabrera Montealegre (@dalcamont_daro) March 2, 2026
Beyond legal circles, tech-savvy observers pointed out the dangers of excessive reliance on AI-flagging tools.
“To date, there is no publicly available tool that can accurately determine the percentage of AI use in drafting a text,” Carlos Alejandro Torres Pinedo argued. “What’s worse: No one can publicly verify the source code behind these detection platforms. How can they be used to delegitimize someone’s right of access to justice?”
The technical reasons for these failures are well documented. AI detectors measure statistical patterns: sentence length, vocabulary predictability, and a quality researchers call “burstiness,” the natural rhythm variation humans introduce into their writing.
The problem is that formal legal prose, academic writing, and texts produced by people writing in a second language share many of those same statistical signatures.
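To make the idea concrete, here is a minimal, illustrative sketch of a "burstiness" score, measured as the variation in sentence lengths. This is not any vendor's actual algorithm; real detectors are proprietary and far more complex. But it shows why uniform, carefully structured prose (the hallmark of legal and academic writing) can score as "machine-like" under such a metric:

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths (in words).

    Low values mean uniform sentence lengths, a trait naive detectors
    associate with machine-generated text, even though formal legal
    or academic prose often shares it.
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

# Uniform, formal prose: every sentence is about the same length.
uniform = ("The court ruled today. The appeal was denied fully. "
           "The brief was rejected now.")
# Varied, "human-rhythm" prose: short and long sentences mixed.
varied = ("Denied. The court, after weighing every argument presented "
          "by counsel over three days, rejected the appeal. Why? Uniformity.")

print(burstiness(uniform) < burstiness(varied))  # True
```

A metric like this cannot tell *why* the prose is uniform, which is precisely the failure mode the studies below document: formal writers and second-language speakers produce uniform prose for entirely human reasons.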
Studies on AI detection
A 2023 study published in Patterns found that more than 61% of Test of English as a Foreign Language (TOEFL) essays by non-native English speakers were incorrectly flagged as AI-generated.
A systematic review by Weber-Wulff that same year concluded that no available tool is either precise or reliable. Turnitin acknowledged in June 2023 that its own detector produced higher false-positive rates when the AI content level in a document fell below 20%.
Even OpenAI took down its own AI detection tool after constant inaccuracies and an inability to do its actual job.
Universities have been grappling with this for years. Vanderbilt disabled Turnitin’s AI detector in 2023 after estimating it would generate around 3,000 false positives per year.
The University of Arizona dropped AI-detection features from its plagiarism software after a student lost 20% of a grade to a false positive. A 2024 case at UC Davis saw 17 linguistics students flagged, 15 of them non-native English speakers.
The pattern is consistent: The tools penalize the people who write most formally, most repetitively, or most carefully, exactly the profile that lawyers, academics, and second-language speakers fit.
The cultural fallout has bordered on the absurd. Across writing and journalism circles, people have started avoiding em dashes in their work, not because of any style guide, but because AI language models use them frequently and detection tools (and people) have taken notice.
Writers are self-editing natural punctuation out of fear of algorithmic suspicion. Beyond the written word, artists have suffered the wrath of moderators and colleagues for producing pieces that look AI-generated.
Colombia’s two rulings, AC739-2026, in which the Civil Chamber fined a lawyer in February for citing 10 nonexistent AI-generated precedents, and AP760-2026, are emerging as among the region’s first judicial decisions directly confronting the misuse of generative AI in legal filings.
Colombia’s judicial branch adopted formal guidelines in December 2024 regulating how judges and court staff can use artificial intelligence.
The rules allow AI to be used freely for administrative and support tasks, such as drafting emails, organizing agendas, translating documents, or summarizing texts, while permitting more sensitive uses, like legal research or drafting procedural documents, only under careful human review.
The guidelines explicitly prohibit relying on AI to evaluate evidence, interpret the law, or make judicial decisions, emphasizing that human judges remain fully responsible for all rulings and must disclose when AI tools were used in preparing judicial materials.
Those guidelines, compiled in agreement “PCSJA24-12243,” could be used to contest such a decision.
The Supreme Court has not yet issued any additional statement in response to the backlash over its choice of detection tools. The ruling didn’t contain any em dashes, either.
