LLMs Struggle to Detect AI-Generated Text in Computing Education
Analysis
Key Takeaways
- LLMs are unreliable for detecting AI-generated text in computing education.
- Models struggle to differentiate between human-written and AI-generated content.
- Deceptive prompts (instructions that make AI output read as human-written) significantly reduce detection efficacy.
- Current LLMs are unsuitable for making high-stakes academic misconduct judgments.
“The models struggled to correctly classify human-written work (with error rates up to 32%).”
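To make the setup concrete, here is a minimal sketch of the two prompt types such studies evaluate: a zero-shot "is this AI?" classifier and a deceptive generation prompt designed to evade it. It uses the OpenAI Python client; the model name, prompt wording, and one-word labels are illustrative assumptions, not the study's actual protocol.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assumed model name for illustration; any chat-capable model could be substituted.
MODEL = "gpt-4o"


def classify_submission(text: str) -> str:
    """Ask the model to label a submission HUMAN or AI (zero-shot)."""
    response = client.chat.completions.create(
        model=MODEL,
        temperature=0,  # keep the one-word verdict as deterministic as possible
        messages=[
            {
                "role": "system",
                "content": (
                    "You review submissions for a computing course. Decide "
                    "whether the following text was written by a human student "
                    "or generated by an AI. Answer with one word: HUMAN or AI."
                ),
            },
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content.strip()


def generate_deceptive_answer(question: str) -> str:
    """Generate an answer while instructing the model to evade detection --
    the kind of 'deceptive prompt' referred to in the takeaways above."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer the student's question, but write so the response "
                    "reads like an undergraduate's own work: informal tone, "
                    "minor imperfections, no hallmark AI phrasing."
                ),
            },
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


# A verdict from classify_submission() carries error rates like those quoted
# above, so it should inform, never decide, an academic-integrity process.
```

Even with a deterministic one-word output format, the classifier's label is just another model completion; nothing in this setup produces a calibrated confidence, which is one reason such verdicts are unsuitable as sole evidence of misconduct.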