5 Ways to Detect Text Written by AI
The best way to figure out if an artificial intelligence wrote something may be to ask AI. We test AI-detection services with text written by ChatGPT and text written by a human: Here are the results.
By Chandra Steele, PC Magazine
Can you spot ChatGPT-generated text? The AI is being used in emails, cover letters, marketing pitches, college essays, coding, and even some news stories. But sussing out what's written by a human and what's written by a computer program may be best left to the computers themselves.
Detection tools have proliferated in the wake of ChatGPT and alternative large language models (LLMs). Most are free, albeit with character limits (something that can be bypassed by pasting in chunks of text at a time). An AI detector can serve many purposes, from making sure the text you write doesn't come off as too generic and stilted to uncovering deception from job candidates.
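Pasting in "chunks of text at a time," as mentioned above, can be sketched in a few lines. This is a minimal illustration only; the `chunk_text` helper and the default limit are assumptions, not any particular detector's API:

```python
def chunk_text(text, limit=1500):
    """Split text into pieces no longer than `limit` characters,
    preferring to break at sentence boundaries so each chunk reads
    naturally when pasted into a detector."""
    chunks = []
    while len(text) > limit:
        # Look for the last sentence end before the limit.
        cut = text.rfind(". ", 0, limit)
        if cut == -1:
            cut = limit  # no sentence break found; hard-split
        else:
            cut += 1  # keep the period with this chunk
        chunks.append(text[:cut].strip())
        text = text[cut:]
    if text.strip():
        chunks.append(text.strip())
    return chunks
```

Each returned chunk can then be submitted to a detector separately, though per-chunk results may differ from a verdict on the full document.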
Educators are at the top of the list of those who could use a reliable way to tell whether something has been written by an AI. And they have indeed been among the early adopters of AI detector software. But just as ChatGPT and its kind can be unreliable, so are the AI detectors.
In the ChatGPT subreddit, a high school student recently sought advice after being falsely accused by their history teacher of using ChatGPT. The teacher would not disclose what tool was used and, according to the student, felt justified in making the claim because the detector had helped catch AI-written text from other students who admitted to using ChatGPT.
It’s a cautionary tale we wanted to tell before we get to this roundup of popular AI detectors and our experience with some of them. Since ChatGPT and the like are trained to imitate how humans speak, separating out what an AI has cribbed from common usage and what is actual text written by people is not an easy task—even for AI.
There is some talk in the AI community of AI generators including a watermark: signals within AI-written text that could be detected by software without affecting the text's readability. But this would require the participation of the companies that produce AI content, and it's unlikely any of them would do something that puts them at a disadvantage relative to their competitors.
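As an illustration of the watermark idea, here is a toy sketch of one proposed approach: the generator secretly biases word choice toward a pseudorandom "green" half of the vocabulary, and a detector checks whether green words are over-represented. Everything here is a simplification for illustration; real proposals operate on model tokens, not words, and no deployed generator is known to use this exact rule:

```python
import hashlib

def is_green(prev_word, word):
    """Toy 'green list' rule: hash the (previous word, word) pair and
    call roughly half of all possible pairs green."""
    h = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return h[0] % 2 == 0

def green_fraction(text):
    """Fraction of word transitions that land on the green list.
    Watermarked output would be biased well above 0.5; ordinary,
    unwatermarked text should hover near 0.5."""
    words = text.lower().split()
    if len(words) < 2:
        return 0.0
    hits = sum(is_green(a, b) for a, b in zip(words, words[1:]))
    return hits / (len(words) - 1)
```

A detector built this way needs the same secret hashing rule the generator used, which is why watermarking only works with the producing company's cooperation.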
That said, here are some of the most-used AI detectors. To try all of the free ones, I ran through text from my own story "Is Dall-E the Next Dior? How AI Is Trying to 'Make It Work' in Fashion," as well as text ChatGPT generated in response to the prompt: "Please write me an article on how AI is being used in the fashion industry, specifically Stable Diffusion, DALL-E 2, and Midjourney."
1. AI Text Classifier
AI Text Classifier comes straight from the source: ChatGPT developer OpenAI.
It seems a little awkward for ChatGPT to evaluate itself, but since it’s an AI, it probably doesn’t care. OpenAI is also upfront about this solution's limitations: it works best with a minimum of 1,000 characters; it can mislabel both AI-generated and human-written text; it doesn't work well on text written by children or text that isn't in English; and AI-generated text can evade the classifier with just a few tweaks.
AI Text Classifier, which is free to use, grades text on how likely it is to be AI-generated: very unlikely, unlikely, or unclear. OpenAI stresses that the detector is meant “to foster conversation about the distinction between human-written and AI-generated content" rather than provide a definitive answer.
I used AI Text Classifier to evaluate the entirety of the ChatGPT-written essay and it gave this result: "The classifier considers the text to be likely AI-generated." For my own text, the tool determined this: "The classifier considers the text to be very unlikely AI-generated."
2. GPTZero
GPTZero was crushing the dreams of college students just days after ChatGPT began making headlines. It was developed by one of their own, Princeton senior Edward Tian, who used the knowledge from his comp-sci major and journalism minor to analyze text for “perplexity” (how complex the ideas and language are) and “burstiness” (whether there’s a blend of long and short sentences rather than sentences of more uniform length).
Tian trained GPTZero on paired human-written and AI-generated text. While it can be used to test a single sentence (as long as it’s 250 characters or more), GPTZero's accuracy increases as it's fed more text.
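The "burstiness" idea can be made concrete with a toy measurement: score a text by how much its sentence lengths vary. GPTZero's actual formula is not public, so this is only an illustrative proxy, not its implementation:

```python
import re
import statistics

def burstiness(text):
    """Toy proxy for 'burstiness': the spread of sentence lengths
    (in words). Uniform sentence lengths score near zero; a mix of
    long and short sentences, typical of human writing, scores higher."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    # Population standard deviation of sentence lengths.
    return statistics.pstdev(lengths)
```

A detector in this spirit would flag text whose score falls below some threshold, though real tools combine many such signals with a trained model.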
GPTZero’s origin and speed to market made it popular among educators. But the program's FAQ cautions against using results to punish students: “While we build more robust models for GPTZero, we recommend that educators take these results as one of many pieces in a holistic assessment of student work. There always exist edge cases with both instances where AI is classified as human, and human is classified as AI.”
There's a separate product, GPTZero Educator, that educators can sign up to use for $9.99 per month. Its features are more suited to their needs: It highlights specific sentences that might be AI-written, handles larger amounts of text, and accepts multiple file formats. It will check up to 1 million words per month.
Meanwhile, anyone can try GPTZero for free at GPTZero.me. It lets registered users check up to 5,000 characters per document and three files per batch upload. A Pro plan is currently $19.99 per month; it uses a more advanced AI model and checks more words per month.
Of the AI-written text I fed it, GPTZero said: "Your text is likely to be written entirely by AI." My own received, "Your text is likely to be written entirely by a human."
GPTZero highlighted all of the ChatGPT-written text as AI-written and gave it an average perplexity score of 31.929 and a burstiness score of 21.385. My article received no highlights for possibly AI-written sentences and got an average perplexity score of 588.703 and a burstiness score of 2,545.787. We don't know exactly how GPTZero weighs burstiness and perplexity, but the far higher scores for the human-written piece point in the right direction.
3. Originality.AI
Originality.AI is a business-oriented product that states up front that it’s not made for academia. Its purpose is to help content publishers make sure that their source material, or work submitted to them, is not plagiarized or written by AI. The program's Chrome extension quickly checks emails, Google Docs, websites, and so on.
Unlike other AI detectors, which warn against taking action based on their results, Originality.AI claims a 96% detection rate overall: 99%+ on GPT-4 content, 94.5% on paraphrased content, and 83% on ChatGPT. It says it owes this accuracy to “A LOT of compute power,” and it does not have a free option. Originality.AI costs one cent per 100 words and uses automatic billing. Since it doesn't offer a free tool, we didn't use it to test our text.
4. Writer AI Content Detector
Writer makes an AI writing tool, so it was naturally inclined to create the Writer AI Content Detector. The tool is not robust, but it is direct. You paste a URL or up to 1,500 characters into the box on its site and get a prominent percentage detection score right next to it. The product is free, and those who have a Writer enterprise plan can contact the company to discuss detection at scale.
Given 1,500 characters of the ChatGPT-written piece, Writer AI Content Detector graded it "0% human-generated content" and recommended, "You should edit your text until there’s less detectable AI content." For 1,500 characters of my own piece, I got a "100% human-generated" score and a robot-issued "Fantastic!" compliment.
5. ZeroGPT
ZeroGPT is a straightforward, free tool for “students, teachers, educators, writers, employees, freelancers, copywriters, and everyone on earth,” and it claims an accuracy rate of 98%. It runs on a proprietary, undisclosed technology the company calls DeepAnalyse, which was trained on 10 million articles and other texts.
Users paste text into a box on the site and receive one of the following results: the text is human-written, AI/GPT-generated, mostly AI/GPT-generated, most likely AI/GPT-generated, likely AI/GPT-generated, contains mixed signals with some parts AI/GPT-generated, likely human-written but may include AI/GPT-generated parts, most likely human-written but may include AI/GPT-generated parts, and most likely human-written.
ZeroGPT knew what I was up to when I submitted the ChatGPT-written piece. "Your text is AI/GPT Generated," it said, before giving it a score of 94.03% and highlighting all but one sentence as AI-written. For my own writing, I was relieved to see the conclusion "Your text is human written" and a 0% AI-written score.
Humans Are Still the Best AI Detectors
While these AI detectors were indeed able to tell AI-written text from text written by a human, precautions against relying completely on their results still apply. I'm a professional writer; those who are not might not have the same results with their own work. I don't mean to brag—it's just some hope for me to cling to in these times of AI journalists taking jobs from human ones.