Marley Stevens posted a video on TikTok last semester that she described as a public service announcement for college students. Her message: Don’t use grammar-checking software if your professor might run your paper through an AI-detection system.
Stevens is a junior at the University of North Georgia, and she has been unusually public about what she calls a “debacle”: She was accused of using AI to write a paper that she says she composed herself, aside from standard grammar- and spell-checking suggestions from Grammarly, which she has installed as an extension on her web browser.
That initial warning video has been viewed more than 5.5 million times, and she has since made more than 25 follow-up videos answering comments from followers and documenting her battle with the college, including sharing pictures of emails from academic deans and images of her student work to try to prove her case. Her aim, she says, is to raise awareness of what she sees as faulty AI-detection tools that colleges increasingly sanction and professors increasingly use.
Stevens says that a professor in a criminal justice course she took last year gave her a zero on a paper because he said that the AI-detection system in Turnitin flagged it as robot-written. Stevens insists the work is entirely her own and that she did not use ChatGPT or any other chatbot to compose any part of her paper.
The zero on the paper, she says, dragged her final grade in the class low enough to disqualify her from the HOPE Scholarship, which requires students to maintain a 3.0 GPA. The university also placed her on academic probation for violating its policies on academic misconduct, she says, and required her to pay $105 to attend a seminar about cheating.
The university declined repeated requests from EdSurge to talk about its policies for using AI detection. Officials instead sent a statement saying that federal student privacy laws prevent them from commenting on any individual cheating incident, and that: “Our faculty communicate specific guidelines regarding the use of AI for various classes, and those guidelines are included in the class syllabi. The inappropriate use of AI is also addressed in our Student Code of Conduct.”
That code of conduct defines plagiarism as: “Use of another person or agency’s (to include Artificial Intelligence) ideas or expressions without acknowledging the source. Themes, essays, term papers, tests and other similar requirements must be the work of the Student submitting them. When direct quotations or paraphrase are used, they must be indicated, and when the ideas of another are incorporated in the paper they must be appropriately acknowledged. All work of a Student needs to be original or cited according to the instructor's requirements or is otherwise considered plagiarism. Plagiarism includes, but is not limited to, the use, by paraphrase or direct quotation, of the published or unpublished work of another person without full and clear acknowledgement. It also includes the unacknowledged use of materials prepared by another person or agency in the selling of term papers or other academic materials.”
The incident raises complex questions about where to draw lines regarding new AI tools. When are they merely helping in acceptable ways, and when does their use mean academic misconduct? After all, many people use grammar and spelling autocorrect features in systems like Google Docs and other programs that suggest a word or phrase as users type. Is that cheating?
And as grammar features grow more robust and generative AI tools become more mainstream, can AI-detection tools possibly tell the difference between acceptable AI use and cheating?
“I’ve had other teachers at this same university recommend that I use [Grammarly] for papers,” Stevens said in another video. “So are they trying to tell us that we can’t use autocorrect or spell checkers or anything? What do they want us to do, type it into, like, a Notes app and turn it in that way?”
In an interview with EdSurge, the student put it this way:
“My whole thing is that AI detectors are garbage and there’s not much that we as students can do about it,” she says. “And that’s not fair because we do all this work and pay all this money to go to college, and then an AI detector can pretty much screw up your whole college career.”
Twists and Turns
Along the way, this University of North Georgia student’s story has taken some surprising turns.
For one, the university issued an email to all students about AI not long after Stevens posted her first viral video.
That email reminded students to follow the university’s code of academic conduct, and it also had an unusual warning: “Please be aware that some online tools used to assist students with grammar, punctuation, sentence structure, etc., utilize generative artificial intelligence (AI); which can be flagged by Turnitin. One of the most commonly used generative AI websites being flagged by Turnitin.com is Grammarly. Please use caution when considering these websites.”
The professor later told her that he had also checked her paper with another tool, Copyleaks, which likewise flagged it as bot-written. Yet when Stevens ran the same paper through Copyleaks recently, she says, it deemed the work human-written. She sent this reporter a screenshot from that process, in which the tool concludes, in green text, “This is human text.”
“If I’m running it through now and getting a different result, that just goes to show that these things aren’t always accurate,” she says of AI detectors.
Officials from Copyleaks did not respond to requests for comment. Stevens declined to share the full text of her paper, explaining that she did not want it to wind up on the internet, where other students could copy it and possibly land her in more trouble with her university. “I’m already on academic probation,” she says.
Stevens says she has heard from students across the country who say they have also been falsely accused of cheating due to AI-detection software.
“A student said she wanted to be a doctor but she got accused, and then none of the schools would take her because of her misconduct charge,” says Stevens.
Stevens says she has been surprised by the amount of support she has received from people who watch her videos. Her followers on social media encouraged her to set up a GoFundMe campaign, which she did to cover the loss of her scholarship and to pay for a lawyer to potentially take legal action against the university. So far she has raised more than $6,100 from more than 90 people.
She was also surprised to be contacted by officials from Grammarly, who gave $4,000 to her GoFundMe and hired her as a student ambassador. Stevens now plans to make three promotional videos for Grammarly, and she will be paid a small fee for each.
“At this point we’re trying to work together to get colleges to rethink their AI policies,” says Stevens.
For Grammarly, it seems clear that the goal is to change the narrative from that first video by Stevens, in which she said, “If you have a paper, essay, discussion post, anything that is getting submitted to TurnItIn, uninstall Grammarly right now.”
Grammarly’s head of education, Jenny Maxwell, says that she hopes to spread the message about how inaccurate AI detectors are.
“A lot of institutions at the faculty level are unaware of how often these AI-detection services are wrong,” she says. “We want to make sure that institutions are aware of just how dangerous having these AI detectors as the single source of truth can be.”
Such flaws have been well documented, and several researchers have said professors shouldn’t use the tools. Even Turnitin has publicly stated that its AI-detection tool is not always reliable.
Annie Chechitelli, Turnitin’s chief product officer, says the company’s tests show its AI-detection tool has about a 1 percent false positive rate, and that it is working to push that rate as low as possible.
“We probably let about 15 percent [of bot-written text] go by unflagged,” she says. “We would rather turn down our accuracy than increase our false-positive rate.”
Chechitelli stresses that educators should use Turnitin’s detection system as a starting point for a conversation with a student, not as a final ruling on the academic integrity of the student’s work. And she says that has been the company’s advice for its plagiarism-detection system as well.
“We very much had to train the teachers that this is not proof that the student cheated,” she says. “We’ve always said the teacher needs to make a decision.”
AI puts educators in a more challenging position for that conversation, though, Chechitelli acknowledges. In cases where Turnitin’s tool detects plagiarism, the system points to source material that the student may have copied. In the case of AI detection, there’s no clear source material to look to, since tools like ChatGPT spit out different answers every time a user enters a prompt, making it much harder to prove that a bot is the source.
The Turnitin official says that in the company’s internal tests, traditional grammar-checking tools do not set off its alarms.
Maxwell, of Grammarly, points out that even if an AI-detection system is right 98 percent of the time, it still falsely flags, say, 2 percent of papers. And since a single university may have 50,000 student papers turned in each year, running all of them through such a detector would falsely label 1,000 papers as cases of cheating.
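The arithmetic behind that example is simple multiplication, sketched below. Both inputs are Maxwell’s illustrative figures rather than measured rates, and the calculation assumes every paper is human-written and every one is run through a detector:

```python
# Back-of-the-envelope version of Maxwell's example.
# Both inputs are her illustrative figures, not measured values.
false_positive_rate = 0.02  # detector wrongly flags 2% of human-written papers
papers_per_year = 50_000    # assumed annual paper volume at one university

# Expected number of honest papers flagged as AI-written in a year.
expected_false_flags = false_positive_rate * papers_per_year
print(f"{expected_false_flags:.0f} papers falsely flagged per year")  # 1000
```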
Does Maxwell worry that colleges might discourage the use of her product? After all, the University of North Georgia removed Grammarly from a list of recommended resources after Stevens’ TikTok videos went viral, though the university later restored it.
“We met with the University of North Georgia and they said this has nothing to do with Grammarly,” says Maxwell. “We are delighted by how many more professors and students are leaning the opposite way — saying, ‘This is the new world of work and we need to figure out the appropriate use of these tools.’ You cannot put the toothpaste back in the tube.”
For Tricia Bertram Gallant, director of the Academic Integrity Office at the University of California San Diego and a national expert on cheating, the most important issue in this student’s case is not the technology. The bigger question, she says, is whether colleges have effective systems for handling academic misconduct charges.
“I would be highly doubtful that a student would be accused of cheating just from a grammar and spelling checker,” she says, “but if that’s true, the AI chatbots are not the problem, the policy and process is the problem.”
“If a faculty member can use a tool, accuse a student and give them a zero and it’s done, that’s a problem,” she says. “That’s not a tool problem.”
She says that conceptually, AI tools aren’t any different than other ways students have cheated for years, such as hiring other students to write their papers for them.
“It’s strange to me when colleges are generating a whole separate policy for AI use,” she says. “All we did in our policy is adding the word ‘machine,’” she adds, noting that now the academic integrity policy explicitly forbids using a machine to do work that is meant to be done by the student.
She suggests that students should make sure to keep records of how they use any tools that assist them, even if a professor does allow the use of AI on the assignment. “They should make sure they’re keeping their chat history” in ChatGPT, she says, “so a conversation can be had about their process” if any questions are raised later.
A Fast-Changing Landscape
While grammar and spelling checkers have been around for years, many of them are now adding new AI features that complicate things for professors trying to understand whether students did the thinking behind the work they turn in.
For instance, Grammarly now has new options, most of them in a paid version that Stevens didn’t subscribe to, that use generative AI to do things like “help brainstorm topics for an assignment” or to “build a research plan,” as a recent press release from the company put it.
Maxwell, from Grammarly, says the company is trying to roll out those new features carefully, and is trying to build in safeguards to prevent students from just asking the bot to do their work for them. And she says that when schools adopt its tool, they can turn off the generative AI features. “I’m a parent of a 14-year-old,” she says, adding that younger students who are still learning the basics have different needs than older learners.
Chechitelli, of Turnitin, says it creates a problem for students that Grammarly and other productivity tools now integrate ChatGPT and do far more than fix the syntax of writing, because students may not understand the new features and their implications.
“One day they log in and they have new choices and different choices,” she says. “I do think it’s confusing.”
For the Turnitin leader, the most important message for students today is to be transparent about what help, if any, AI provided.
“My advice would be to be thoughtful about the tools that you’re using and make sure you could show teachers the evolution of your assignments or be able to answer questions,” she says.
Bertram Gallant, the national expert on academic integrity, says that professors do need to be aware of the growing number of generative AI tools that students have access to.
“Grammarly is way beyond grammar and spelling check,” she says. “Grammarly is like any other tool — it can be used ethically or it can be used unethically. It’s how they are used or how their uses are obscured.”
Bertram Gallant says that even professors run into these ethical boundaries in their own writing for academic journals. She says she has heard of professors who use ChatGPT when composing journal articles and then “forget to take out part where AI suggested ideas.”
There’s something seductive about the ease with which these new generative AI tools can spit out well-formatted text, she adds, and that can make people think they are doing work when all they are doing is putting a prompt into a machine.
“There’s this lack of self-regulation — for all humans but particularly for novices and young people — between when it’s assisting me and when it’s doing the work for me,” Bertram Gallant says.