Lecturer detects bot use in one-fifth of assessments as concerns mount over AI in exams
An associate communications lecturer at Deakin University has detected the use of bots in almost one-fifth of assessments, sparking concerns that the use of artificial intelligence to cheat in exams is widespread.
Of 54 summer postgraduate assessments Sally Brandon marked, 10 had “significant, detectable bot assistance” – the highest rate in the five years she has used software to detect bots.
ChatGPT – the latest viral chatbot software – has caused internal alarm for its ability to evade plagiarism detection tools. The software Brandon uses detects a range of bot assistance, including software that helps improve AI-generated text.
Tech developers who have created software they claim detects text composed by ChatGPT have had mixed results.
Brandon is awaiting confirmation as to whether the results of her bot detector will count as a formal breach of academic integrity.
“It’s still learning and the more we play with it, the more it learns,” Brandon said.
“The big issue is how do educators revise assessments in light of this technology – which is not going away – so that we can fairly and accurately assess student competencies.
“I’m looking at how it could be incorporated in learning design and how assessment needs to be revised to account for its use.”
Dr Anna Bunn, a senior law lecturer at Curtin University, said the chatbot would “revolutionise” education but appeared to be more accurate when answering US-centric questions.
When she asked the AI to answer a question on the tort of negligence specific to Australian law, there were factual inaccuracies and made-up references.
Pushing the technology further, she asked it to describe the recent supreme court case on abortion in the US. The dataset ChatGPT is currently trained on ends in 2021, before the ruling was handed down.
It apologised and said it couldn’t find any case with that name.
“I’m happy about that, but I don’t think it’s going to necessarily always be the case,” she said. “The technology is quite incredible, in some cases it’s not just a pass, it’s a good answer.
“There are some really amazing uses even for academics … but it does worry me. It will revolutionise the way we assess … I’ll be looking to increase oral assessments.”
Dilan Thampapillai, associate dean at the University of NSW business school, has tried a few searches with the chatbot and also found it had misidentified information in multiple results.
“It’s only as good as the input data,” he said. “If that’s flawed then its outputs will be as well.”
ChatGPT currently doesn’t have access to search engines such as Google, instead drawing its information from the data it was trained on, which includes material from academic sites such as Trove.
Other similar software, including ChatSonic, can use live Google data in its responses.
“If it gets full access to the internet, with no constraints, then it actually could become a major privacy problem,” Thampapillai said. “It could also end up disseminating false information because it seems to have a misidentification problem.”
Beep Media digital producer Andrew Wrathall asked the bot to write a referenced essay on the appearance of Halley’s comet in regional Victoria.
He found the chatbot had fabricated references to the academic website Trove and attributed to the local paper quotes that were actually drawn from other newspapers and articles.
“What I believe is happening is … the machine learning has created associations between words and is retrieving the associations from memory to create an article that seems real using grammar rules,” Wrathall said. “But the AI doesn’t have the nuance to know the sentence isn’t factual.”
Computer scientist and Curtin University associate professor Nik Thompson said ChatGPT excelled at generating coherent responses to simple prompts – such as short reports – but still had no capacity for critical thinking or analysis.
“The language model is trained on a large dataset to learn the statistical structure of language and internally the responses are just a mathematical function,” he said.
“The goal is for the output to sound coherent, with less emphasis on whether it’s factually correct … When there are gaps, it’s not uncommon for the AI to just fabricate something that fits into the required structure.
“There could be a positive outcome if educators are now inspired to rethink the structure and role of assessments in their teaching to create authentic assessments that build on these higher level skills.”
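Thompson's description of a model that learns the statistical structure of language, and that fills gaps with whatever fits, can be illustrated with a toy sketch. The example below is not how ChatGPT is built (ChatGPT is a large neural network); it is a minimal bigram model, assuming a tiny invented corpus, and the names `CORPUS`, `train_bigrams` and `generate` are hypothetical. It shows only the general idea: the program picks each next word because it has been seen after the previous one, with no step that checks whether the resulting sentence is true.

```python
import random
from collections import defaultdict

# Tiny hypothetical corpus, invented purely for illustration.
CORPUS = (
    "the comet was seen over the town . "
    "the comet was reported in the paper . "
    "the paper reported the comet over the town ."
)


def train_bigrams(text):
    """Map each word to the list of words that followed it in the text."""
    followers = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, max_words=12, seed=0):
    """Build a sentence by repeatedly sampling a word seen after the last one."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < max_words and out[-1] in followers:
        out.append(rng.choice(followers[out[-1]]))
        if out[-1] == ".":
            break  # stop at the end of a sentence
    return " ".join(out)


model = train_bigrams(CORPUS)
sentence = generate(model, "the")
print(sentence)
```

Every adjacent pair of words in the output appeared somewhere in the training text, so the result tends to read as grammatical. The sentence as a whole, however, may be one that never appeared in the corpus, which is a rough analogue of the plausible-sounding but fabricated references described above.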