
Detection and flagging of articles that are AI/LLM-generated
Open, Needs Triage · Public · Feature

Description

LLM stands for Large Language Model, the fancy term for ChatGPT-like AIs: you ask a question or give a topic, and the model outputs a fluent-sounding but often factually incorrect answer.

Newer users who don't know our rules have been creating articles using AI. One AFC reviewer reports having seen 60 drafts that were LLM-generated.

There are open-source detectors for LLM output that are very accurate. They'll give a probability, and many of the matches are above 99%. We should look into the feasibility of adding a feature to PageTriage that tags articles containing LLM-generated text.

Technically, there are several approaches:

  • We could create something similar to the pagetriagetagcopyvio API
  • We could find an open-source detector and incorporate its code, assuming it is not resource-intensive, i.e. it doesn't require a farm of AIs or something.
  • We could create a third-party tool similar to Earwig's copyvio detector
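Whichever approach is taken, the core of the PageTriage integration would be turning a detector's probability score into a tag decision. A minimal sketch of that step, assuming a hypothetical helper (the function name, result structure, and the 0.99 threshold are illustrative assumptions taken from the ">99%" figure above, not existing PageTriage code):

```python
# Minimal sketch: turn a detector's AI-probability score into a tag
# decision. Names, threshold, and result shape are illustrative
# assumptions, not an existing PageTriage interface.

FLAG_THRESHOLD = 0.99  # many matches reportedly score above 99%


def llm_tag_decision(ai_probability: float, threshold: float = FLAG_THRESHOLD) -> dict:
    """Return a tag decision given a detector's probability that text is AI-generated."""
    if not 0.0 <= ai_probability <= 1.0:
        raise ValueError("ai_probability must be in [0, 1]")
    return {
        "flag": ai_probability >= threshold,
        "ai_probability": ai_probability,
        "threshold": threshold,
    }
```

For example, `llm_tag_decision(0.995)["flag"]` would be `True`, while a borderline score like 0.5 would not trigger a tag; the threshold would need tuning against real drafts given the false-positive concerns raised below.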

Credit to S0091 for the idea.

Event Timeline

Restricted Application added a subscriber: Aklapper.

I like the idea, but suspect this will be a pretty large undertaking.

There are open-source detectors for LLM output that are very accurate. They'll give a probability, and many of the matches are above 99%.

Could you list a few? OpenAI's classifier for detecting AI-written text has a 26% true positive rate.

Sure. The following page appears to be a good summary.

https://en.wikipedia.org/wiki/Wikipedia:Using_neural_network_language_models_on_Wikipedia#Countermeasures

Algorithms
In a demo by Hugging Face at [1] (based on RoBERTa), even with a heavily edited paragraph (such as those in § Copyediting paragraphs), the detector can recognize AI text and real text with extremely high confidence (>99%); make sure to remove the reference notes "[1], [2]" beforehand. Such a model can be extremely useful for ORES, a MediaWiki machine learning API primarily used to detect vandalism in Special:RecentChanges. Over time however, these models will have a harder time finding "abnormalities" as AI text generation becomes more sophisticated.

Websites offering detection services
https://gptzero.me/
https://www.zerogpt.com/
https://openai-openai-detector.hf.space/
https://detector.dng.ai/
https://contentatscale.ai/ai-content-detector/
https://corrector.app/ai-content-detector/
https://writer.com/ai-content-detector/
https://etedward-gptzero-main-zqgfwb.streamlit.app/

https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Articles_for_creation#ChatGPT_and_other_AI_generated_drafts has some additional discussion. @Qwerfjkl offers to write a tool or bot to run this on new drafts, and it is mentioned that GPTZero has an API. See bottom of https://gptzero.me/
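If a bot were to use GPTZero's API to scan new drafts, the request it builds might look roughly like the sketch below. The endpoint URL, header name, and JSON field names are assumptions for illustration; the provider's current API documentation would need to be consulted before building on them.

```python
# Hypothetical sketch of how a draft-scanning bot might query a hosted
# detector such as GPTZero. The endpoint URL, auth header, and JSON
# field names below are assumptions, not a verified API contract.
import json
import urllib.request


def build_detection_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP request asking a detector to score `text`."""
    payload = json.dumps({"document": text}).encode("utf-8")
    return urllib.request.Request(
        "https://api.gptzero.me/v2/predict/text",  # assumed endpoint
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
```

The bot would send this request, read a probability out of the JSON response, and then decide whether to tag the draft; rate limits and API costs would matter at AFC scale.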

Tagging Machine-Learning-Team for awareness; or maybe they already have something like this on their roadmap?

Tgr added subscribers: HaeB, Tgr.

FWIW that page just got updated to say the opposite (by @HaeB, whose opinion I'd trust on this topic). My vague recollection from examples I have seen is also that these tools aren't really reliable (and while ChatGPT has been RLHF-ed into a recognizable style, and it takes some effort to make it abandon that, other LLMs are all over the place).