
Introduce an LLM-specific version of Paste Check
Open, Needs Triage | Public | 8 Estimated Story Points

Description

In T382608#11658361, we learned that 1) popular LLMs continue to append metadata to content people copy and paste from them and 2) this "metadata" continues to evolve.

This task involves the work of implementing an LLM-specific version of Paste Check that leverages this metadata.

NOTE: we recognize the instability of this metadata will lead to false negatives (read: the Check not appearing in cases when people paste text directly from LLMs). Even still, the Editing Team thinks the awareness experienced volunteers would gain, and productive friction newcomers would encounter, as a result of this Check seems worthwhile.
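As a rough illustration of what "leveraging this metadata" could look like, the sketch below checks pasted HTML against a table of per-source marker patterns. Everything named here is an assumption made for illustration: the marker patterns, the source names, `HYPOTHETICAL_MARKERS`, and `detect_llm_source` are invented and are not the detectors in the VisualEditor patch. The real metadata differs per LLM and changes over time, which is why a miss (a false negative) is always possible.

```python
import re
from typing import Optional

# HYPOTHETICAL marker patterns, invented for illustration only.
# Real LLM web interfaces attach different (and evolving) metadata.
HYPOTHETICAL_MARKERS = {
    "example-llm-a": re.compile(r'data-example-llm-a\b'),
    "example-llm-b": re.compile(r'class="[^"]*\bexample-llm-b-message\b'),
}


def detect_llm_source(pasted_html: str) -> Optional[str]:
    """Return the name of the first source whose marker appears, else None."""
    for source, pattern in HYPOTHETICAL_MARKERS.items():
        if pattern.search(pasted_html):
            return source
    # No known marker found: either the text is not from an LLM, or the
    # metadata was stripped or has changed (a false negative).
    return None
```

Under these assumptions, a paste that arrives as plain text, or whose markup no longer matches any pattern, returns `None` and the Check would not appear.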

Stories

  • As someone who is pasting text into a Wikipedia article that I've copied from an LLM in good faith, I want to know what policies/guidelines are relevant to me doing so and what I ought to consider doing in response, so that I can be more confident other volunteers will consider the changes I'm making to be constructive
  • As an experienced editor reviewing recent changes, I want to know what edits might contain content pasted from an LLM, so that I can A) evaluate the extent to which they might be in violation of LLM-specific Wikipedia policies [1][2][3][4][5][6][7][8] and B) more efficiently review them for issues such as unverifiable claims, hallucinated references, etc.

Open question(s)

  1. What policy/guideline would each Wikipedia like to include within this Check?
  2. What options should appear in the decline survey that appears when people elect to Keep the text they're pasting?
  3.

Requirements

Default configuration

User experience
Check card

  • .
  • .
  • .

Decline survey

  • .
  • .
  • .

Work in Progress

  1. Visit https://564a50573d.catalyst.wmcloud.org/wiki/Regent's_Park?veaction=edit&ecenable=1 on desktop or mobile
  2. Copy at least one full paragraph of text from the web interface of ChatGPT, Claude or Gemini
  3. Paste the text you copied in "2." into the edit session you started in "1."
  4. ✅ Notice the Potential AI-generated content Check appear

References

  • Pangram: strives to detect content generated by AI
  • SynthID: watermarks and seeks to identify content generated through AI

  1. https://en.wikipedia.org/wiki/Wikipedia:Writing_articles_with_large_language_models
  2. fa: Article creation with large language models
  3. zh: Writing articles using large language models
  4. ru: Neuronetwork
  5. uk: Writing articles using LLMs
  6. az: Use of artificial intelligence
  7. es: Drafting articles with large language models
  8. vi: Writing articles using large language models

Event Timeline

ppelberg set the point value for this task to 8. (Mar 16 2026, 6:45 PM)

During the 16 March 2026 Editing Team meeting, we estimated the engineering work to be relatively straightforward and the work involved with converging on the UX (including copy and policy/guideline links) to be complex.

Change #1254182 had a related patch set uploaded (by Esanders; author: Esanders):

[VisualEditor/VisualEditor@master] Add LLM paste source detectors

https://gerrit.wikimedia.org/r/1254182

the instability of this metadata will lead to false positives (read: the Check not appearing in cases when people paste text directly from LLMs)

You actually mean "false negatives": it does not appear (negative), and it is false.

Change #1254182 merged by jenkins-bot:

[VisualEditor/VisualEditor@master] Add LLM paste source detectors

https://gerrit.wikimedia.org/r/1254182

Change #1268700 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (2f5c8c924)

https://gerrit.wikimedia.org/r/1268700

Change #1268700 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (2f5c8c924)

https://gerrit.wikimedia.org/r/1268700

Paste check includes a survey if you click "keep" - does LLM check need a survey too and if so are the options the same?

image.png (244×285 px, 20 KB)

Change #1285882 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/VisualEditor@master] [WIP] PasteCheck: Show different messages when AI source detected

https://gerrit.wikimedia.org/r/1285882

Paste check includes a survey if you click "keep" - does LLM check need a survey too and if so are the options the same?

@Esanders: great spot. Yes, a survey will be needed. Exact options TBD. I've updated the task description to hold us accountable to defining these options.

In parallel, I'd like to share the patch demo you helpfully created with volunteers for feedback. Before doing so, is it accurate for me to think the testing instructions are as follows?

  1. Visit https://564a50573d.catalyst.wmcloud.org/wiki/Regent's_Park?veaction=edit&ecenable=1 on desktop or mobile
  2. Copy at least one full paragraph of text from the web interface of ChatGPT, Claude or Gemini
  3. Paste the text you copied in "2." into the edit session you started in "1."
  4. ✅ Notice the Potential AI-generated content Check appear

This looks amazing! Btw, there's a list of each project's policies on AI: https://meta.wikimedia.org/wiki/Artificial_intelligence/Policies_by_project

On enwiki, a decent amount of LLM use is by people not fluent/confident enough to write English without assistance. I don't know whether a message saying something like "Some editors use AI because they are not fluent enough in [the language]. If this is the case, you may find it easier to contribute to a Wikipedia in your native language.", linking to https://meta.wikimedia.org/wiki/List_of_Wikipedias, would be helpful (or even appropriate). I guess we would test that by how often the link gets clicked?

Wonderful!!! One observation from testing: choosing "No, remove it" from the popup doesn't always remove all of the bullet points or blockquotes from my pasted-in LLM text. I notice that in such cases, the bullets/quotes are not highlighted yellow as "possible AI-generated text" in the first place, so it may have to do with how the detection is working. It may not be a problem pragmatically -- the text is easy to delete manually, and hopefully someone choosing "No, remove it" is already really rethinking their edit -- but it has a "clunky" feeling.
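One way to picture the behaviour described above is the sketch below. It assumes, purely for illustration, that "No, remove it" deletes only the blocks the detector highlighted; `PastedBlock`, `flagged`, and `remove_flagged` are hypothetical names, and this is not a description of how VisualEditor actually implements the Check.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PastedBlock:
    kind: str      # e.g. "paragraph", "bullet", "blockquote"
    text: str
    flagged: bool  # True if the (hypothetical) detector highlighted this block


def remove_flagged(blocks: List[PastedBlock]) -> List[PastedBlock]:
    """Keep only the blocks that were never highlighted."""
    return [block for block in blocks if not block.flagged]


pasted = [
    PastedBlock("paragraph", "Intro text", flagged=True),
    PastedBlock("bullet", "First point", flagged=False),
    PastedBlock("blockquote", "Quoted line", flagged=False),
]

leftover = remove_flagged(pasted)
print([block.kind for block in leftover])  # ['bullet', 'blockquote']
```

If removal were tied to the highlighted ranges in this way, any bullet or blockquote the detector skipped would be left behind, which would match the "clunky" leftover text reported here.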

Reporting my positive results too: it popped up as desired for multiple-paragraph text from Claude (manual copy-paste but not "copy button") and Gemini (both forms of copying).

Fantastic work!

I've been testing it on mobile:

  • The standard way I paste on mobile is using the clipboard. It does not trigger either edit check.
  • When I paste using the paste button, AI detection works for Gemini, and partially for Claude and ChatGPT (not with their copy button, but yes if I manually copy).
  • When you paste substantial content, the check is off-screen at the top of the pasted content, so it's easy to miss (I assume this is true for all the checks, but most noticeable for paste check).