
References pasted from read mode should be dropped until we can support them properly
Closed, Resolved · Public · 1 Story Points

Event Timeline

Restricted Application added a subscriber: Aklapper. · Nov 10 2016, 11:44 AM

If we had an HTML-based, per-target blacklist, we could probably safely filter out something like sup.reference, or sup.reference:not([typeof]).
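To make the selector idea concrete, here is a minimal sketch in Python of dropping elements that match sup.reference:not([typeof]) from pasted HTML. It is not VisualEditor's implementation (which is JavaScript); the names ReferenceFilter and drop_read_mode_references are hypothetical, and the sample markup in the usage note is illustrative only.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def is_read_mode_reference(tag, attrs):
    """Rough equivalent of the selector sup.reference:not([typeof])."""
    attr_map = dict(attrs)
    classes = (attr_map.get("class") or "").split()
    return tag == "sup" and "reference" in classes and "typeof" not in attr_map


class ReferenceFilter(HTMLParser):
    """Hypothetical filter: re-emits HTML, skipping matching <sup> subtrees.

    Comments and doctype declarations are dropped in this sketch.
    """

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if is_read_mode_reference(tag, attrs):
            self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def drop_read_mode_references(markup):
    parser = ReferenceFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

For example, given a read-mode citation (no typeof attribute) followed by a <sup> that does carry a typeof attribute, only the first is removed; the second is kept because the :not([typeof]) part of the selector excludes it.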

Change 320764 had a related patch set uploaded (by Esanders):
Setup htmlBlacklist and add rule for read-mode MW references

https://gerrit.wikimedia.org/r/320764

Jdforrester-WMF triaged this task as Normal priority.
Jdforrester-WMF set the point value for this task to 1.
Jdforrester-WMF moved this task from To Triage to TR1: Releases on the VisualEditor board.
Jdforrester-WMF closed this task as Resolved. · Nov 17 2016, 5:27 PM
Jdforrester-WMF removed a project: Patch-For-Review.

Change 320764 merged by jenkins-bot:
Setup htmlBlacklist and add rule for read-mode MW references

https://gerrit.wikimedia.org/r/320764

Change 431655 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/VisualEditor@master] Add comment to htmlBlacklist item

https://gerrit.wikimedia.org/r/431655

Restricted Application added a project: User-Ryasmeen. · May 7 2018, 8:50 PM

Change 431655 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Add comment to htmlBlacklist item

https://gerrit.wikimedia.org/r/431655

Change 534480 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/VisualEditor@master] Fix HTML blacklist inheritance

https://gerrit.wikimedia.org/r/534480

Esanders reopened this task as Open. · Sep 4 2019, 4:16 PM

Looks like this fix regressed during a recent refactor on target inheritance. The above patch should fix it again.
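The thread does not say exactly how the refactor lost the rule. As an assumed illustration of one common way a per-target setting can regress under inheritance, the Python sketch below shows a subclass that redefines a class-level rules dictionary and thereby shadows the parent's blacklist, next to a variant that merges it explicitly. All class and key names here are hypothetical and are not taken from VisualEditor.

```python
class BaseTarget:
    # Hypothetical stand-in for a base target's paste/import rules.
    import_rules = {
        "html_blacklist": ["sup.reference:not([typeof])"],
    }


class RegressedTarget(BaseTarget):
    # Redefining the whole dict shadows the parent's attribute,
    # so the inherited blacklist silently disappears.
    import_rules = {"remove_styles": True}


class FixedTarget(BaseTarget):
    # Merging with the parent's rules keeps the blacklist in place.
    import_rules = {**BaseTarget.import_rules, "remove_styles": True}


def blacklist_for(target_cls):
    """Return the selector blacklist that a target class would apply."""
    return target_cls.import_rules.get("html_blacklist", [])
```

A unit test that asserts on blacklist_for for every concrete target would catch this kind of regression early.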

Change 534480 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Fix HTML blacklist inheritance

https://gerrit.wikimedia.org/r/534480

Change 534487 had a related patch set uploaded (by Jforrester; owner: Esanders):
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.21] Fix HTML blacklist inheritance

https://gerrit.wikimedia.org/r/534487

Change 534488 had a related patch set uploaded (by Jforrester; owner: Esanders):
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.20] Fix HTML blacklist inheritance

https://gerrit.wikimedia.org/r/534488

Change 534494 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/VisualEditor@master] Add unit tests for read-mode reference filter

https://gerrit.wikimedia.org/r/534494

Esanders moved this task from Incoming to QA on the VisualEditor (Current work) board.

Change 534487 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.21] Fix HTML blacklist inheritance

https://gerrit.wikimedia.org/r/534487

Change 534488 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@wmf/1.34.0-wmf.20] Fix HTML blacklist inheritance

https://gerrit.wikimedia.org/r/534488

Mentioned in SAL (#wikimedia-operations) [2019-09-04T17:45:23Z] <jforrester@deploy1001> Synchronized php-1.34.0-wmf.21/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: T150418 Fix HTML blacklist inheritance to avoid copy-pasted read <ref>s again (duration: 00m 56s)

Mentioned in SAL (#wikimedia-operations) [2019-09-04T17:47:33Z] <jforrester@deploy1001> Synchronized php-1.34.0-wmf.20/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: T150418 Fix HTML blacklist inheritance to avoid copy-pasted read <ref>s again (duration: 00m 57s)

I wrote a bot to fix this error when it shows up in the wikitext. It was non-trivial because the bot has to determine the underlying citation from its number. The most recent regression injected about 3000 bad citations on enwiki, which the bot has fixed. There are probably more on other wikis, and in non-mainspace. If the bot is needed again, it is available at https://en.wikipedia.org/wiki/User:GreenC_bot/Job_18

Green_Cardamom added a comment. · Edited · Sep 8 2019, 4:43 PM

This may still be active under certain conditions:

https://en.wikipedia.org/w/index.php?title=Special:AbuseLog&wpSearchFilter=861

It was added in this diff:

https://en.wikipedia.org/w/index.php?title=Prospect_theory&type=revision&diff=914509204&oldid=913948047

I asked the editor how the edit was made:

https://en.wikipedia.org/wiki/User_talk:7804j#VisualEditor_bug_question

It appears the content was copy-pasted either:

(2) from another paragraph of the same article using the visual editor, or (3) from another paragraph of the same article using the visual editor, but opened in a new tab (i.e., with two tabs of the same article opened on my browser)

The garbled text did not exist anywhere beforehand; it was newly generated.

matmarex moved this task from Inbox to Low Priority on the Editing QA board. · Sep 9 2019, 10:59 PM

Change 535586 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/VisualEditor@master] Use MW import rules in MW tests

https://gerrit.wikimedia.org/r/535586

Change 535586 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Use MW import rules in MW tests

https://gerrit.wikimedia.org/r/535586

Change 534494 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Add unit tests for read-mode reference filter

https://gerrit.wikimedia.org/r/534494

ppelberg added a subscriber: ppelberg. · Edited · Wed, Oct 9, 1:31 AM

@Esanders, how'd you arrive at not providing contributors any feedback about their attempted paste?

Reason for my question: I found it confusing that, despite nothing being shown on VE's edit surface, my attempted paste seemed to have some effect considering it activates the "Publish changes" button (watch this video, beginning at 0:09).

I'd have assumed that, rather than pasting nothing, we'd paste the reference in plaintext to communicate to contributors something like: copy and paste is not broken; however, copying and pasting this type of content (in this case, a reference from read mode) is not supported.

Izno moved this task from Unsorted backlog to External on the Cite board. · Sat, Oct 12, 12:19 PM

Technically, pasting from Wikipedia read mode is no different from pasting from an external website, and many sanitisations happen during the paste of external HTML, including:

  • Removing certain tags that aren't editable in VE, and probably not intended to be preserved: <u>, <time>, <lang>, <span>, <font>, <fieldset> ...
  • Removing additional tag attributes that also aren't editable and may be adding unwanted styling (font size/colour)
  • Removing external links
  • Removing Wiki read mode citations
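As a rough illustration of the first two rules in the list above, the Python sketch below unwraps the listed non-editable tags (keeping their text) and drops styling attributes. It is not VisualEditor's code: the choice to unwrap rather than delete, the DROP_ATTRS set, and the names PasteSanitizer and sanitize_paste are all assumptions made for the example. External links and citations are not handled here.

```python
import html
from html.parser import HTMLParser

# Tags named in the list above; their contents are kept (an assumption).
UNWRAP_TAGS = {"u", "time", "lang", "span", "font", "fieldset"}
# Styling attributes to drop; this particular set is an assumption.
DROP_ATTRS = {"style", "color", "size", "face"}


class PasteSanitizer(HTMLParser):
    """Hypothetical sanitizer: unwraps some tags and strips styling attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in UNWRAP_TAGS:
            return  # drop the tag itself, keep whatever is inside it
        kept = [(k, v) for k, v in attrs if k not in DROP_ATTRS]
        rendered = "".join(
            f' {k}="{html.escape(v or "", quote=True)}"' for k, v in kept
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag not in UNWRAP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape text.
        self.out.append(html.escape(data, quote=False))


def sanitize_paste(markup):
    parser = PasteSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Because each rule rewrites the markup silently, a single paste can pass through several of them without the user seeing any indication of which ones fired.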

If the user pastes a large block of text, they may trigger several of these sanitisation rules, so rather than trying to display a large warning listing all of them, it is just understood that you can't paste just anything into VE.

Note that other rich content will not paste "correctly" into VE, such as templates, images, and extensions (code blocks, math equations).

Other than by detecting things like citations, it would not be generically possible to know whether the pasted content had come from Wikipedia or any other site; it is just regular HTML.

When we switch to Parsoid HTML for read mode, it should be possible to preserve all rich content, including references.

ppelberg closed this task as Resolved. · Edited · Wed, Oct 23, 1:05 AM

I'm marking this task as "Resolved" since this patch works as intended.

We will look at whether this behavior should be revisited in this task: T236220

👆 The above is an outcome of @Esanders and my conversation earlier today.