Page MenuHomePhabricator

Dirty diffs in headings in edits made with reply tool
Closed, ResolvedPublic

Description

Over the last few days there was a lot of dirty diffs in edits made with reply tool, mostly within headings (e.g. adding/removing spaces, removing leading : in links). Examples:

And more found at https://dtcheck.toolforge.org/.

image.png (315×1 px, 44 KB)

Timing matches up suspiciously well with 884013 ApiDiscussionToolsEdit: Unwrap Parsoid sections before parsing, but it could be something else in 1.40.0-wmf.20.

Event Timeline

Unwrap Parsoid sections seems like it would potentially change data-parsoid and trigger noramlisation, but also interesting that this isn't happening in VE (or is it?).

Unwrap Parsoid sections seems like it would potentially change data-parsoid and trigger noramlisation,

Parsoid strips all section tags first thing in original and edited HTML before running dom diffs, etc. So, at first glance, my hunch is that this shouldn't matter (unless the section stripping in DT does something beyond stripping section tags).

but also interesting that this isn't happening in VE (or is it?).

It might be useful to try to repro this in VE by adding a test page with extra whitespace or something in headings and seeing if this gets normalized and check if this is common to both or isolated to DT.

VE also calls stripParsoidFallbackIds(), which sounds like it might be relevant: https://gerrit.wikimedia.org/g/mediawiki/extensions/VisualEditor/+/2fdac8851c74f9dbee6f7a6f117a1c1bdf9dc535/modules/ve-mw/init/targets/ve.init.mw.Target.js#230 (and also some other things that don't look relevant)

Consistently reproducible by adding a comment directly before a heading: https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Talk%3AT328268&diff=prev&oldid=572654

I can also reproduce locally.

Change 885059 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@master] Don't add custom attributes in unwrapParsoidSections()

https://gerrit.wikimedia.org/r/885059

Change 885364 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/VisualEditor@master] Document need to remove data-mw-section-id attributes when saving

https://gerrit.wikimedia.org/r/885364

Change 885364 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Document need to remove data-mw-section-id attributes when saving

https://gerrit.wikimedia.org/r/885364

Change 885059 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Don't add custom attributes in unwrapParsoidSections()

https://gerrit.wikimedia.org/r/885059

I think I misjudged the severity of this problem. I thought it was causing just harmless syntax changes, but these bug reports describe more serious page corruption. I will backport the fix today. Sorry about that.

Change 886980 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/DiscussionTools@wmf/1.40.0-wmf.21] Don't add custom attributes in unwrapParsoidSections()

https://gerrit.wikimedia.org/r/886980

Change 886980 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@wmf/1.40.0-wmf.21] Don't add custom attributes in unwrapParsoidSections()

https://gerrit.wikimedia.org/r/886980

Mentioned in SAL (#wikimedia-operations) [2023-02-07T15:39:38Z] <urbanecm@deploy1002> Started scap: 20a79c55b7073e791e297a5389fa66819f596178: Don't add custom attributes in unwrapParsoidSections() (T328268)

Mentioned in SAL (#wikimedia-operations) [2023-02-07T15:47:13Z] <urbanecm@deploy1002> Finished scap: 20a79c55b7073e791e297a5389fa66819f596178: Don't add custom attributes in unwrapParsoidSections() (T328268) (duration: 07m 34s)