Page MenuHomePhabricator

<div> attributes re-ordered in a DiscussionTools edit
Closed, ResolvedPublic

Description

<div> attributes re-ordered in a DiscussionTools edit: https://es.wikipedia.org/?diff=138322009

I don't complain about dirty diffs in broken HTML markup, but this is perfectly valid markup that probably shouldn't cause dirty diffs.

Also, a honorable mention (probably not actionable): https://es.wikipedia.org/?diff=138318189. This diff looks like much worse corruption at a glance, but the original markup here is actually broken (and the new version is equivalent to the old), and it only looks so weird because of, once again, attributes being re-ordered.

Event Timeline

I don't complain about dirty diffs in broken HTML markup, but this is perfectly valid markup that probably shouldn't cause dirty diffs.

"width" isn't a valid attribute for a div though and gets sanitized away <div ... data-parsoid='{"stx":"html","a":{"width":null},"sa":{"width":"95%"}}'>...</div>
https://github.com/wikimedia/parsoid/blob/master/src/Core/Sanitizer.php#L210
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/div#attributes

Also, a honorable mention (probably not actionable)

:)

But these are cases where only the children are being changed so presumably the start tag source can be reused. Not sure why selser isn't doing that. I'll look into it.

Change 721101 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Loosen constraints on wrapper unmodified

https://gerrit.wikimedia.org/r/721101

But these are cases where only the children are being changed so presumably the start tag source can be reused.

Seems it's the autoInsertedEnd for those divs that are preventing the reuse.

Arlolra triaged this task as Medium priority.Sep 14 2021, 11:53 PM
Arlolra moved this task from Needs Triage to Bugs & Crashers on the Parsoid board.

Change 721101 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Loosen constraints on wrapper unmodified

https://gerrit.wikimedia.org/r/721101

Hmm, does this patch also prevent dirty diffs like these, particularly the insertion of the closing tag? (I'm guessing based on the test cases that it might, which would be great!)

In some ways, occasional dirty diffs that fix up the bad markup is not a bad thing, but I know tolerances for these vary. :)

Hmm, does this patch also prevent dirty diffs like these, particularly the insertion of the closing tag?

It should fix the whitespace diff in the start tag, yes, but not the closing tag. I can look into what's on with that though

Hmm, does this patch also prevent dirty diffs like these, particularly the insertion of the closing tag?

It should fix the whitespace diff in the start tag, yes, but not the closing tag. I can look into what's on with that though

In general, it is a bad idea to rely on autoInsertedStart/End flags in edited content. There is actually a phab task (T252464, interestingly based on a bug report filed by @matmarex :)) to reduce our usages. So, I would be cautious here.

And, where @matmarex complains: "The same problem occurs with an unclosed table {| ... where the closing |} is not inserted." .... ;-)

Change 722416 had a related patch set uploaded (by Sbailey; author: Sbailey):

[mediawiki/vendor@master] Bump parsoid to 0.14.0-a19

https://gerrit.wikimedia.org/r/722416

Change 722416 merged by jenkins-bot:

[mediawiki/vendor@master] Bump parsoid to 0.14.0-a19

https://gerrit.wikimedia.org/r/722416