Page MenuHomePhabricator

VE's gallery representation differs enough so that selser is never applied?
Closed, ResolvedPublic

Description

Starting from,
https://en.wikipedia.org/w/index.php?title=User:Arlolra/sandbox&oldid=880024323

copy(ve.init.target.doc.body.outerHTML)

<body id="mwAA" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr"><ul class="gallery mw-gallery-packed" typeof="mw:Extension/gallery" about="#mwt3" data-mw="{&quot;name&quot;:&quot;gallery&quot;,&quot;attrs&quot;:{&quot;mode&quot;:&quot;packed&quot;},&quot;body&quot;:{}}" id="mwAg">
<li class="gallerybox" style="width: 202px;"><div class="thumb" style="width: 200px;"><figure-inline typeof="mw:Image"><a href="./File:Test.jpg"><img resource="./File:Test.jpg" src="//upload.wikimedia.org/wikipedia/commons/thumb/5/5b/Name.jpg/300px-Name.jpg" data-file-width="350" data-file-height="210" data-file-type="bitmap" height="120" width="200" srcset="//upload.wikimedia.org/wikipedia/commons/5/5b/Name.jpg 2x, //upload.wikimedia.org/wikipedia/commons/5/5b/Name.jpg 1.5x"></a></figure-inline></div><div class="gallerytext">123</div></li>
</ul>

<p id="mwAw">456</p></body>

and making an edit to an unrelated portion of the page result in,

copy(ve.init.target.docToSave.body.outerHTML)

<body id="mwAA" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr"><ul class="gallery" typeof="mw:Extension/gallery" data-mw="{&quot;name&quot;:&quot;gallery&quot;,&quot;attrs&quot;:{&quot;mode&quot;:&quot;packed&quot;},&quot;body&quot;:{}}" about="#mwt3" id="mwAg">
<li class="gallerybox" style="width: 202px;"><div><div class="thumb"><div><a><img resource="File:Test.jpg" src="//upload.wikimedia.org/wikipedia/commons/thumb/5/5b/Name.jpg/300px-Name.jpg"></a></div></div></div><div class="gallerytext">123</div></li>
</ul>

<p id="mwAw">456</p><p id="mwAw">789</p></body>

For one, the class list changed.

The internal structure from,
https://github.com/wikimedia/mediawiki-extensions-VisualEditor/blob/6291dff109f4a04ed1432e78234b7db34dc7d9eb/modules/ve-mw/dm/nodes/ve.dm.MWGalleryImageNode.js#L90-L96

doesn't match the spec,
https://www.mediawiki.org/wiki/Specs/HTML/2.0.0/Extensions/Gallery

For a while, this was just causing attribute normalization,
https://phabricator.wikimedia.org/T211246#4829298

because the "extsrc" was present so the body would be serialized from that.

In https://github.com/wikimedia/parsoid/commit/e748b8ec831bf6c86487ca8e551274d62be16b1d, that was removed and now the normalizations are getting a little more drastic,
https://en.wikipedia.org/w/index.php?title=Kashubian_language&curid=17254&diff=880004780&oldid=879154010&diffmode=source .

That was reverted in https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/486395

However, there's a hitch. Parsoid doesn't have special handling for galleries in selser so fixing this would cause edits to disappear (for the details, see T214648).

Related Objects

Event Timeline

Change 486395 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] Revert "Get rid of nativeGallery option and enable it by default"

https://gerrit.wikimedia.org/r/486395

Change 486395 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Revert "Get rid of nativeGallery option and enable it by default"

https://gerrit.wikimedia.org/r/486395

Adding the regression tag as the bug happens now on every edit with a gallery tag since December 4th 2019, see duplicate task T239783.

Adding the regression tag as the bug happens now on every edit with a gallery tag since December 4th 2019, see duplicate task T239783.

Unfortunately, that's been the case for a year now so this isn't really a recent regression.

Adding the regression tag as the bug happens now on every edit with a gallery tag since December 4th 2019, see duplicate task T239783.

Unfortunately, that's been the case for a year now so this isn't really a recent regression.

Sorry but that is not correct. This task is ~ 1 year old, but adding "File:" to every line of gallery started (at least) on dewiki on begin of December 2019. Probably with Parsoid VRS: Switch all clients to Parsoid/PHP on Dec 2nd.

Change 558135 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] Disable nativeGallery in production

https://gerrit.wikimedia.org/r/558135

Change 558135 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Disable nativeGallery in production

https://gerrit.wikimedia.org/r/558135

With that deploy, nativeGallery is now re-disabled. That should returns things to how they've been the past year, where only the extension tag attributes are dirty normalized, as in T211246#4829298, and the content normalization described here only happens when the gallery is edited and the extsrc is removed by VE.

However, the catch is that anything rendered and stored before that deploy won't have the extsrc and will therefore continue normalize as in the flagged regression. Which means pretty much everything in storage from Parsoid/PHP.

You can confirm this by doing a search for insource:"/<gallery/" and editing some non-gallery portion of a page with VE. It should dirty the gallery. Confirm that in the wikitext review pane. Then do an ?action=purge on the page to have Parsoid/PHP rerender and store it. Perform the same edit and it should be ok.

ppelberg subscribed.

Thank you writing this up, @Arlolra and making sure it got our attention, @ssastry. The Editing-team will look at this in January 2020.

Change 561235 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] Change gallery structure to match Parsoid

https://gerrit.wikimedia.org/r/561235

@Arlolra It isn't clear to me what changes we need to make. It's a bit of a pain to reproduce the structure exactly. Which of the differences in our output cause the problems, and is the patch above enough?

Change 561235 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Change gallery structure to match Parsoid

https://gerrit.wikimedia.org/r/561235

@Arlolra It isn't clear to me what changes we need to make. It's a bit of a pain to reproduce the structure exactly. Which of the differences in our output cause the problems, and is the patch above enough?

Sorry for being slow here.

The DOMDiff'er doesn't yet traverse into the gallery (that's T214648), so for now we only need to be concerned with the wrapper.

Similar to what's in the description, I tested the following example,

<body data-parsoid="{&quot;dsr&quot;:[0,45,0,0]}" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr"><ul class="gallery mw-gallery-packed" typeof="mw:Extension/gallery" about="#mwt3" data-parsoid="{&quot;dsr&quot;:[0,45,21,10]}" data-mw="{&quot;name&quot;:&quot;gallery&quot;,&quot;attrs&quot;:{&quot;mode&quot;:&quot;packed&quot;},&quot;body&quot;:{&quot;extsrc&quot;:&quot;\nTest.jpg|123\n&quot;}}">
<li class="gallerybox" style="width: 202px;" data-parsoid="{}"><div class="thumb" style="width: 200px;" data-parsoid="{}"><figure-inline typeof="mw:Image" data-parsoid="{&quot;optList&quot;:[{&quot;ck&quot;:&quot;width&quot;,&quot;ak&quot;:&quot;x180px&quot;},{&quot;ck&quot;:&quot;none&quot;,&quot;ak&quot;:&quot;none&quot;},{&quot;ck&quot;:&quot;caption&quot;,&quot;ak&quot;:&quot;123&quot;}]}"><a href="./File:Test.jpg" data-parsoid="{}"><img resource="./File:Test.jpg" src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/Name.jpg/300px-Name.jpg" data-file-width="350" data-file-height="210" data-file-type="bitmap" height="120" width="200" srcset="https://upload.wikimedia.org/wikipedia/commons/5/5b/Name.jpg 2x, https://upload.wikimedia.org/wikipedia/commons/5/5b/Name.jpg 1.5x" data-parsoid="{}"></a></figure-inline></div><div class="gallerytext" data-parsoid="{&quot;dsr&quot;:[31,34,0,0]}">123</div></li>
</ul></body>

with an edit,

<body data-parsoid="{&quot;dsr&quot;:[0,45,0,0]}" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr"><ul class="gallery" typeof="mw:Extension/gallery" data-mw="{&quot;name&quot;:&quot;gallery&quot;,&quot;attrs&quot;:{&quot;mode&quot;:&quot;packed&quot;},&quot;body&quot;:{&quot;extsrc&quot;:&quot;\nTest.jpg|123\n&quot;}}" about="#mwt3" data-parsoid="{&quot;dsr&quot;:[0,45,21,10]}">
<li class="gallerybox" style="width: 202px;" data-parsoid="{}"><div><div class="thumb"><div><a><img resource="File:Test.jpg" src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/Name.jpg/300px-Name.jpg"></a></div></div></div><div class="gallerytext">123</div></li>
</ul><p>123</p></body>

and the DOMDiff'er produces,

----- DOM after running DOMDiff -----
<body lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr" data-parsoid='{"dsr":[0,45,0,0]}' data-parsoid-diff='{"id":15580374,"diff":["children-changed"]}'><ul class="gallery" typeof="mw:Extension/gallery" about="#mwt3" data-parsoid='{"dsr":[0,45,21,10]}' data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper"]}' data-mw='{"name":"gallery","attrs":{"mode":"packed"},"body":{"extsrc":"\nTest.jpg|123\n"}}'>
<li class="gallerybox" style="width: 202px;" data-parsoid="{}"><div><div class="thumb"><div><a><img resource="File:Test.jpg" src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5b/Name.jpg/300px-Name.jpg"/></a></div></div></div><div class="gallerytext">123</div></li>
</ul><meta typeof="mw:DiffMarker/deleted" data-parsoid="{}"/><p data-parsoid-diff='{"id":15580374,"diff":["inserted"]}'>123</p><meta typeof="mw:DiffMarker/inserted" data-parsoid="{}"/>
</body>

The pertinent piece being data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper"]}'

Looking at the ul, I see two changes. The about id position changed, which doesn't seem to be relevant, and the classList dropped "mw-gallery-packed", which is.

I pulled your patch and, while it's probably useful in the long run, didn't have any immediate effect. And, indeed, reading the changes, I wouldn't expect that it would considering it doesn't change the classes.

Change 571833 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] ve.dm.MWGalleryNode: Preserve 'class' attribute unchanged

https://gerrit.wikimedia.org/r/571833

Change 571833 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] ve.dm.MWGalleryNode: Preserve 'class' attribute unchanged

https://gerrit.wikimedia.org/r/571833

marcella awarded a token.