Edited gallery captions are ignored unless gallery's data-mw is dropped
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	Esanders
	Feb 14 2023, 6:09 PM

Description

The wikitext

<gallery>File:Example.jpg|thumb|''Hello''</gallery>

converts to

<ul ... data-mw='{"name":"gallery","attrs":{},"body":{"extsrc":"File:Example.jpg|thumb|&apos;&apos;Hello&apos;&apos;"}}'>
    <li>
        <div ...><span ...><a ...><img ... /></a></span></div>
        <div class="gallerytext"><i>Hello</i></div>
    </li>
</ul>

Note the caption data exists as both wikitext in data-mw.body.extsrc and html in the gallerytext div.

Parsoid prefers to use the data-mw wikitext when converting back to wikitext. This means we would need to detect a caption being edited for the first, and generate an attribute transaction on the parent gallery, which is very messy.

Details

Subject	Repo	Branch	Lines +/-
Enabled native gallery editing in Parsoid by default	mediawiki/core	master	+1 -1
Enabled native gallery editing in Parsoid	operations/mediawiki-config	master	+1 -1
Bump parsoid to 0.18.0-a2	mediawiki/vendor	master	+50 K -222 K
Reuse source in galleries	mediawiki/services/parsoid	master	+170 -13
Bump parsoid to 0.17.0-a19	mediawiki/vendor	master	+444 -500
Ignore gallery extsrc if native gallery is enabled	mediawiki/services/parsoid	master	+10 -360
[WIP] Ensure all DOM attributes are copied for MWGalleryImageNode	mediawiki/extensions/VisualEditor	master	+6 -4

Customize query in gerrit

Related Objects
Search...

Status	Subtype	Assigned	Task
Resolved		EAkinloose	T190224 Allow inline editing of gallery captions and gallery image captions
Resolved	BUG REPORT	None	T268991 Find and replace does not preserve replacements inside gallery
Resolved		Arlolra	T329662 Edited gallery captions are ignored unless gallery's data-mw is dropped

Event Timeline

Esanders created this task.Feb 14 2023, 6:09 PM

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptFeb 14 2023, 6:09 PM

Esanders added a parent task: T190224: Allow inline editing of gallery captions and gallery image captions.Feb 14 2023, 6:10 PM

Native gallery editing isn't enabled in production,
https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings.php#L4266

The FIXME here says we were waiting on VE to enable HTML editing of galleries,
https://github.com/wikimedia/mediawiki-services-parsoid/blob/master/src/Ext/Gallery/Gallery.php#L314

In T214649, the output VE was returning differed enough that selser never applied to any gallery.

In T329662#8615651, @Arlolra wrote:

The FIXME here says we were waiting on VE to enable HTML editing of galleries,
https://github.com/wikimedia/mediawiki-services-parsoid/blob/master/src/Ext/Gallery/Gallery.php#L314

Right now, the heuristic exists that if the extsrc is removed, Parsoid will serialize from the edited html. We could drop the caption from the data-mw if the extsrc is absent. That's pragmatic and sort of implied by,
https://github.com/wikimedia/mediawiki-services-parsoid/blob/master/src/Ext/Gallery/Gallery.php#L316-L321

But what we really want to do is set nativeGalleryEnabled so that we stop emitting extsrc altogether but that requires addressing the selser issues.

Up to you what you'd prefer us to do.

matmarex subscribed.Feb 14 2023, 9:10 PM

We could drop the caption from the data-mw if the extsrc is absent. That's pragmatic and sort of implied by,

This refers specifically to the caption on the extension tag, <gallery caption="123" ..., and not the captions of the individual files.

Change 889257 had a related patch set uploaded (by Arlolra; author: Arlolra):

[operations/mediawiki-config@master] [WIP] Enabled native gallery editing in Parsoid

https://gerrit.wikimedia.org/r/889257

gerritbot added a project: Patch-For-Review.Feb 14 2023, 11:41 PM

Arlolra triaged this task as Medium priority.Feb 14 2023, 11:42 PM

Arlolra added a project: Content-Transform-Team-WIP.

Arlolra moved this task from Backlog to In Progress on the Content-Transform-Team-WIP board.

In T329662#8615651, @Arlolra wrote:

In T214649, the output VE was returning differed enough that selser never applied to any gallery.

It looks like @matmarex resolved the differences in the wrapper in T214649#5887430. I'll do some more testing but we can probably just enable this production.

Please do test with,

$wgParsoidSettings = [
	'nativeGalleryEnabled' => true,
];

to confirm that it meets your editing needs.

I'll do some more testing but we can probably just enable this production.

Meh, a quick test reveals there are still a bunch of modified-wrapper, but in the nested structure,

----- DOM after running DOMDiff -----
<body id="mwAA" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr" data-parsoid-diff='{"id":15580374,"diff":["children-changed"]}' data-parsoid='{"dsr":[0,81,0,0]}'><h2 id="123" data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid='{"dsr":[0,7,2,2]}' data-mw="{}"><meta typeof="mw:DiffMarker/deleted"/>1234</h2>

<ul typeof="mw:Extension/gallery" class="gallery mw-gallery-traditional" about="#mwt2" id="mwAw" data-parsoid-diff='{"id":15580374,"diff":["subtree-changed"]}' data-parsoid='{"dsr":[9,81,23,10]}' data-mw='{"name":"gallery","attrs":{},"body":{}}'>
<li class="gallerycaption" id="mwBA" data-parsoid="{}" data-mw="{}">123</li>
<li class="gallerybox" style="width: 155px;" id="mwBQ" data-parsoid-diff='{"id":15580374,"diff":["children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><div class="thumb" data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><span typeof="mw:File" data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><a data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><img resource="./File:Unix history-simple.png" src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/50/Unix_history-simple.png/120px-Unix_history-simple.png" width="120" height="80" data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper"]}' data-parsoid="{}" data-mw="{}"/></a></span></div><div class="gallerytext" data-parsoid-diff='{"id":15580374,"diff":["modified-wrapper"]}' data-parsoid="{}" data-mw="{}">testing</div></li>
</ul>

</body>
-------------------------------------

Looking at VE source,
https://github.com/wikimedia/mediawiki-extensions-VisualEditor/blob/master/modules/ve-mw/dm/nodes/ve.dm.MWGalleryImageNode.js#L135-L140

	// TODO: Support editing the link
	// FIXME: Dropping the href causes Parsoid to mark the node as wrapper modified,
	// making the whole gallery subtree edited, preventing selser.  When fixing,
	// preserving the imgWrapperClassAttr, as in the MW*ImageNodes, will also be
	// necessary.
	// a.setAttribute( 'href', attributes.src );

We can ensure all the DOM attributes are copied by setting ve.dm.MWGalleryImageNode.static.handlesOwnChildren to true, making that above FIXME redundant.

Change 889586 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/extensions/VisualEditor@master] [WIP] Ensure all DOM attributes are copied for MWGalleryImageNode

https://gerrit.wikimedia.org/r/889586

Change 889586 abandoned by Arlolra:

[mediawiki/extensions/VisualEditor@master] [WIP] Ensure all DOM attributes are copied for MWGalleryImageNode

Reason:

In favour of I2fdfc04aee5e8b3524214e25afd8443b5f6b240d

https://gerrit.wikimedia.org/r/889586

In T329662#8615732, @Arlolra wrote:

But what we really want to do is set nativeGalleryEnabled so that we stop emitting extsrc altogether but that requires addressing the selser issues.

I should confirm that flag also ignores extsrc that may be set in cached content from before it was enabled

• JMcLeod_WMF moved this task from Needs Triage to Tracking on the Parsoid board.Feb 16 2023, 3:18 PM

• JMcLeod_WMF edited projects, added Parsoid (Tracking); removed Parsoid.

So the state of play is that we need Ed's patches deployed, and then we can make the config change to turn on native galleries. My understanding is that we've shipped nativeGalleriesEnabled=>true as the default configuration for "quite a while" even though it's not enabled in production everywhere quite yet.

Arlolra edited projects, added Parsoid; removed Parsoid (Tracking).Feb 21 2023, 7:37 PM

• JMcLeod_WMF moved this task from Needs Triage to Bugs & Crashers on the Parsoid board.Feb 23 2023, 3:13 PM

Change 891889 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Ignore gallery extsrc if native gallery is enabled

https://gerrit.wikimedia.org/r/891889

In T329662#8620669, @Arlolra wrote:

I should confirm that flag also ignores extsrc that may be set in cached content from before it was enabled

That's in T329662#8644807

• JMcLeod_WMF moved this task from In Progress to Code Review on the Content-Transform-Team-WIP board.Feb 27 2023, 4:23 PM

Change 891889 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Ignore gallery extsrc if native gallery is enabled

https://gerrit.wikimedia.org/r/891889

dr0ptp4kt assigned this task to Arlolra.Mar 6 2023, 4:22 PM

dr0ptp4kt moved this task from Code Review to To Deploy on the Content-Transform-Team-WIP board.

Change 894720 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/vendor@master] Bump parsoid to 0.17.0-a19

https://gerrit.wikimedia.org/r/894720

Change 894720 merged by jenkins-bot:

[mediawiki/vendor@master] Bump parsoid to 0.17.0-a19

https://gerrit.wikimedia.org/r/894720

Change 896435 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Gallery selser

https://gerrit.wikimedia.org/r/896435

Change 896435 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Reuse source in galleries

https://gerrit.wikimedia.org/r/896435

Esanders mentioned this in T190224: Allow inline editing of gallery captions and gallery image captions.Mar 20 2023, 3:04 PM

dr0ptp4kt moved this task from To Deploy to To Verify on the Content-Transform-Team-WIP board.Mar 20 2023, 3:13 PM

Change 901245 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/vendor@master] Bump parsoid to 0.18.0-a2

https://gerrit.wikimedia.org/r/901245

Change 901245 merged by jenkins-bot:

[mediawiki/vendor@master] Bump parsoid to 0.18.0-a2

https://gerrit.wikimedia.org/r/901245

• JMcLeod_WMF moved this task from To Verify to In Progress on the Content-Transform-Team-WIP board.Mar 27 2023, 3:03 PM

Following up on T329662#8616990, an edit elsewhere in the page now produces (after I2fdfc04aee5e8b3524214e25afd8443b5f6b240d) pretty good results with nativeGalleryEnabled,

----- DOM after running DOMDiff -----
<body id="mwAA" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr" data-parsoid-diff='{"diff":["children-changed"]}' data-parsoid="{}"><h2 id="1234" data-parsoid-diff='{"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><meta typeof="mw:DiffMarker/deleted"/>12345</h2>

<ul typeof="mw:Extension/gallery" class="gallery mw-gallery-traditional" about="#mwt2" id="mwAw" data-parsoid="{}" data-mw='{"name":"gallery","attrs":{},"body":{}}'>
<li class="gallerybox" style="width: 155px;" id="mwBA" data-parsoid="{}" data-mw="{}"><div class="thumb" style="width: 150px; height: 150px;" id="mwBQ" data-parsoid="{}" data-mw="{}"><span typeof="mw:File" id="mwBg" data-parsoid="{}" data-mw="{}"><a href="./File:Test.png" class="mw-file-description" title="123" id="mwBw" data-parsoid="{}" data-mw="{}"><img resource="./File:Test.png" src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/PNG_Test.png/95px-PNG_Test.png" width="95" height="120" alt="123" decoding="async" data-file-width="3120" data-file-height="3920" data-file-type="bitmap" srcset="https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/PNG_Test.png/143px-PNG_Test.png 1.5x, https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/PNG_Test.png/191px-PNG_Test.png 2x" id="mwCA" data-parsoid="{}" data-mw="{}"/></a></span></div><div class="gallerytext" id="mwCQ" data-parsoid="{}" data-mw="{}">123</div></li>
</ul>

</body>
-------------------------------------

I think this means we can move forward with T329662#8633332. I'll do that tomorrow.

Unfortunately, the benefits of T329662#8684251 and I55942bbb36c3bd3dd43a665fbca0f275e9870f7c are not felt because of an assortment of issues,

----- DOM after running DOMDiff -----
<body id="mwAA" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body-content parsoid-body mediawiki mw-parser-output" dir="ltr" data-parsoid-diff='{"diff":["children-changed"]}' data-parsoid="{}"><h2 id="1234" data-parsoid-diff='{"diff":["modified-wrapper"]}' data-parsoid="{}" data-mw="{}">1234</h2>

<ul typeof="mw:Extension/gallery" class="gallery mw-gallery-traditional" about="#mwt2" id="mwAw" data-parsoid-diff='{"diff":["children-changed","subtree-changed"]}' data-parsoid="{}" data-mw='{"name":"gallery","attrs":{},"body":{}}'><meta typeof="mw:DiffMarker/deleted"/><li class="gallerybox" data-parsoid-diff='{"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><div class="thumb" data-parsoid-diff='{"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><span typeof="mw:File" data-parsoid-diff='{"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><a href="./File:Test.png" class="mw-file-description" data-parsoid-diff='{"diff":["modified-wrapper","children-changed","subtree-changed"]}' data-parsoid="{}" data-mw="{}"><img resource="./File:Test.png" src="https://upload.wikimedia.org/wikipedia/commons/thumb/6/6a/PNG_Test.png/95px-PNG_Test.png" width="95" height="120" alt="123" data-parsoid-diff='{"diff":["modified-wrapper"]}' data-parsoid="{}" data-mw="{}"/></a></span></div><div class="gallerytext" data-parsoid-diff='{"diff":["modified-wrapper"]}' data-parsoid="{}" data-mw="{}">123</div></li><meta typeof="mw:DiffMarker/deleted"/><li class="gallerybox" data-parsoid-diff='{"diff":["inserted"]}' data-parsoid="{}" data-mw="{}"><div class="thumb" data-parsoid="{}"><span typeof="mw:File" data-parsoid="{}"><a href="./File:Odd Rode Church - geograph.org.uk - 2261064.jpg" class="mw-file-description" data-parsoid="{}"><img resource="./File:Odd Rode Church - geograph.org.uk - 2261064.jpg" src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/89/Odd_Rode_Church_-_geograph.org.uk_-_2261064.jpg/200px-Odd_Rode_Church_-_geograph.org.uk_-_2261064.jpg" width="120" height="100" alt="" data-parsoid="{}"/></a></span></div><div class="gallerytext" data-parsoid="{}"></div></li></ul>

</body>
-------------------------------------

None of the internal id attributes are preserved, so data-parsoid is lost. The style attributes on the li and div are dropped. The newlines before each li are removed, so we get <meta typeof="mw:DiffMarker/deleted"/>. The img doesn't have decoding / data-* attributes. Etc.

Change 889257 merged by jenkins-bot:

[operations/mediawiki-config@master] Enabled native gallery editing in Parsoid

https://gerrit.wikimedia.org/r/889257

Mentioned in SAL (#wikimedia-operations) [2023-03-29T13:19:57Z] <lucaswerkmeister-wmde@deploy2002> Started scap: Backport for [[gerrit:889257|Enabled native gallery editing in Parsoid (T329662)]]

Mentioned in SAL (#wikimedia-operations) [2023-03-29T13:21:19Z] <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde and arlolra: Backport for [[gerrit:889257|Enabled native gallery editing in Parsoid (T329662)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-03-29T13:30:17Z] <lucaswerkmeister-wmde@deploy2002> Finished scap: Backport for [[gerrit:889257|Enabled native gallery editing in Parsoid (T329662)]] (duration: 10m 19s)

Maintenance_bot removed a project: Patch-For-Review.Mar 29 2023, 1:30 PM

I think this is done now. Thank you!

@matmarex Should I open a separate task for T329662#8736728? This task is done but T207620 / T211895 / T151367 would be improved.

@Arlolra I'm not really sure what it means. If it's not covered by these existing tasks, then please do file one.

Change 904199 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/core@master] Enabled native gallery editing in Parsoid by default

https://gerrit.wikimedia.org/r/904199

gerritbot added a project: Patch-For-Review.Mar 29 2023, 3:21 PM

In T329662#8738677, @matmarex wrote:

@Arlolra I'm not really sure what it means. If it's not covered by these existing tasks, then please do file one.

Sorry, what it means is that the changes done in I2fdfc04aee5e8b3524214e25afd8443b5f6b240d make it so that selser works for unedited galleries on a page, which is great.

In Id0e3c5a79f74d72257c6e70d97e55ed94410f156 though, selser can now be applied to unedited nodes in an edited gallery. So, for example, adding a new image to a gallery won't normalize the rest of the gallery lines, as in T211895. But this doesn't work because the returned edited gallery is heavy normalized by VE, as noted in the latter half of T329662#8736728.

Similarly, even if selser isn't applied in the edited gallery, not having the ids on old nodes means that data-parsoid is dropped for them and so any serialization information (like file prefixes, I55942bbb36c3bd3dd43a665fbca0f275e9870f7c) won't be able to be used.