Page MenuHomePhabricator

Null is not a valid image name
Closed, ResolvedPublicBUG REPORT

Description

What I did:

  • I needed to duplicate a previous section on the page, but I didn't want to deal with all of the <!--T:41--> codes from Extension:Translate, so I copied the "reader" version of the page instead of the wikitext (section heading, image, bullet list, a couple of short subsections).
  • Then I pasted it into the 2017WTE, and let it convert to wikitext.

What happens?:

The image, whose original wikitext was this:

[[File:Poster pour l'Appel avec la Commaunaute francophone de Wikipedia.png|alt=Poster publicizing the Community Call the Editing Team will be hosting on 3 May 2023 at 15:00 UTC.|thumb|331x331px|Community Call Details. ''Please feel free to share this!'']]

was turned into this:

[[null|link=File:Poster_pour_l'Appel_avec_la_Commaunaute_francophone_de_Wikipedia.png|alt=Poster publicizing the Community Call the Editing Team will be hosting on 3 May 2023 at 15:00 UTC.|thumb|331x331px|Community Call Details. ''Please feel free to share this!'']]

Event Timeline

mediawiki.org is seemingly a test wiki for Parsoid read views, which means that the image is generated in a form understood by VE, but it is missing the resource attribute which tells you the name of the image.

On a "normal" wiki, the HTML of an image would not be understood by VE at all and would just be dropped.

mediawiki.org is seemingly a test wiki for Parsoid read views

Not that I'm aware of. This looks like legacy output with !$wgParserEnableLegacyMediaDOM

I don't think Parsoid should be adding typeof=mw:File/Thumb unless the DOM is spec compliant.

Change 923364 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/VisualEditor@master] Drop incomplete images generated by legacy parser

https://gerrit.wikimedia.org/r/923364

I don't think Parsoid should be adding typeof=mw:File/Thumb unless the DOM is spec compliant.

Parsoid isn't outputting that though, the legacy parser is. But that's not your point.

This is the result trying to unify the parser outputs so that when we get to read views, it's less disruptive,
https://www.mediawiki.org/wiki/Parsoid/Parser_Unification/Media_structure/FAQ

And currently the styling depends on those typeofs,
https://github.com/wikimedia/mediawiki/blob/master/resources/src/mediawiki.skinning/content.media-common.less

I guess there's a few ways forward,

  • Maybe it's ok to include the typeof annotation if the component is spec compliant (as opposed to the entire dom), in which case we'll add the resource attribute and anything else that's missing for editing (there's already some precedent for adding the resource attribute in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/919254)
  • Add some redundant classes and style based on those (there's already need for this in T318433)
  • If VE was already dropping media when serializing the legacy out, continue to drop it (as in the patch above). Maybe expectations about supported behaviour needs to be mapped out?

I suppose the question being asked here is what VE can expect / assume about the HTML/DOM when it encounters certain typeof attributes that originally were only present in Parsoid-generated HTML, and Arlo has addressed that above..

But, I wanted to flag something that is a bit tangential (which Arlo alludes to in the 3rd bullet point).

Reg this:

I pasted it into the 2017WTE, and let it convert to wikitext.

Parsoid doesn't currently support html2wt of arbitrary HTML -- it is mostly best effort and probably "mostly works" for legacy HTML. It may work in most cases, but may not in all cases. It is probably not worth our effort to try to make (legacy HTML -> wt) work in all cases at this time. So, we should probably set some realistic expectations of what might happen with copy-paste of legacy HTML.

Parsoid doesn't currently support html2wt of arbitrary HTML

We aren't sending arbitrary HTML to Parsoid. The paste handler imports it to VE and converts it to VE data model (with a null resource), then we serialise the VE data model to HTML via the normal path before sending it to Parsoid. The only issue here is that the resource is null.

On the wider issue, in this case the neatest fix is just to add the missing resource attribute, but if we expect more complex cases in the future of incomplete HTML with typeof attributes, then we may need a more generic way of handling that.

Change 923434 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/core@master] [WIP] Add "resource" attribute on mw-file-element

https://gerrit.wikimedia.org/r/923434

Change 923364 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Drop incomplete images generated by legacy parser

https://gerrit.wikimedia.org/r/923364

@matmarex, I don't see the image when I pasted it. Is that the expected behaviour?

What I was getting before the fix:

Screenshot 2023-06-05 at 18.12.27.png (492×2 px, 150 KB)
Screenshot 2023-06-01 at 18.09.45.png (220×944 px, 46 KB)

Yes, that's expected. Actually supporting these pastes would be a more difficult task (but it might be magically supported when everything is switched over to Parsoid).

Yes, that's expected. Actually supporting these pastes would be a more difficult task (but it might be magically supported when everything is switched over to Parsoid).

All good then.