
[RFC] Use <figure> for media
Closed, Resolved (Public)

Description

As part of a long-term project to emit more semantic HTML (T12467), we would like to use <figure> tags around media (T51097). Specifically, we would like to make the output of the PHP parser match the Parsoid DOM specification for images. A patch already exists in Gerrit: 196532.

The benefits are:

  1. Smaller, more semantic markup, replacing the current nested <div>s and class attributes.
    • Even if the differences are minimized by gzip transfer encoding, smaller markup still results in less client-side memory in the browser DOM.
  2. More regular markup which can be more efficiently queried in user gadgets.
    • An example from a Wikimania 2015 talk: document.querySelectorAll('figure, [typeof~="mw:Image"]'); will pull out all media from an article.
    • Efficient matching also allows for easier re-styling / re-arranging of media.
  3. Consistency between PHP and VisualEditor/Parsoid reduces CSS redundancy, visual differences during editing, and on-going maintenance costs.
  4. Accessibility benefits for non-inline media.
    • Captions are properly marked up with a semantic element (<figcaption>), etc.
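
As a rough illustration of benefit 2, a gadget could collect media with the single selector quoted above. This is a minimal sketch, not part of the patch: `collectMedia` and `MEDIA_SELECTOR` are hypothetical names, and the function only assumes that its argument exposes the standard `querySelectorAll` method (for example `document` or an article content node).

```javascript
// Selector quoted in the Wikimania 2015 example: any <figure>, plus any element
// whose whitespace-separated typeof list contains the token "mw:Image".
var MEDIA_SELECTOR = 'figure, [typeof~="mw:Image"]';

// Hypothetical gadget helper: returns a plain array of the media nodes found
// under `root`. `root` is assumed to be a Document or Element.
function collectMedia(root) {
  var nodes = root.querySelectorAll(MEDIA_SELECTOR);
  // A NodeList is array-like; copy it so callers get normal array methods.
  return Array.prototype.slice.call(nodes);
}
```

Note that `~=` matches whole tokens only, so an element with typeof="mw:Image/Thumb" is picked up by the `figure` half of the selector rather than by the attribute half.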

However, any change to our emitted HTML has some costs:

  1. User gadgets and other downstream tools may need to be updated to handle the new media output.
  2. IE6 through 8 will require a single line of JavaScript emitted in a <script> tag to ensure that the <figure> element is parsed correctly. (Modern HTML5 browsers have no issues, since <figure> is a valid HTML5 tag.)
  3. Stylesheets or skins may need to be updated (although WMF styles have support for the new markup already to support VisualEditor).
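
The single line of JavaScript mentioned in cost 2 is the familiar trick of calling document.createElement once for each unknown tag name, which lets old IE build proper elements for that tag. A hedged sketch: `registerElements` is a hypothetical helper name, the list of tag names in the usage comment is an assumption, and the call is assumed to run from a <script> placed before any media markup is parsed.

```javascript
// Hypothetical helper: "registers" each tag name with the document by creating
// (and discarding) one element of that name. `doc` is assumed to be the page's
// Document object; only its standard createElement method is used.
function registerElements(doc, tagNames) {
  for (var i = 0; i < tagNames.length; i++) {
    doc.createElement(tagNames[i]);
  }
  // Return the number of names processed, mostly for debugging.
  return tagNames.length;
}

// In a page this might be called as (tag list is an assumption):
//   registerElements(document, ['figure', 'figcaption']);
```

Writing it as a loop over names keeps the shim a one-liner at the call site while leaving room for further elements later.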

This RFC is a means to publicize the proposed change and give downstream users an opportunity to update their tools before it is deployed. Further, we'd like to collect blocking bugs here to ensure that any critical user gadgets are updated *before* the change goes live.

SUMMARY OF CHANGES
This section gives a basic idea of what the MediaWiki HTML looks like before and after the change; see the Parsoid DOM specification for full details on the new markup.

Example 1

[[Image:Foo.jpg|left|<p>caption</p>]]

Current output of PHP parser (linebreaks added for readability):

<div class="floatleft">
 <a href="/wiki/File:Foo.jpg" class="image" title="caption">
  <img alt="caption" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foo.jpg" width="1941" height="220" />
 </a>
</div>

Proposed new output:

<figure typeof="mw:Image" class="mw-default-size">
 <a href="/wiki/File:Foo.jpg">
  <img resource="./File:Foo.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foo.jpg" width="1941" height="220">
 </a>
 <figcaption><p>caption</p></figcaption>
</figure>

Example 2

[[Image:Foobar.jpg|thumb|left|baseline|caption content]]

Current output of PHP parser (linebreaks added for readability):

<div class="thumb tleft">
 <div class="thumbinner" style="width:222px;">
  <a href="/wiki/File:Foobar.jpg" class="image">
   <img alt="" src="//upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Foobar.jpg/220px-Foobar.jpg"
      width="220" height="26" class="thumbimage"
      srcset="//upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg 2x"
      data-file-width="240" data-file-height="28">
  </a>
  <div class="thumbcaption">
   <div class="magnify">
    <a href="/wiki/File:Foobar.jpg" class="internal" title="Enlarge"></a>
   </div>
   caption content
  </div>
 </div>
</div>

Proposed new output:

<figure typeof="mw:Image/Thumb" class="mw-halign-left mw-valign-baseline mw-default-size">
 <a href="/wiki/File:Foobar.jpg">
  <img src="//upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Foobar.jpg/220px-Foobar.jpg"
      data-file-width="240" data-file-height="28" data-file-type="bitmap"
      height="26" width="220"
      resource="./Image:Foobar.jpg" />
 </a>
 <figcaption>caption content</figcaption>
</figure>
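
For downstream tools that have to cope with both outputs during a transition, one option is to look for either caption container. A minimal sketch under stated assumptions: `getCaptionText` and `CAPTION_SELECTOR` are hypothetical names, the `thumbcaption` class and the <figcaption> element are taken from the examples above, and `container` is assumed to be the outer <div class="thumb"> or <figure> element.

```javascript
// Matches the caption container in the proposed markup (<figcaption>) or in the
// current thumb markup (<div class="thumbcaption">).
var CAPTION_SELECTOR = 'figcaption, .thumbcaption';

// Hypothetical helper: returns the trimmed caption text, or '' when the media
// has no caption container (e.g. the current non-thumb output in Example 1).
function getCaptionText(container) {
  var caption = container.querySelector(CAPTION_SELECTOR);
  if (!caption) {
    return '';
  }
  // textContent is missing in IE8 and older; fall back to innerText there.
  var text = caption.textContent !== undefined ? caption.textContent : caption.innerText;
  // Trim with a regular expression, since String.prototype.trim is ES5.
  return String(text || '').replace(/^\s+|\s+$/g, '');
}
```

A gadget written this way should not need to know which parser output it is looking at, at the cost of one extra selector alternative.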

IRC meeting: E93#1118

Event Timeline


I'm missing details and discussion on the various forms of thumbnailing and file formats that we currently support, and on how this would or should impact these forms (or not). Insofar as they are discussed, the patch does not seem to match the proposal (notably, .thumb currently remains).

Testing this against:
https://en.wikipedia.org/wiki/Wikipedia:Extended_image_syntax
https://en.wikipedia.org/wiki/Wikipedia:Picture_tutorial
https://en.wikipedia.org/wiki/Help:Gallery_tag
on a wmflabs instance would be interesting.

  • what's the reason the current thumb output has a separate thumb and thumbinner div? Seems like that might be used for some CSS trickery that will be impossible with the new output.
  • I assume the removal of srcset in example 2 is unintentional?
  • what's the resource attribute? Neither HTML 4.01 nor HTML 5 <img> has that.
  • when MediaViewer was deployed we discussed removing or reutilizing the magnifier icon, but it turned out to receive a surprisingly large number of clicks, possibly from people who did not realize the image itself is clickable. (On that note, updating MediaViewer would be one of the downstream blockers.)
  • does the plan include galleries? Also, what about bare images ([[File:Foo.png]])?

If a UA lacks <figure> or sufficient JavaScript support, what impact does this have? I often wonder if the right thing to do for such UAs is to have a click through option to get at images anyway (you can always use JS to suppress such markup, just don't anger the search engines), but It's Complicated.

> If a UA lacks <figure> or sufficient JavaScript support, what impact does this have?

The W3C recommendation is that if a user agent encounters an element it does not recognize, it should try to render the element's content, and user agents generally do that. The layout might break (because CSS rules do not get applied to the figure node, or there is no such node at all), but the image itself should still be visible.

> The W3C recommendation is that if a user agent encounters an element it does not recognize, it should try to render the element's content, and user agents generally do that. The layout might break (because CSS rules do not get applied to the figure node, or there is no such node at all), but the image itself should still be visible.

That is also what happens in practice in old IE versions, if JS is disabled. With JS enabled, even IE 5.0 supports <figure>.

> what's the reason the current thumb output has a separate thumb and thumbinner div? Seems like that might be used for some CSS trickery that will be impossible with the new output.

I added that structure in order to be able to format the thumb with a 'frame' like appearance using the limited CSS supported at the time. That was when IE 5.0 was still a thing. These days, I think all relevant browsers support the CSS needed to format figures the same way, without the thumbinner thing. The CSS used to format Parsoid output already does this.

> I'm missing details and discussion on the various forms of thumbnailing and file formats that we currently support, and on how this would or should impact these forms (or not). Insofar as they are discussed, the patch does not seem to match the proposal (notably, .thumb currently remains).
>
> Testing this against:
> https://en.wikipedia.org/wiki/Wikipedia:Extended_image_syntax

https://rest.wikimedia.org/en.wikipedia.org/v1/page/html/Wikipedia:Extended_image_syntax

> https://en.wikipedia.org/wiki/Wikipedia:Picture_tutorial

https://rest.wikimedia.org/en.wikipedia.org/v1/page/html/Wikipedia:Picture_tutorial

> https://en.wikipedia.org/wiki/Help:Gallery_tag

https://rest.wikimedia.org/en.wikipedia.org/v1/page/html/Help:Gallery_tag

These are all Parsoid renderings, but should give you a sense of how Parsoid's use of <figure>, <figcaption> and semantic markup renders and its suitability for use in the PHP parser as well.

Okay. I did a quick spot check on this to confirm at http://dr0ptp4kt.github.io/figure.html. Here's what I observed:

  • Nokia Asha 501, Opera Mini 8: image rendered. Not a perfect layout, but okay.
  • Apple 3GS, Safari: image didn't render, instead the image outline with a question mark box in the middle did. It's hyperlinked so a tap on it tries to load the image.

As I understand it, the Xpress browser on Nokia devices is generally being replaced with Opera Mini. That is, the Xpress browser actually prompts the user to replace it with Opera Mini; I observed this today.

As noted, a non-RL dependent JavaScript polyfill should handle other devices. So while it would need to be addressed, I don't see much of a point validating it at this point. In theory the JS polyfill should easily handle the Apple 3GS case. It may even be possible to get it working in Opera Mini with guidance starting from https://dev.opera.com/articles/opera-mini-and-javascript/.

As for apps (including Wikipedia for Android or Wikipedia Mobile for iOS) that might have assumptions about the DOM layout, I'm looping in @Dbrant and @JMinor on this ticket.

It is true that we don't have the "magnify" button any more (as @brion points out above). I'm curious about whether this is still desired/used?

cc @bearND , since the content service definitely has assumptions about DOM structure.

> It is true that we don't have the "magnify" button any more (as @brion points out above). I'm curious about whether this is still desired/used?

IIRC design at the time actually thought it was good to remove it. Either way, the <figure> structure doesn't stop us from porting the magnify icon to JS.

> cc @bearND , since the content service definitely has assumptions about DOM structure.

Right. I think a transform can be applied server side here. The legacy clients that will still be on action=mobileview are a trickier piece of business.

Per http://multimedia-metrics.wmflabs.org/dashboards/mmv about 1% of the thumbnail clicks go to the magnifier icon; that's not as high as I remembered.

(assigning to self for cleanup)

ssastry triaged this task as Medium priority. Dec 17 2015, 5:49 PM
ssastry moved this task from Backlog to In Progress on the MediaWiki-Parser board.

Discussed this at the 2016 Parsing team offsite.

The relationship to T118520 (Use <figure-inline> instead of <span> for inline figures) was discussed. The rationale for T118520 seems to be that document.createElement('figure'); is required for IE6-8 compatibility of <figure> (this task), so if you've got to add a custom element anyway, you might as well add the document.createElement('figure-inline') line at the same time so that you can use a more semantic element for inline figures.

Note that you can't use <figure> for inline figures because <figure> will break a <p> context:

> div = document.createElement('div');
> div.innerHTML = "<p><figure><figcaption>"
> div.innerHTML
"<p></p><figure><figcaption></figcaption></figure>"

On the other hand, from the perspective of parser parity, Parsoid currently uses <span> tags for inline figures. Changing those to <figure-inline> is a nice-to-have, perhaps, but it would require changing both Parsoid and PHP output and so is perhaps more trouble than it's worth. It's certainly not required for parser parity.

@brion Hope you don't mind me stealing this :P

Is this still active? @Arlolra should I poke you about this or someone else on parsing? If we're still going ahead we should move forward with fixes, otherwise if stalled remove the TechCom-RFC tag.

Yes, still active. The figure-inline changes to Parsoid's output got lumped in with a bunch of other things that took some time to deploy. I will be picking this up again sometime soon. Thanks.

@Arlolra the gerrit changes you mentioned above seem stalled, do you need the TechCom team's attention on this? If so, how can we help?

ssastry changed the task status from Open to Stalled. Mar 29 2019, 11:05 PM

> @Arlolra the gerrit changes you mentioned above seem stalled, do you need the TechCom team's attention on this? If so, how can we help?

Arlo fixed a bunch of issues based on testing, and then we deprioritized this work since we started porting Parsoid. We'll pick this up once the porting is complete.

Adding a few details to @ssastry's update: Parsoid was changed to use <figure> and <figure-inline> in c9f404761cd288e7b58b89623ac459bbb2901a7d (T118520). The remaining work to be done is to transition core to use this same markup. The original plan was to do this in two steps: first convert block markup to use <figure>, and then as a follow-up convert inline markup to use <figure-inline>. Arlo has core patches written (linked above), but actually deploying them will take a careful process of communicating w/ local communities, linting, etc, which we do not plan to tackle until after the Parsoid port to PHP is complete.

Change 505645 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/core@master] Allow <figure-inline> attributes through Sanitizer

https://gerrit.wikimedia.org/r/505645

Change 505645 merged by jenkins-bot:
[mediawiki/core@master] Allow <figure-inline> attributes through Sanitizer

https://gerrit.wikimedia.org/r/505645

Closing as resolved. The direction of using <figure> for block images was approved in 2015 in an ArchCom IRC meeting, which predates our current "Last Call" process.

The implementation of this is tracked further under T51097. It has since shipped in Parsoid, and the Parsing Team is also working on bringing it to core.

The IRC meeting notes from 2015 do specify an unresolved subproblem about inline images and the idea of <figure-inline>, which does not yet have consensus. We decided at the Parsing Team offsite (which I attended) that we'll solve that with a separate RFC instead, given that TechCom has preferred smaller RFCs since the 2017 update to the TechCom RFC process.