
Use semantic HTML5 <main> tag for primary page content
Closed, Resolved, Public

Description

Since T37247 there is a new <div> with the class "mw-parser-output". This is the innermost container for the page content.

In HTML5, there is a new <main> tag. As can be seen at https://developer.mozilla.org/en-US/docs/Web/HTML/Element/main, the <main> tag is for the main content, excluding any sidebars, navigation links, copyright information, site logos or other UI elements.

Once T136203 T248061 is resolved and IE8 without JavaScript is no longer a concern, this <div> can become a <main>.

This <div> is defined in https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/parser/Parser.php.

Event Timeline

I agree with use of <main>.

There's a skin specific sticking point, though. In Vector and Monobook, there's already a <div id="content" class="mw-body" role="main"> element further up the tree. This div includes .mw-parser-output, but also the #siteNotice, h1#firstHeading, #siteSub ("From Wikipedia, the free encyclopedia"), and #jump-to-nav (the visually hidden "Jump to: navigation, search").

Modern skin is similar, but with a <div id="mw_content" role="main">. In CologneBlue, it's <div id="article" class="mw-body" role="main">.

A page should only have one main region. So if a <main> is added, at the same time, role="main" should be removed from whatever elements in whatever skins have it.
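
To make the "only one main region" point concrete, here is a minimal sketch that counts main landmarks in a page. It assumes a landmark is either a <main> element or any element with role="main"; the helper names and the two sample strings are hypothetical and only loosely modelled on the Vector/Monobook markup quoted above.

```python
from html.parser import HTMLParser


class MainLandmarkCounter(HTMLParser):
    """Counts elements that expose a main landmark: <main> or role="main"."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        role = dict(attrs).get("role")
        # <main role="main"> is a single element, so it is counted once.
        if tag == "main" or role == "main":
            self.count += 1


def count_main_landmarks(html: str) -> int:
    counter = MainLandmarkCounter()
    counter.feed(html)
    counter.close()
    return counter.count


# Hypothetical skin output: role="main" left on the wrapper while <main> is added inside.
before = '<div id="content" class="mw-body" role="main"><main class="mw-parser-output"></main></div>'
# Same markup with role="main" removed from the wrapper.
after = '<div id="content" class="mw-body"><main class="mw-parser-output"></main></div>'
```

With these samples, the first string contains two main landmarks and the second contains one, which is the state the comment above asks for.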

It's also debatable which top-of-page material belongs inside the <main> and which parts don't. If you think an element other than the current div[role="main"] should be the new <main>, go for it.

Special note on Minerva: Minerva has no element with role="main", and therefore has no main region. It also uses the HTML5 specific elements <nav> and <footer> already. So, out of all the skins, I'd say Minerva most needs a <main>, and least poses a problem with adding a <main>.

Note that Parsoid also generates the mw-parser-output div (I believe) and so should also be changed to use <main> if/when this change occurs in the PHP parser.

There can also be multiple mw-parser-output blocks per page, so it seems like a bad candidate for becoming <main>.

Are you referring to Flow? I created T175937 in which I suggested that the divs in Flow with the mw-parser-output class become article elements. Is there anything preventing mw-parser-output from being an article if the content model is flow-board, and otherwise being a main?

@Nirmos Flow is one of them, yes, and I'm positive there are more, especially in the future with multi content revision. I think it's very clear: mw-parser-output means nothing more than "this came out of the wikicode parser". We have this specifically so that we can scope the styling of user-generated content to such blocks, to create separation of content and presentation. It is therefore incorrect to attach a presentational/layout concept to it. It's fine to have them as mw-parser-output blocks inside of article elements, or to have mw-parser-output as a class on an <article> (if it truly only contains content, which I suspect would be rare). But the latter probably needlessly complicates the page construction and caching.

Ok, so what if:

  • The tag with the class mw-parser-output defaults to being a div
  • wikitext pages explicitly pass "main" to ParserOptions::setWrapOutputTag()
  • flow-board pages explicitly pass "article" to ParserOptions::setWrapOutputTag()

Does that address your concerns?
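
The proposal above could be sketched roughly as follows. This is not MediaWiki code: the mapping, the function names, and the way the wrapper is built are all hypothetical, and only the tag choices ("div" by default, "main" for wikitext, "article" for flow-board) come from the list.

```python
# Hypothetical mapping from content model to wrapper tag.
# Anything not listed falls back to "div", matching the proposed default.
WRAPPER_TAG_BY_CONTENT_MODEL = {
    "wikitext": "main",
    "flow-board": "article",
}


def wrapper_tag(content_model: str) -> str:
    """Return the tag that would carry the mw-parser-output class."""
    return WRAPPER_TAG_BY_CONTENT_MODEL.get(content_model, "div")


def wrap_parser_output(html: str, content_model: str) -> str:
    """Wrap already-rendered HTML in the tag chosen for this content model."""
    tag = wrapper_tag(content_model)
    return f'<{tag} class="mw-parser-output">{html}</{tag}>'
```

Keeping "div" as the fallback means a content model that never opts in sees no change in its output.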

Sbapple subscribed.

Claiming this issue to work on for Capstone Software Engineering class at MSU Denver of Colorado.

Leaving this issue as our team is closing out our project for the semester.

I found the following:
The divs added in Parser.php on lines 485-490 are actually often stripped off by https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/parser/ParserOutput.php on lines 281-298.
If lines 485-490 in Parser.php are changed to add a <main> instead of a div when the class is mw-parser-output, then ParserOutput.php also needs code added to strip off the excess <main> elements.
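
As an illustration of the second change, here is a small sketch of unwrapping that accepts either tag. It is not the code in ParserOutput.php: the regex and function name are hypothetical, and it assumes the wrapper is the outermost element and that its only attribute is class="mw-parser-output".

```python
import re

# Matches one outer wrapper with class "mw-parser-output", as <div> or <main>.
# The backreference (?P=tag) requires the closing tag to match the opening one.
WRAPPER_RE = re.compile(
    r'^<(?P<tag>div|main) class="mw-parser-output">(?P<body>.*)</(?P=tag)>$',
    re.DOTALL,
)


def strip_wrapper(html: str) -> str:
    """Return the inner HTML if an outer wrapper is present, else the input unchanged."""
    match = WRAPPER_RE.match(html.strip())
    return match.group("body") if match else html
```

This is deliberately naive: input made of two sibling wrappers would be unwrapped incorrectly, so real code would need a stricter check.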

With these two changes, all of parserTests.txt passes, with the exception of the test "Non-word characters are valid in extension tags (T19663)", which already failed before the changes. The associated Phabricator issue, T19663: Parser interpretes <bXY> as <b> if XY begins with non-ascii character when $wgUseTidy=true, has been closed out. Would it be helpful to post on that issue that this test is failing on the master branch?

I would like to submit for review, but I do not think that this adequately addresses the concerns about the skins or flow-board pages. Nonetheless I hope the information posted here helps anyone to work on this issue in the future.

Izno claimed this task.
Izno subscribed.

Marking this resolved as I think <main> should be the responsibility of the skin. Vector uses it since T66477: Vector: Use semantic HTML5 elements where applicable. Other skins should use it too if they want, but this isn't filed against any other specific skins.

As noted by TheDJ, mw-parser-output wouldn't be the right place anyway.