Display proper section collapsing within article namespace
Closed, DeclinedPublic
Actions

Assigned To

None

Authored By

	ovasileva
	Sep 28 2017, 1:05 PM

Description

Functional Requirements

General

Sections must be collapsible on the mobile website. User must be able to collapse and uncollapse sections

Mobile devices

For mobile devices, sections must appear collapsed by default

if a user uncollapses a section, the system must save the user's preference for the length of the user session

Larger devices

For desktop and tablet devices, all sections must appear uncollapsed by default except for the following:

references
notes

If a user uncollapses a section, the system must save the preference for the length of the user session

Related Objects
Search...

Status	Assigned	Task
Declined	ovasileva	T176958 Display article content within article namespace
Declined	None	T176959 Display proper section collapsing within article namespace
Open	None	T179715 Enable section collapsing for readers

Event Timeline

ovasileva created this task.Sep 28 2017, 1:05 PM

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 28 2017, 1:05 PM

ovasileva added projects: Epic, Marvin.Sep 28 2017, 1:21 PM

ovasileva updated the task description. (Show Details)

• Jhernandez renamed this task from [EPIC] Display proper section collapsing within article namespace to Display proper section collapsing within article namespace.Sep 28 2017, 1:28 PM

• Jhernandez moved this task from Needs Triage to Next Q Epics (2018 Q3) on the Marvin board.

• Jhernandez added a parent task: T176958: Display article content within article namespace.

• Jhernandez moved this task from Next Q Epics (2018 Q3) to Current Epics (2017-Q2) on the Marvin board.Oct 3 2017, 5:26 PM

One of my action items from the last Marvin sync meeting was to advise you of changes in sectioning in the MCS/PCS:

AFAIK Reading Infra is looking to defer to the sectioning algorithm implemented in Parsoid in MCS/PCS /cc @bearND (who can provide detail about a timeline, if one exists).

The section algorithm in Parsoid differs from that in MCS/PCS, which itself differs from that in MobileFrontend, but its design and underlying principles are well documented and tested. Critically, the design of the Parsoid algorithm introduces the notion of pseudo-sections that are section elements introduced in order to deal with markup wrapping wikitext sections. These pseudo-sections don't correspond to sections in the wikitext document (and are marked up to indicate this) and so will need special handling in any section collapsing code.

e.g. this template leaves an unclosed div tag in the HTML of the https://fr.wikipedia.org/wiki/Discussion_Projet:M%C3%A9decine page, which becomes difficult to section up. Indeed, MobileFormatter#makeSections fails to section the page appropriately: https://fr.m.wikipedia.org/wiki/Discussion_Projet:M%C3%A9decine

As of PS 23, @ssastry's patch for sectioning in Parsoid, in the edge case, Parsoid would combine the 1st wiki text section with the lead section. You would get all other sections as expected from Parsoid.
While this is not ideal, we're talking about cases that are flagged by the linter and should eventually be fixed by editors. At least that's my understanding.

BTW, this example we're talking about here is a talk page. The task title here talks about article namespace.

Adding <section> tags will be given by Parsoid with T114072. How much of additional DOM transformations should be done by
Parsoid or at the PCS layers to support section collapsing? I'm mainly talking about wrapping the section content (after the heading) in a <div>. Where should this live?

• Mholloway subscribed.Oct 17 2017, 3:22 PM

@phuedx To follow-up on our conversation today, it sounded like that wrapping the section content (after the heading) in a div is not needed. Is that correct?

In T176959#3692559, @bearND wrote:

@phuedx To follow-up on our conversation today, it sounded like that wrapping the section content (after the heading) in a div is not needed. Is that correct?

I believe this to be true as but it might have a slight performance impact for lower end Grade C/X browsers. For Grade C/X browsers (or Grade A browsers on low quality connections), we currently rely on inlined JavaScript to provide a basic section toggling experience but, essentially, what that does is apply a CSS class to the section container. So we rely on the browser's selector engine to do the work of picking the elements to show/hide.

If we didn't have a <div> inside the <section> element, then the DOM would look something like:

<section class="closed">
  <h2>Foo</h2>
  <p>Bar</p>
  <p>Baz</p>
  <img ... />
</section>

So we'd use [[ https://developer.mozilla.org/en-US/docs/Web/CSS/General_sibling_selectors | the general sibling combinator (~)]] to target the non-h2 tags, i.e. section.closed h2 ~ * { display: none }. The use of that combinator restricts this solution to IE7+, Opera 9+ (which is great but should be noted!) but the universal selector (*) does have a performance cost. However, I don't think that concern should carry much weight unless we can demonstrate that users are toggling sections very frequently.

Other members of Readers Web (RW) may disagree with me 😉 What do you think, @Jdlrobson?

Also, I did mention that we'd investigated simplifying the section collapsing code before. Here are the tickets that track the conversation (in order):

For Grade C/X browsers (or Grade A browsers on low quality connections), we currently rely on inlined JavaScript to provide a basic section toggling experience but, essentially, what that does is apply a CSS class to the section container. So we rely on the browser's selector engine to do the work of picking the elements to show/hide.

I don't think we do this... grade C and X browsers do not getting section collapsing at all, although we've talked about how it would be good if they do.

it sounded like that wrapping the section content (after the heading) in a div is not needed. Is that correct?

I see the proposed HTML as problematic for section collapsing.
Consider the following case:

<section class="closed">
  <h2>Foo</h2>
  Hello world
</section>

There's no way in CSS to collapse the text but not the heading. We encounter this in the PHP parser. For this reason alone, we made the decision to not wrap the heading element. I'm not sure if this is possible in Parsoid.

Other drawbacks of this approach, is that it limits what you can do - for instance if you wanted to animate the closing of the section - that would be impossible. I'd advise for more flexibility here as product vision can change.

Pulling out the h2 via JavaScript is an option, but could interfere with styling rules given it changes the document hierarchy.

If we're rendering via MCS of course this goes away, as you have both heading and text of each page but I assume we are shipping HTML to Marvin and that's what we're talking about?

In T176959#3695274, @Jdlrobson wrote:

Other drawbacks of this approach, is that it limits what you can do - for instance if you wanted to animate the closing of the section - that would be impossible. I'd advise for more flexibility here as product vision can change.

Could you explain why the animation the closing of the section is impossible? Are there any other concrete limitations?

CSS animations operate on a single element so you wouldn't be able to animate several elements.

Likewise you wouldn't be able to add a border around the entire content as you need a container to do that.

You wouldn't be able to give the content area a different background from the heading etc..

So, should we ask @ssastry if Parsoid could add the <div> wrapping of the section content to the <section> patch as well? (Or in a follow-up patch but deploy them at the same time.)

@phuedx @Jdlrobson It sounds like we could preemptively add this extra div in the Parsing patch and that way give you the flexibility you need. Does that sound reasonable to you?

cscott subscribed.Oct 25 2017, 8:29 PM

The proposal is for the Parsoid HTML to look like:

<section data-section-number="1"><h2 id="html5id"><span id="legacyHtml4id" typeof="mw:FallbackId"></span>Heading text</h2><div typeof="mw:CollapseWrapper">
...
</div></section>

and the wrapper <section>, wrapper <div> and empty <span> would be ignored when converting html2wt. Then you could write CSS rules for collapsing such as (say):

h2 + div { display: none; } /* collapse all top-level sections by default */
h2:target + div { display: block; } /* display the target of a link */
h2#References + div, h2#Notes + div { display: block; } /* display references and notes */
section[data-section-number="0"] > div { display: block } /* display lead section */
section:first-child > div { display: block } /* an alternative way to display lead section, works even if lead section has wikitext errors */
section[data-pseudo-section] > div { display: block } /* pseudo-sections can't be collapsed, due to some wikitext error in the source */

Does this seem reasonable?

EDIT: See below for a slightly updated proposal

ssastry mentioned this in T179104: VE should handle <section> wrapping code that Parsoid will add.Oct 26 2017, 6:24 PM

This looks great! It provides clear separation of section heading and content.

(I should defer to @Jhernandez have the final say though before this becomes finalised!)

Jdlrobson added a subscriber: • Jhernandez.Oct 27 2017, 1:18 PM

In T176959#3715586, @Jdlrobson wrote:

This looks great! It provides clear separation of section heading and content.

+1 The proposed Parsoid HTML looks good to me.

I think there needs to be a balance between which use cases require the client to unwrap nodes (e.g. editors which want to ignore the section wrappers) and which cases might require the client to add more wrapping.

Clearly the section wrappers provide additional semantic information, but do these CollapseWrappers add any additional information, or are they just to make life easier for reading clients which collapse content? If it's the latter perhaps it should just be up to the client to work out which part of the section is header and which part is content.

Also if they are kept I would call them sectionHeader/sectionBody, not CollapseWrapper, as the latter describes just one particular use case of a SectionBody.

In T176959#3718222, @Esanders wrote:

I think there needs to be a balance between which use cases require the client to unwrap nodes (e.g. editors which want to ignore the section wrappers) and which cases might require the client to add more wrapping.

Clearly the section wrappers provide additional semantic information, but do these CollapseWrappers add any additional information, or are they just to make life easier for reading clients which collapse content? If it's the latter perhaps it should just be up to the client to work out which part of the section is header and which part is content.

Indeed. We had this discussion and I was initially hesitant to add the additional <div> wrapper for that precise reason.

From what I understood from the discussion above (with my limited understanding of css), without the <div> tag, section collapsing functionality cannot be supported readily / easily.

Given that, I reasoned that since editing clients are going to be stripping the <section> tags already, it is better to provide full support for this use case in one place and have editing clients strip both in one shot instead of having editing clients strip <section> tags and reading clients add the <div> tags in MCS/PCS layers.

So, unless there is a solution that doesn't require adding the <div> tag anywhere, this is probably the best compromise solution.

Also if they are kept I would call them sectionHeader/sectionBody, not CollapseWrapper, as the latter describes just one particular use case of a SectionBody.

Yes, my gerrit patch uses mw:SectionContent instead of the non-semantic mw:CollapseWrapper.

So, unless there is a solution that doesn't require adding the <div> tag anywhere, this is probably the best compromise solution.

It depends if you think there is a requirement to have a CSS only solution. You'll need some JS to add click handlers, so that code could easily add the required wrappers too. It seems like we're adding redundant non-semantic HTML to support one client's specific use case.

In T176959#3718448, @Esanders wrote:

So, unless there is a solution that doesn't require adding the <div> tag anywhere, this is probably the best compromise solution.

It depends if you think there is a requirement to have a CSS only solution. You'll need some JS to add click handlers, so that code could easily add the required wrappers too.

That is something for @Jdlrobson or @phuedx to answer based on product needs.

That is something for @Jdlrobson or @phuedx to answer based on product needs.

Well it affects all users of Parsoid. Having the body wrapper in the HTML will make life easier for clients that want to collapse sections, but puts a burden on other clients (bandwidth and extra code).

FYI such wrapping code would be fairly simple:

Array.prototype.forEach.call( document.querySelectorAll( 'section > h2' ), function ( element ) {
  var wrapper = document.createElement( 'div' );
  while ( element.nextSibling ) { wrapper.appendChild( element.nextSibling ); }
  element.parentNode.appendChild( wrapper );
} );

In T176959#3718913, @Esanders wrote:

That is something for @Jdlrobson or @phuedx to answer based on product needs.

Well it affects all users of Parsoid. Having the body wrapper in the HTML will make life easier for clients that want to collapse sections, but puts a burden on other clients (bandwidth and extra code).

Understood. The initial design of Parsoid HTML was focused around semantic HTML and needs of editing clients. As we are now shifting the focus to supporting read clients, there will be changes to the HTML output to accommodate reading clients. And, sometimes these requirements will conflict. Sometimes, it is possible that mobile and desktop requirements may conflict. So, we'll have to figure this out for every scenario where these conflicting requirements pop up. Sometimes performance / rendering time considerations might influence that decision since read views are going to be the common case (ex: data-mw split). Of course, in some cases, another mobile-content /page-content service layers might have to tackle that, but ideally the more transformations are in Parsoid, the better. But, yes, we won't make decisions that will change the fundamental nature of Parsoid markup being semantic markup.

For this specific case, I agree that the <div> wrapper is primarily there for a UI need, which is why I asked for comment from @Jdlrobson and @phuedx about the specifics of CSS vs non-CSS based collapsing. It is only ~50 chars per section so, it is not serious bloat or network bandwidth heavy.

As for additional code, as I indicated earlier, editing clients (VE, CX) are already having to add code to strip the outer <section> tags, At that point, you might as well strip the <div> tag as well -- it is not substantially more complexity. The alternative would be to not add these <div> tags and then all clients would have extra code for it (editing clients to strip <section> tags and mobile reading clients to add the <div> tag) which feels worse.

Anyway, at this point, all this is hinging on some more specifics about the use case.

In T176959#3718938, @Esanders wrote:

FYI such wrapping code would be fairly simple:

Array.prototype.forEach.call( document.querySelectorAll( 'section > h2' ), function ( element ) {
  var wrapper = document.createElement( 'div' );
  while ( element.nextSibling ) { wrapper.appendChild( element.nextSibling ); }
  element.parentNode.appendChild( wrapper );
} );

Yes we have considered this in the past but the radical restructure would cause a reflow for us as well as a repaint since we need to provide server side rendered.
I imagine this is also not feasible on older mobile devices viewing large pages with low memory / low CPU. Long term we've been thinking lots about ways we can provide section collapsing to grade C browsers either via lightweight JS or CSS only solutions. Solutions like T176959#3711269 are appealing. If Parsoid didn't provide this we would have to reimplement it in the mobileapps repo.

In T176959#3718913, @Esanders wrote:

Well it affects all users of Parsoid. Having the body wrapper in the HTML will make life easier for clients that want to collapse sections, but puts a burden on other clients (bandwidth and extra code).

Interesting. Could you elaborate on this? What exactly is this burden? What extra code are we talking about here? What problems do you foresee? How does it compare to a client having to restructure the entire page ?

Backing up, let's describe a general range of markups:

At one extreme, Parsoid DOM is a compact representation of the semantic information in the original wikitext, suitable for storing in a database without redundancy. That is, every tag or attribute present in the DOM corresponds 1-to-1 with some wikitext property. IE in the wikitext [[Foo.jpg|45x23]] we have exactly one place where 45x23 shows up (the width and height of an <img> element). This format is best for long-term storage, since we're not wasting storage space on unnecessary or redundant information.
Backing off dogmatism a bit, we add *some* redundant information to aid editors and viewers. For example, we add data-file-width to indicate the native unscaled width of a linked resource, and we add img src tags pointing to the appropriate thumbnail image, even though when serializing both of these bits of information are ignored (we use the resource attribute for the source, and we currently don't need to know the native width/height in order to serialize). We add both "legacy HTML4 id" and "HTML5 id" attributes so that anchor navigation will work (even on old browsers), even though technically speaking the information is redundant with the contents of the <h2> etc tags.
Further off our dogmatism, we add some information which isn't strictly speaking a property of the underlying wikitext at all, like red-link information, and just resign ourselves to regenerating our HTML when redlink status changes.
Finally, on the other extreme, we generate markup identical to what we'd like readers to view in a browser, and teach the editor to strip out redundant or superfluous information. This is near where we are with the section wrapping <div>s.

I think Parsoid originally imagined that it could be a pure semantic markup, like #1. Over time we've added a number of "convenience" features. Unfortunately, we haven't quite reached #4, since mobile views still need to do quite a number of transformations on our output (and the editor doesn't want to use the read view as a source for editing because of the presence of browser "add ons" which could corrupt the read view HTML).

We've tried to get a compromise between #1 and #4 at least partly because we don't want to have to "render" stored DOM before it can be viewed. Ideally our storage layer would spit back viewable HTML w/o requiring further rendering.

One answer may be to go back and restructure our parsoid toolchain a little bit more (eventually) to separate out something closer to #1 as "storage format DOM", and then explicitly structure the path to #4 as a set of transformations, with something close to #4 as a "read format DOM" -- and then we already have separated out data-mw attributes to form an "edit format DOM", and data-parsoid attributes to form a further "round-trippable DOM".

Or else we'll keep muddling through with a compromise DOM format which tries to be directly viewable and not too inefficient for storage or data extraction.

In T176959#3721126, @cscott wrote:

Backing up, let's describe a general range of markups:
... snip ...
One answer may be to go back and restructure our parsoid toolchain a little bit more (eventually) to separate out something closer to #1 as "storage format DOM", and then explicitly structure the path to #4 as a set of transformations, with something close to #4 as a "read format DOM" -- and then we already have separated out data-mw attributes to form an "edit format DOM", and data-parsoid attributes to form a further "round-trippable DOM".

Or else we'll keep muddling through with a compromise DOM format which tries to be directly viewable and not too inefficient for storage or data extraction.

Thanks for that useful summary and perspective! The original plan was indeed to have things like redlinks, etc. be post processing passes, but performance and rendering parity pragmatism considerations overrode that. This decision cannot be made in isolation without consideration of other aspects of our tech stack -- something which has been the focus of internal discussions lately.

In any case, this is probably best discussed in a separate ticket / wiki page since this is not resolvable in the immediate term for the purposes of the section-ing discussion.

Sure, and I apologize for not wrapping up my comment with a concise and obviously-correct plan of action for this bug. ;) Alas, the matter is not simple. I guess I was trying to provide the long-view to reassure both @Esanders and @Jdlrobson that even if we ultimately don't select the format variant that they seem to prefer, that we recognize the inherent tension and believe that they are *both* fundamentally correct.
We just need to do a bit more long-term work to figure out how to restructure things to obtain the best of both possible representations.

I think it's fair to say that @ssastry and I feel that the important short-term task is to facilitate the use of Parsoid HTML for read views, since that's a blocker for the eventual reunification of Parsoid with core (and deprecation of the legacy parser). "HTML-only wikis" is the roadmap item we use as shorthand to describe using HTML for storage, and that's a bit further off/lower priority on our roadmap. That makes us more likely to add things like <div> wrappers now, and figure out how to refactor them out of the storage format later.

(And really, the Rubicon was crossed a while ago in terms of adding reader-oriented redundancies, so it seems unwise to attempt a quixotic stand at this late point.)

Lots of good points. Wrapping the content is a nice convenience for reads but we could maybe get away by other means if it is a big problem.

I agree with Ed that mw:CollapseWrapper is not a good name, I'd prefer mw:SectionContent that Subbu mentioned.

In T176959#3720643, @Jdlrobson wrote:

Yes we have considered this in the past but the radical restructure would cause a reflow for us as well as a repaint since we need to provide server side rendered.

Could you explain how the Parsoid content is making it to the page first time in your use case? I was imaging this pass would take place before the first paint so as to avoid a reflow (be that in the client or on the server).

I imagine this is also not feasible on older mobile devices viewing large pages with low memory / low CPU.

I'm sceptical about this. As cscott mentions there are other page-wide passes that take place to transform the content for mobile devices, I don't see how this would be any different. If the page is large and memory low, then parsing the untransformed DOM itself will represent the vast majority of the performance bottleneck. Splicing in a few extra nodes shouldn't add to that.

In T176959#3721126, @cscott wrote:

Further off our dogmatism, we add some information which isn't strictly speaking a property of the underlying wikitext at all, like red-link information, and just resign ourselves to regenerating our HTML when redlink status changes.

I think in this case there was a clear demand from pretty much every client (desktop, mobile, editors) and an obvious performance penalty if left up to the client (an extra API request).

At the moment the amount of code to do the transformation either way is going to be similar, but as this is the first time we're upstreaming mobile-specific DOM transformations we need to think carefully about where we want to set our boundaries (cf cscott's escalation of markup list).

Jdforrester-WMF subscribed.Oct 31 2017, 3:25 PM

Updated DOM proposal, just to keep the discussion in sync:

<section data-mw-section-id="1"><h2 id="html5id"><span id="legacyHtml4id" typeof="mw:FallbackId"></span>Heading text</h2><div typeof="mw:SectionContent">
...
</div></section>

With data-mw-section-id equal to -1 for uneditable sections and -2 for pseudo-sections. The wrapper <section>, wrapper <div> and empty <span> would be ignored when converting html2wt. Checking suitability for our use case, CSS rules for collapsing would look something like:

h2 + div { display: none; } /* collapse all top-level sections by default */
h2:target + div,                        /* display the target of a link */
h2#References + div, h2#Notes + div,    /* display references and notes */
section[data-section-number="0"] > div, /* display lead section */
section[data-mw-section-id="-1"] > div, /* uneditable sections can't be collapsed, due to some wikitext issue in the source */
section[data-mw-section-id="-2"] > div  /* pseudo-sections can't be collapsed, due to some wikitext issue in the source. */
{
  display: block;
}

Just want to jump in here with a perspective given the role of the Mobile-Content-Service (and its successor, the Page Content Service) which are responsible for formatting Parsoid content for Reading client consumption.

Currently, the MCS is performing various transforms on Parsoid content and then storing them in Cassandra.
While the MCS stored the results of it's transforms primarily in JSON, the PCS is being designed to store the results of these transforms in HTML (what we are calling the compatibility layer) with the purpose of upstreaming them to Parsoid if they are useful to all clients.
Unless Parsoid performs ALL transformations required by reading clients, we will always being storing a "Reading Optimized" version of the HTML.

Ideally, we would love if we could eliminate the compatibility layer… but are conscious of the fact that we have many needs of the HTML and that may never be entirely possible.

I guess there are 3 ways to to decide what to implement in Parsoid:

Favor the editing experience
Favor the reading experience
Do the conceptually "right" thing

I feel like the solution being suggested here is taking route 3, which has some elegance and reduces the amount of code needed in the PCS, but at the expense of both clients now needing to massage the Parsoid response.

I mentioned this (less eloquently) to @ssastry during the services sync, but now I wanted to ask this more specifically as the question has been raised again: Given that the PCS will continue to perform transforms on Parsoid content resulting in us having 2 cached versions of the HTML (one Parsoid and one PCS) for the foreseeable future, should we just do option 1 and let Parsoid just favor the editing experience so that only one team needs to massage the content?

Or to say it another way… the needs of VE should always be "upstreamed" into Parsoid, while the needs of Reading Clients should only upstreamed if they don't affect VE. And more concretely, is it better for us to implement this in PCS instead of Parsoid?

In T176959#3720643, @Jdlrobson wrote:

Yes we have considered this in the past but the radical restructure would cause a reflow for us as well as a repaint since we need to provide server side rendered.

The sections start collapsed, so you'd be manipulating display:none-d nodes, right? I'm pretty sure no modern browser does a repaint for that.

I imagine this is also not feasible on older mobile devices viewing large pages with low memory / low CPU. Long term we've been thinking lots about ways we can provide section collapsing to grade C browsers either via lightweight JS or CSS only solutions. Solutions like T176959#3711269 are appealing.

Again, as long as you are manipulating hidden nodes, there doesn't seem to be much CPU / memory impact. You'd add one node per top-level section, the browser would calculate the computed value of display, and after that you'd just remove hidden nodes and add nodes to a hidden parent - not even CSS computed values get calculated for something like that.

In T176959#3723286, @Esanders wrote:

At the moment the amount of code to do the transformation either way is going to be similar, but as this is the first time we're upstreaming mobile-specific DOM transformations we need to think carefully about where we want to set our boundaries (cf cscott's escalation of markup list).

Yup, but I feel unless there is a clear guiding principle for how we know this, this will always be on a case-by-case basis, i.e. unless we tow a very strict line about only semantic markup being part of Parsoid (which it isn't in the absolute sense as Scott already covered). So, we have to make this decision now for this scenario.

In T176959#3724504, @Fjalapeno wrote:

Ideally, we would love if we could eliminate the compatibility layer… but are conscious of the fact that we have many needs of the HTML and that may never be entirely possible.

I expect this will be true always (especially since mobile and desktop considerations will always be different). Even for the PHP parser, there was the mobile front end. Layered transformations are the way to go, but where we try to reduce the # of layers, as well as the thickness of the layers, so if you used a cache for an upper layer, a cache miss is not very expensive.

I guess there are 3 ways to to decide what to implement in Parsoid:

Favor the editing experience

Favor the reading experience

Do the conceptually "right" thing

I feel like the solution being suggested here is taking route 3, which has some elegance and reduces the amount of code needed in the PCS, but at the expense of both clients now needing to massage the Parsoid response.

I mentioned this (less eloquently) to @ssastry during the services sync, but now I wanted to ask this more specifically as the question has been raised again: Given that the PCS will continue to perform transforms on Parsoid content resulting in us having 2 cached versions of the HTML (one Parsoid and one PCS) for the foreseeable future, should we just do option 1 and let Parsoid just favor the editing experience so that only one team needs to massage the content?

Or to say it another way… the needs of VE should always be "upstreamed" into Parsoid, while the needs of Reading Clients should only upstreamed if they don't affect VE. And more concretely, is it better for us to implement this in PCS instead of Parsoid?

I think (3) will continue to be our guiding principle because it supports clients outside the WMF where the specific immediate needs of reading or editing don't apply. However, where pragmatic considerations dictate directions, we should make compromises towards 1. or 2. depending on the specific use case as long as it doesn't seriously undermine the conceptual markup model.

This specific example we are discussing here (the extra <div> tag to wrap a section's content) is a fairly minor irritant along any of of those 3 considerations you outline above. The div is specifically demarcating a section's content from its heading (independent of whitespace, comments, etc. that might intervene) and so is, in some sense, a semantic wrapper. Since it doesn't bloat the DOM in any meaningful way nor does it introduce serious performance issues that I can tell, it feels like its presence can be justified even if a reading use case motivated it. Similarly, if we favored the editing experience, it will be a minor irritant for reading from what I understand. Likewise it seems to be the case for editing.

So, I think this decision is best made on its own merits -- if there is a <div>less solution that works fine, that is the first choice. If not, I think it is a matter that reading and editing need to resolve in terms of who is most inconvenienced by it. If you want us to make the decision, we will go into a huddle and come out with a solution for you all. :-)

The sections start collapsed, so you'd be manipulating display:none-d nodes, right?

No.. if they're not wrapped they can't be hidden (unless you resort to sibling selectors specifically for this purpose but that would be over engineering from our side.

ssastry added a comment.Oct 31 2017, 8:37 PM

This comment was removed by ssastry.

ssastry added a comment.Oct 31 2017, 8:39 PM

This comment was removed by ssastry.

Yeah, @Tgr's comment (ironically?) indicates why the <div> wrapping may be a Good Thing, semantically -- without the <div> there's no clean way to name "the contents of the section" (as opposed to the heading) in an XPath/CSS sort of way, hence you can't "start out collapsed" without also hiding all the headings.

Although it seems you might be able to get away with:

section > * { display: none }
section > *:first-child { display: block }

in actual practice the child Text nodes are not styled by this, so <section><h2>Foo</h2> bar <p> bat </section> would still display bar (but not bat).

That is all irrelevant as along as the transformation is done before the HTML is first appended to the document, which is the question I asked @Jdlrobson in T176959#3723286

That is all irrelevant as along as the transformation is done before the HTML is first appended to the document

Marvin is an isomorphic web app that will need to provide a server side rendering fallback. Yes, the transformation needs to be done before the HTML is first appended to the document and that either needs to happen in the mobile content service or Parsoid on the server not in the client.

I'm getting the feeling this might benefit from an in-person sync.

@ssastry thanks… I just wanted to put it out there as a possibility in case it eases the problem. I'm personally not dogmatic here, so if it can work in Parsoid great, and if not, then putting in the PCS is also fine. Either way, the Reading web team can get what they need.

I will defer to you and Editing.

@cscott I thought the parser always wraps top-level content into a block tag?

Anyway, why not get rid of <section> and wrap the section body in a div? That way the level of client-specific bloat stays the same, and it doesn't seem as if section tags would be useful to anyone; and they don't have any real semantic meaning anyway, given that the MediaWiki section structure is not DOM-based.

In T176959#3724883, @Tgr wrote:

Anyway, why not get rid of <section> and wrap the section body in a div? That way the level of client-specific bloat stays the same, and it doesn't seem as if section tags would be useful to anyone; and they don't have any real semantic meaning anyway, given that the MediaWiki section structure is not DOM-based.

Except if we start shifting semantics in that direction when we use Parsoid HTML for all read/edit views. And, our plan is to start moving structured semantics for wikitext constructs. It is not happening anytime in the immediate future, but there will be well-formed / DOM semantics to these. So, we can as well embrace them now.

Quick note that section collapsing on mobile viewports is what we do now on mobile web, but it is very possible that the model will change in the future, based on product requirements from user tests. We have already seen that the apps are using a different TOC and section UX so that may be the direction that mobile web is asked to take in the future.

I would recommend going with the mentioned 3. Do the conceptually "right" thing as @cscott mentions, and then perform reading specific transforms in the PCS layer, with an eye to upstreaming them to the parsoid layer when proven stable. That way we avoid unnecessary changes and synchronization between teams because of the parsoid layer if the reading requirements shift.

Kind of unrelated, semantically speaking, I feel like the section header should also be marked. There are elements that are added to the section header, like the collapse chevron, or the section editing button, that should not be part of the heading but the header. Something like:

<section>
  <header>
    <h2>Title</h2>
  </header>
  <div typeof="mw:SectionContent">
    <!-- Content -->
  </div>
</section>

Which could be enhanced like:

<section>
  <header>
    <button>Collapse</button>
    <h2>Title</h2>
    <button>Edit</button>
  </header>
  ...

It is not a big deal if we don't use a header, but we should be mindful that a heading and a header are not necessarily the same.

In T176959#3725938, @Jhernandez wrote:

I would recommend going with the mentioned 3. Do the conceptually "right" thing as @cscott mentions, and then perform reading specific transforms in the PCS layer, with an eye to upstreaming them to the parsoid layer when proven stable. That way we avoid unnecessary changes and synchronization between teams because of the parsoid layer if the reading requirements shift.

In T176959#3724873, @Jdlrobson wrote:

I'm getting the feeling this might benefit from an in-person sync.

Thanks everybody. We'll not add the <div> wrapper then. I'll update my patch accordingly.

• Niedzielski subscribed.Nov 1 2017, 2:34 PM

+1 to adding <header> tags (in PCS or wherever). I wonder whether adding <header> and content wrapper would make round tripping more reliable.

@bearND can you create a ticket for the work you need to do in PCS to make this work for Reading Web?

@bearND or just repurpose this ticket if thats ok with the Web team

@Fjalapeno @bearND this tix is/was mostly about the UI part for us, but it has a lot of conversation. I'm happy to make a new one for us and unlink it this one from our board if you want to preserve the convo here! Let me know what you decide.

• bearND created subtask T179715: Enable section collapsing for readers.Nov 3 2017, 6:21 PM

• bearND mentioned this in T179715: Enable section collapsing for readers.Nov 3 2017, 6:24 PM

I've created a new task for the PCS implementation. @ovasileva and @Jhernandez sorry for hijacking your ticket. Feel free to create a new one for the web team and relink as you see fit.

Declining open Marvin tasks as per T203749#4605708

Display proper section collapsing within article namespaceClosed, DeclinedPublicActions