
Moving article lead breaks with deduplicated TemplateStyles
Closed, Resolved, Public

Description

It looks like the previously resolved T263306 problem is still not fully fixed (or this might be a new one, I'm not sure). Yesterday I was wondering why the mobile versions (m-domain) of most country articles on dewiki do not start with the first paragraph but with the infobox (fortunately, the paragraph order is correct this time), e.g. Italy, while some others are displayed correctly (e.g. Germany). The difference between the articles is that Template:IPA (which uses TemplateStyles) is used only once in the lead of the Germany (and Austria) article, but at least twice in all other articles I checked. I guess this means that the HTMLFormatter still stumbles on the TemplateStyles tags (this time the deduplicated ones). Since the paragraph order is not shuffled this time, it is less dramatic, but I hope it can be fixed!

Event Timeline

This looks like it's due to the following being output inside Infobox Staat:

<p><span style="display:none;"><a href="/w/index.php?title=Vorlage:Infobox_Staat/Wartung/NAME-DEUTSCH&amp;action=edit&amp;redlink=1" class="new" title="Vorlage:Infobox Staat/Wartung/NAME-DEUTSCH (Seite nicht vorhanden)">Vorlage:Infobox Staat/Wartung/NAME-DEUTSCH</a></span>
</p>

Ideally that should be a child of the infobox itself if needed.

My understanding is that the lead paragraph cannot be shifted if there is a paragraph hidden by CSS: the algorithm has no knowledge of CSS visibility.

There is a rewrite planned in T262093, but I don't think this particular case can be fixed without editing the underlying article.

Oh, I see! So it is not related to the TemplateStyles problem? In that case I can obviously work on fixing the infobox template (although it would be nice if the algorithm could cope with such cases, as the use of hidden links as an alternative to tracking categories is quite common in infoboxes on dewiki).

EDIT: I have fixed it by using divs instead of spans for the hidden links. If there is any chance the algorithm can be adapted to recognise hidden paragraphs, I would modify this task; otherwise it can be closed.

Esanders claimed this task.

We can't evaluate the visibility of a paragraph in the PHP code, as it depends on the stylesheet tree and inline style attributes. Accordingly, we only treat a paragraph as empty when it has no text content, ignoring some very specific known elements (<span id=coordinates>).

Using a <div> to contain hidden metadata seems like a reasonable workaround, so I don't see any need to further complicate the logic.
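
For anyone reading this later, the heuristic described above can be sketched roughly as follows. This is only an illustration in Python, not the actual PHP code: the names ParagraphCollector, first_nonempty_paragraph and VOID_TAGS are made up for this example, and the sample markup is a shortened stand-in for the Infobox Staat output. It assumes a paragraph counts as "empty" purely by its text content, with text inside <span id=coordinates> ignored, and it never looks at CSS.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ParagraphCollector(HTMLParser):
    """Collect the text content of every top-level <p> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0            # number of currently open elements
        self.in_paragraph = False
        self.skip_from = None     # depth at which an ignored span was opened
        self.paragraphs = []      # text content of each top-level <p>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if tag == "p" and self.depth == 0:
            self.in_paragraph = True
            self.paragraphs.append("")
        elif (tag == "span" and self.skip_from is None
              and dict(attrs).get("id") == "coordinates"):
            # Known special case: coordinates do not count as text.
            self.skip_from = self.depth
        self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.skip_from is not None and self.depth == self.skip_from:
            self.skip_from = None
        if tag == "p" and self.depth == 0:
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.skip_from is None:
            self.paragraphs[-1] += data


def first_nonempty_paragraph(html):
    """Return (index, text) of the first top-level <p> with text, or None."""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    for index, text in enumerate(collector.paragraphs):
        if text.strip():
            return index, text.strip()
    return None


INFOBOX = '<table class="infobox"><tr><td>Infobox</td></tr></table>'
LEAD = "<p>Italien ist ein Staat.</p>"

# Hidden link wrapped in a span: it sits inside a <p> and has text content.
with_span = ('<p><span style="display:none;"><a href="#">Wartung</a></span>\n</p>'
             + INFOBOX + LEAD)
# Same hidden link in a div: no <p> is involved at all.
with_div = ('<div style="display:none;"><a href="#">Wartung</a></div>'
            + INFOBOX + LEAD)

print(first_nonempty_paragraph(with_span))  # (0, 'Wartung')
print(first_nonempty_paragraph(with_div))   # (0, 'Italien ist ein Staat.')
```

With the span variant, the hidden maintenance link is picked up as the first non-empty paragraph, because the inline style is never evaluated. With the div variant there is no paragraph around the hidden link, so the real lead paragraph is the first one found.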