References section heading missing in Portuguese
Closed, Resolved | Public | BUG REPORT

Description

Hi. In the Portuguese Wikipedia, an article's References section is defined using a template that outputs the section heading:

Predefinição:Referências

See, for example, a random article: https://pt.wikipedia.org/wiki/Sociedade_Floresta_Aurora

Unfortunately, this confuses the mobile-html output, which doesn't display any References section in the table of contents; the list of references still appears, but under the previous section (normally, a See Also section).

Here is the template source code:
https://pt.wikipedia.org/w/index.php?title=Predefini%C3%A7%C3%A3o:Refer%C3%AAncias&action=edit

Event Timeline

Dbrant renamed this task from Android app: references section heading missing in Portuguese to References section heading missing in Portuguese. Aug 19 2021, 12:35 PM
Dbrant updated the task description. (Show Details)
Aklapper added a subscriber: Dbrant.

@Dbrant: I assume this is about the Page Content Service codebase? If not please correct. Thanks!

vadim-kovalenko changed the task status from Open to In Progress. Jan 16 2023, 10:32 AM
vadim-kovalenko claimed this task.
vadim-kovalenko moved this task from Needs Triage to In Progress on the Page Content Service board.

The problem is that Parsoid doesn't wrap the reference list in its own section. This happens for all pt wiki articles. Here is an example:

In https://pt.wikipedia.org/api/rest_v1/page/html/Sebastian_Vettel the topics Vitórias por equipe and Referências belong to the same section, with data-mw-section-id=13 (see the screenshot)

[Screenshot: pt wiki.png]

By contrast, the en wiki Parsoid output (check https://en.wikipedia.org/api/rest_v1/page/html/Canada) has a separate section for each topic (second screenshot)

[Screenshot: en wiki.png]

Update: I left a comment on the template's discussion page.
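For anyone who wants to check a page without reading the raw HTML by eye, here is a minimal sketch in Python. It lists which headings share an enclosing `<section>` in Parsoid-style output. The helper names (`SectionHeadingCollector`, `headings_by_section`, `shared_sections`) and the trimmed `SAMPLE` markup are my own illustration, not part of Parsoid or the Page Content Service; only the `data-mw-section-id` attribute comes from the actual output.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SectionHeadingCollector(HTMLParser):
    """Record, for each heading, the id of its innermost enclosing <section>."""

    def __init__(self):
        super().__init__()
        self.section_stack = []  # data-mw-section-id of each open <section>
        self.headings = []       # (section_id, heading_id) in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "section":
            self.section_stack.append(attrs.get("data-mw-section-id"))
        elif tag in HEADING_TAGS:
            enclosing = self.section_stack[-1] if self.section_stack else None
            self.headings.append((enclosing, attrs.get("id")))

    def handle_endtag(self, tag):
        if tag == "section" and self.section_stack:
            self.section_stack.pop()


def headings_by_section(html):
    """Group heading ids by the section that directly contains them."""
    collector = SectionHeadingCollector()
    collector.feed(html)
    grouped = {}
    for section_id, heading_id in collector.headings:
        grouped.setdefault(section_id, []).append(heading_id)
    return grouped


def shared_sections(html):
    """Keep only sections that directly contain more than one heading."""
    return {sid: ids for sid, ids in headings_by_section(html).items() if len(ids) > 1}


# Hand-trimmed stand-in for the kind of markup described above (not real API output).
SAMPLE = (
    '<section data-mw-section-id="0"></section>'
    '<section data-mw-section-id="1">'
    '<h2 id="foo">foo</h2><p>bar</p>'
    '<h2 id="Referências" typeof="mw:Transclusion">Referências</h2>'
    '<div class="reflist"></div>'
    '</section>'
)

print(shared_sections(SAMPLE))  # {'1': ['foo', 'Referências']}
```

An empty result means every heading has a section of its own. To run this against a real page, the HTML from the REST URL would need to be fetched separately and passed in as a string.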

vadim-kovalenko changed the task status from In Progress to Open. Jan 17 2023, 12:47 PM
vadim-kovalenko removed vadim-kovalenko as the assignee of this task.
vadim-kovalenko moved this task from In Progress to Needs Triage on the Page Content Service board.
vadim-kovalenko subscribed.

Reproducible as follows. For some reason, the section wrapping code swallows the templated references section into the previous section. To be investigated.

~/work/wmf/parsoid (master ✘)✭ ᐅ cat /tmp/wt
== foo ==
bar

{{referências}}
~/work/wmf/parsoid (master ✘)✭ ᐅ php bin/parse.php --domain pt.wikipedia.org --wrapSections < /tmp/wt
<section data-mw-section-id="0" data-parsoid="{}"></section><section data-mw-section-id="1" data-parsoid="{}"><h2 id="foo" data-parsoid='{"dsr":[0,9,2,2,1,1]}'>foo</h2>
<p data-parsoid='{"dsr":[10,13,0,0]}'>bar</p>

<h2 style="cursor: help;" title="Esta seção foi configurada para não ser editável diretamente. Edite a página toda ou a seção anterior em vez disso." about="#mwt1" typeof="mw:Transclusion" id="Referências" data-parsoid='{"stx":"html","dsr":[15,31,null,null],"pi":[[]]}' data-mw='{"parts":[{"template":{"target":{"wt":"referências","href":"./Predefinição:Referências"},"params":{},"i":0}}]}'><span id="Refer.C3.AAncias" typeof="mw:FallbackId"></span>Referências</h2><span about="#mwt1">

</span><div class="reflist" style=" list-style-type: decimal;" about="#mwt1" data-parsoid='{"stx":"html"}'><div class="mw-references-wrap" typeof="mw:Extension/references" about="#mwt3" data-parsoid='{"src":"&lt;references group=\"\">&lt;/references>"}' data-mw='{"name":"references","attrs":{"group":""},"body":{"html":""}}'><ol class="mw-references references" data-parsoid="{}"></ol></div></div>
</section>

So, it turns out that the ptwiki template referências emits HTML heading tags (via this template -- <tangent> Ugh!! string concatenation to generate an HTML tag name </tangent>). HTML headings aren't currently considered section headings in Parsoid.

However, seeing that this HTML-generated h-tag is part of the TOC, maybe Parsoid over-interpreted the absence of 'edit section' links as a reason not to treat it as a section. So, we may need to fix the section wrapping algorithm so that it does not ignore HTML h-tags.
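To make the difference concrete, here is a toy model in Python of the two behaviours. This is not Parsoid's actual section wrapping code: the `Node` and `Section` types, the `wrap_sections` function and its `accept_html_headings` flag are all hypothetical, the `stx` field only mirrors the `"stx":"html"` marker visible in the data-parsoid output above, and the sequential section ids are purely illustrative (the ids Parsoid assigns may differ).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str               # "heading" or "content"
    text: str
    stx: str = "wikitext"   # "wikitext" for == x ==, "html" for a literal <h2>


@dataclass
class Section:
    section_id: int
    heading: Optional[str]
    body: List[str] = field(default_factory=list)


def wrap_sections(nodes, accept_html_headings):
    """Split a flat node list into sections (toy model, no nesting)."""
    sections = [Section(0, None)]  # lead section before the first heading
    for node in nodes:
        starts_section = node.kind == "heading" and (
            node.stx == "wikitext" or accept_html_headings
        )
        if starts_section:
            sections.append(Section(len(sections), node.text))
        else:
            # An ignored HTML heading is swallowed into the current section.
            sections[-1].body.append(node.text)
    return sections


# Mirrors the /tmp/wt reproduction: a wikitext heading, a paragraph,
# then a template that emits an HTML <h2> followed by the reference list.
NODES = [
    Node("heading", "foo"),
    Node("content", "bar"),
    Node("heading", "Referências", stx="html"),
    Node("content", "reflist"),
]

for flag in (False, True):
    result = wrap_sections(NODES, accept_html_headings=flag)
    print(flag, [(s.section_id, s.heading, s.body) for s in result])
```

With the flag off, Referências and the reference list end up in the body of the foo section, which matches the reproduction. With the flag on, they get a section of their own.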

If you have any recommended changes to the pt template, I can volunteer to modify it, maybe in a test template first.

Change 881732 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Section wrapping should accept HTML h-tags as well

https://gerrit.wikimedia.org/r/881732

Change 881732 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Section wrapping should accept HTML h-tags as well

https://gerrit.wikimedia.org/r/881732

As the original creator of both templates, I just wanted to offer my apologies for the technical difficulties they created here and likely elsewhere. It was the way we found back then to avoid the constant stumbling block for new editors who were trying to edit references in the references section rather than in the paragraphs that actually included the source of those references.

No worries. You helped us uncover a real bug which is now fixed. As for https://pt.wikipedia.org/wiki/Predefini%C3%A7%C3%A3o:Esconder_link_para_editar_se%C3%A7%C3%A3o , one option to fix that would be to use the switch parser function so you aren't concatenating strings to generate an <h*> tag. But that is not causing any bugs now -- it is more a recommendation for cleanup. At some point in the future (2-5 year timeframe), we may stop supporting that feature (with linting and other support to migrate away), so you can get ahead by fixing it now.

Change 885031 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.17.0-a13

https://gerrit.wikimedia.org/r/885031

Change 885031 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.17.0-a13

https://gerrit.wikimedia.org/r/885031

In https://pt.wikipedia.org/api/rest_v1/page/html/Sebastian_Vettel topics Vitórias por equipe and Referências belong to the same section with data-mw-section-id=13 (see the shot)

This is now fixed.