Handle dewiki hatnotes
Closed, Resolved (Public)

Description

Hatnotes on dewiki are not wrapped in elements with the "hatnote" class. Instead, they come from a variety of templates that render the notice as an HTML table, and each table appears to carry an element id derived from the template that produced it rather than a common class.

These templates also create leading non-content paragraph elements that interfere with lead intro extraction.

The dewiki hatnotes should be picked up in extractHatnotes, and they should also be accounted for in lead intro and summary generation.
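To illustrate the lead intro side of the problem, here is a minimal Python sketch of skipping leading paragraphs that carry no visible text. It is not the mobileapps implementation. The names ParagraphCollector and first_content_paragraph are hypothetical, and "non-content" is assumed here to mean "whitespace only", which may be narrower or broader than what the service actually checks.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the visible text of each <p> element, in document order."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []    # text of every completed paragraph
        self._in_paragraph = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            self._in_paragraph = False
            self.paragraphs.append("".join(self._chunks))

    def handle_data(self, data):
        # Text inside tables or other non-paragraph elements is ignored.
        if self._in_paragraph:
            self._chunks.append(data)


def first_content_paragraph(html):
    """Return the text of the first paragraph with visible text, or None.

    Assumption: a paragraph counts as non-content when it contains only
    whitespace. This is an illustrative rule, not the service's rule.
    """
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    for text in collector.paragraphs:
        stripped = text.strip()
        if stripped:
            return stripped
    return None
```

Given a document that starts with an empty transcluded paragraph followed by a hatnote table, the function returns the text of the first real paragraph after them instead of the empty one.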

Examples:

https://de.wikipedia.org/api/rest_v1/page/html/Berliner_Mauer (Vorlage:Begriffsklärungshinweis)

<table id="Vorlage_Begriffsklärungshinweis" cellpadding="0" cellspacing="8" class="hintergrundfarbe1 rahmenfarbe1 noprint navigation-not-searchable" style="border-bottom-style: solid; clear: right; font-size: 95%; margin-bottom: 1em; width: 100%; " role="navigation" about="#mwt1">
 <tbody>
  <tr>
   <td style="width: 26px; vertical-align: middle" id="bksicon">
    <figure-inline typeof="mw:Image">
     <span>
      <img alt="" resource="./Datei:Disambig-dark.svg" src="//upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Disambig-dark.svg/25px-Disambig-dark.svg.png" data-file-width="444" data-file-height="340" data-file-type="drawing" height="19" width="25" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Disambig-dark.svg/50px-Disambig-dark.svg.png 2x, //upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Disambig-dark.svg/38px-Disambig-dark.svg.png 1.5x"/>
     </span>
    </figure-inline>
   </td>
   <td> Der Titel dieses Artikels ist mehrdeutig. Weitere Bedeutungen sind unter <a rel="mw:WikiLink" href="./Berliner_Mauer_(Begriffsklärung)" title="Berliner Mauer (Begriffsklärung)">Berliner Mauer (Begriffsklärung)</a> aufgeführt.
   </td>
  </tr>
 </tbody>
</table>

https://de.wikipedia.org/api/rest_v1/page/html/Wikipedia (Vorlage:Dieser Artikel)

<table id="Vorlage_Dieser_Artikel" cellpadding="0" cellspacing="8" class="hintergrundfarbe1 rahmenfarbe1 noprint navigation-not-searchable" style="border-bottom-style: solid; clear: right; font-size: 95%; margin-bottom: 1em; width: 100%; " role="navigation" about="#mwt1">
 <tbody>
  <tr>
   <td style="width: 26px; vertical-align: middle" id="bksicon">
    <figure-inline typeof="mw:Image">
     <span>
      <img alt="" resource="./Datei:Disambig-dark.svg" src="//upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Disambig-dark.svg/25px-Disambig-dark.svg.png" data-file-width="444" data-file-height="340" data-file-type="drawing" height="19" width="25" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Disambig-dark.svg/50px-Disambig-dark.svg.png 2x, //upload.wikimedia.org/wikipedia/commons/thumb/e/ea/Disambig-dark.svg/38px-Disambig-dark.svg.png 1.5x"/>
     </span>
    </figure-inline>
   </td>
   <td style="vertical-align: middle"> Dieser Artikel behandelt die freie Online-Enzyklopädie Wikipedia. Für die deutschsprachigen Ausgabe siehe <a rel="mw:WikiLink" href="./Deutschsprachige_Wikipedia" title="Deutschsprachige Wikipedia">Deutschsprachige Wikipedia</a>. Für den gleichnamigen Asteroiden siehe <a rel="mw:WikiLink" href="./(274301)_Wikipedia" title="(274301) Wikipedia">(274301) Wikipedia</a>.
   </td>
  </tr>
 </tbody>
</table>
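Both examples above share a few traits: a table whose id starts with "Vorlage_" and a role="navigation" attribute. The following Python sketch uses those two traits to pull out hatnote text. This is an assumption drawn only from the two samples, not the selector used in the merged patch, and other dewiki hatnote templates may not match it. The names DewikiHatnoteParser and extract_dewiki_hatnotes are hypothetical.

```python
import re
from html.parser import HTMLParser


class DewikiHatnoteParser(HTMLParser):
    """Collect text from tables that look like dewiki hatnote templates.

    Assumed heuristic (from the two samples only): the table id starts
    with "Vorlage_" and the table has role="navigation".
    """

    def __init__(self):
        super().__init__()
        self.hatnotes = []
        self._table_depth = 0   # > 0 while inside a matching table
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "table":
            return
        if self._table_depth > 0:
            # Nested table inside a hatnote table: track depth only.
            self._table_depth += 1
            return
        attr_map = dict(attrs)
        table_id = attr_map.get("id") or ""
        if table_id.startswith("Vorlage_") and attr_map.get("role") == "navigation":
            self._table_depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "table" and self._table_depth > 0:
            self._table_depth -= 1
            if self._table_depth == 0:
                # Collapse the whitespace left over from the table layout.
                text = re.sub(r"\s+", " ", "".join(self._chunks)).strip()
                if text:
                    self.hatnotes.append(text)

    def handle_data(self, data):
        if self._table_depth > 0:
            self._chunks.append(data)


def extract_dewiki_hatnotes(html):
    """Return the plain text of each table matching the assumed heuristic."""
    parser = DewikiHatnoteParser()
    parser.feed(html)
    parser.close()
    return parser.hatnotes
```

Matching on an id prefix rather than a fixed list of template ids keeps the sketch short, but it would also match any unrelated navigation table that happens to use a "Vorlage_" id, so a real implementation would likely want an explicit list.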

Event Timeline

Change 409197 had a related patch set uploaded (by Mholloway; owner: Mholloway):
[mediawiki/services/mobileapps@master] Skip transcluded leading non-content paragraphs when extracting lead intro

https://gerrit.wikimedia.org/r/409197

Change 409197 merged by jenkins-bot:
[mediawiki/services/mobileapps@master] Skip transcluded leading non-content paragraphs when extracting lead intro

https://gerrit.wikimedia.org/r/409197

Change 409419 had a related patch set uploaded (by Mholloway; owner: Mholloway):
[mediawiki/services/mobileapps@master] Handle hatnotes on German Wikipedia in extractHatnotes

https://gerrit.wikimedia.org/r/409419

Change 409419 merged by jenkins-bot:
[mediawiki/services/mobileapps@master] Handle hatnotes on German Wikipedia in extractHatnotes

https://gerrit.wikimedia.org/r/409419