
Wrong segmentation of content inside the reference causes partial publishing
Closed, Resolved · Public

Description

Here is a sample of segmented HTML from cxserver, produced from https://en.wikipedia.org/wiki/Central_Bank_of_the_Republic_of_Turkey#cite_ref-5

example.html
<li about="#cite_note-5" data-seqid="1830" id="cite_note-5">
  <span class="cx-segment" data-segmentid="1831">
      <a class="cx-link" data-linkid="1832" href="#cite_ref-5" rel="mw:referencedBy"><span class="mw-linkback-text"></span></a>  
      <span class="mw-reference-text" id="mw-reference-text-cite_note-5">
              <a class="cx-link" data-linkid="1833" href="http://www.tcmb.gov.tr/yeni/eng/" id="mwaw" rel="mw:ExtLink">Banco central de la Repúbl5ica de Turquía.</a>
      </span>
   </span>
   <span class="mw-reference-text" id="mw-reference-text-cite_note-5">
        <span class="cx-segment" data-segmentid="1834">Museo de billete: 7. </span>
   </span>
   <span class="mw-reference-text" id="mw-reference-text-cite_note-5">
         <span class="cx-segment" data-segmentid="1835">Grupo de emisión - Veinte mil turco Lira - 
               <a class="cx-link" data-linkid="1836" href="http://www.tcmb.gov.tr/yeni/banknote/E7/294.htm" id="mwbA" rel="mw:ExtLink">yo. Serie</a> &amp; II. 
         </span>
   </span>
   <span class="mw-reference-text" id="mw-reference-text-cite_note-5">
         <span class="cx-segment" data-segmentid="1838">
               <a class="cx-link" data-linkid="1837" href="http://www.tcmb.gov.tr/yeni/banknote/E7/296.htm" id="mwbQ" rel="mw:ExtLink">Serie.
               </a> 
         </span>
   </span>
   <span class="mw-reference-text" id="mw-reference-text-cite_note-5">
        <span class="cx-segment" data-segmentid="1839">@– Recuperó el 20 de abril de 2009.</span>
   </span>
</li>

This corresponds to the rendering

pasted_file (33×1 px, 12 KB)

When published, only the first segment is captured in the output

pasted_file (30×312 px, 4 KB)

The multiple spans sharing the same id mw-reference-text-cite_note-5 are the problem here.
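To make the duplicate-id problem easy to check for, here is a minimal sketch in Python that scans an HTML fragment and reports every id value used more than once. It relies only on the standard library's html.parser; the names IdCollector and duplicate_ids are hypothetical helpers written for this illustration and are not part of cxserver or ContentTranslation.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Hypothetical helper: records every id attribute seen in a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html):
    """Return a dict mapping each repeated id to the number of times it occurs."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return {value: n for value, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Reduced version of the segmented reference above (assumed shape only).
    sample = (
        '<li id="cite_note-5">'
        '<span id="mw-reference-text-cite_note-5">first</span>'
        '<span id="mw-reference-text-cite_note-5">second</span>'
        "</li>"
    )
    print(duplicate_ids(sample))
```

Running a check like this on the segmented HTML before publishing would flag the repeated mw-reference-text-cite_note-5 spans. Whether the publishing step actually resolves the reference text by id is an assumption here, but code that looks an element up by id usually gets only the first match, which would be consistent with only the first segment surviving.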

Event Timeline

santhosh triaged this task as Medium priority. (Aug 12 2016, 9:11 AM)
santhosh updated the task description.
santhosh updated the task description.
Arrbee raised the priority of this task from Medium to Needs Triage. (Oct 15 2018, 9:21 AM)
Arrbee moved this task from Bugs to Check & Move on the ContentTranslation board.

Status to be checked after T99934 is verified.

@Etonkovidova can this ticket be verified now? Thanks.

Another issue is present with the translation of https://en.wikipedia.org/wiki/Central_Bank_of_the_Republic_of_Turkey: the first reference (in the infobox) is not counted, so the Reference section shows only 4 references instead of 5. I filed it as a separate issue: T210563: CX2: Reference in infobox not displayed in Reference section.

Etonkovidova claimed this task.

Checked in wmf.6: the output does not have duplicates of mw-reference-text-cite_note-5.

<li about="#cite_note-5" id="cite_note-5">
   <a href="./Central_Bank_of_the_Republic_of_Turkey#cite_ref-5" rel="mw:referencedBy">
   <span class="mw-linkback-text">↑ 
   </span>
   </a> 
   <span class="mw-reference-text" id="mw-reference-text-cite_note-5">
      <a class="external text" href="http://www.tcmb.gov.tr/yeni/eng/" id="mwkQ" rel="mw:ExtLink">Central Bank of the Republic of Turkey</a> <a about="#mwt17" class="external text" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;webarchive&quot;,&quot;href&quot;:&quot;./Template:Webarchive&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;https://www.webcitation.org/5hFIaQq0J?url=http://www.tcmb.gov.tr/yeni/eng/&quot;},&quot;date&quot;:{&quot;wt&quot;:&quot;2009-06-03&quot;}},&quot;i&quot;:0}}]}" href="https://www.webcitation.org/5hFIaQq0J?url=http://www.tcmb.gov.tr/yeni/eng/" id="mwkg" rel="mw:ExtLink" typeof="mw:Transclusion">Archived</a>
      <span about="#mwt17" id="mwkw"> 2009-06-03 at </span>
      <a about="#mwt17" class="cx-link" data-linkid="220" href="./WebCite" id="mwlA" rel="mw:WikiLink" title="WebCite">WebCite</a>
      <link about="#mwt17" href="./Category:Webarchive_template_webcite_links" id="mwlQ" rel="mw:PageProp/Category">
      . Banknote Museum: 7. Emission Group - Twenty Thousand Turkish Lira - <a class="external text" href="http://www.tcmb.gov.tr/yeni/banknote/E7/294.htm" id="mwlg" rel="mw:ExtLink">I. Series</a> <a about="#mwt18" class="external text" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;Webarchive&quot;,&quot;href&quot;:&quot;./Template:Webarchive&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;https://web.archive.org/web/20110616122458/http://www.tcmb.gov.tr/yeni/banknote/E7/294.htm#&quot;},&quot;date&quot;:{&quot;wt&quot;:&quot;2011-06-16&quot;}},&quot;i&quot;:0}}]}" href="https://web.archive.org/web/20110616122458/http://www.tcmb.gov.tr/yeni/banknote/E7/294.htm#" id="mwlw" rel="mw:ExtLink" typeof="mw:Transclusion">Archived</a><span about="#mwt18" id="mwmA"> 2011-06-16 at the </span><a about="#mwt18" class="cx-link" data-linkid="221" href="./Wayback_Machine" id="mwmQ" rel="mw:WikiLink" title="Wayback Machine">Wayback Machine</a><span about="#mwt18" id="mwmg">.</span>
      <link about="#mwt18" href="./Category:Webarchive_template_wayback_links" id="mwmw" rel="mw:PageProp/Category">
      &amp; <a class="external text" href="http://www.tcmb.gov.tr/yeni/banknote/E7/296.htm" id="mwnA" rel="mw:ExtLink">II. Series</a> <a about="#mwt19" class="external text" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;Webarchive&quot;,&quot;href&quot;:&quot;./Template:Webarchive&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;https://web.archive.org/web/20110616122542/http://www.tcmb.gov.tr/yeni/banknote/E7/296.htm#&quot;},&quot;date&quot;:{&quot;wt&quot;:&quot;2011-06-16&quot;}},&quot;i&quot;:0}}]}" href="https://web.archive.org/web/20110616122542/http://www.tcmb.gov.tr/yeni/banknote/E7/296.htm#" id="mwnQ" rel="mw:ExtLink" typeof="mw:Transclusion">Archived</a><span about="#mwt19" id="mwng"> 2011-06-16 at the </span><a about="#mwt19" class="cx-link" data-linkid="222" href="./Wayback_Machine" id="mwnw" rel="mw:WikiLink" title="Wayback Machine">Wayback Machine</a><span about="#mwt19" id="mwoA">.</span>
      <link about="#mwt19" href="./Category:Webarchive_template_wayback_links" id="mwoQ" rel="mw:PageProp/Category">
      . – Retrieved on 20 April 2009.
   </span>
</li>