CX adds unnecessary nowiki around ISBN
Closed, Resolved, Public

Description

Check

https://en.wikipedia.org/w/index.php?title=Luciano_Castillo_Colonna&diff=678522659&oldid=601138310

where it added

<nowiki> Basadre Grohmann, Jorge: History of the Republic of the Peru (1822 - 1933), Volumes 14 and 15. Edited by the Company Editor The Trade S. To. Lima, 2005. ISBN 9972-205-76-2 (V.14) </nowiki>[[Special:BookSources/9972205762|ISBN 9972-205-77-0]] (V.15)

See also these examples from ptwiki:

Event Timeline

Magioladitis raised the priority of this task from to Needs Triage.
Magioladitis updated the task description. (Show Details)
Magioladitis added subscribers: Magioladitis, Bgwhite, NicoV.

And the ISBN is not even consistent: the link points to ISBN 9972205762 but the displayed text is ISBN 9972205770.
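A mismatch like this can be detected mechanically. The following is a minimal sketch, not anything CX or Parsoid actually runs: the function names and the regular expression are assumptions made for illustration, and the pattern only covers piped `Special:BookSources` links of the shape quoted in this task.

```python
import re

# Matches a piped BookSources link such as
# [[Special:BookSources/9972205762|ISBN 9972-205-77-0]]
# An optional interwiki prefix like ":en:" is tolerated.
# (Hypothetical pattern written for this example.)
BOOKSOURCES_LINK = re.compile(
    r"\[\[(?::?[a-z-]+:)?Special:BookSources/([0-9Xx-]+)"
    r"\|ISBN[ \u00a0]([0-9Xx -]+)\]\]"
)


def normalize_isbn(raw: str) -> str:
    """Drop hyphens and spaces so two spellings of an ISBN compare equal."""
    return re.sub(r"[^0-9Xx]", "", raw).upper()


def mismatched_isbn_links(wikitext: str) -> list[tuple[str, str]]:
    """Return (link target, displayed ISBN) pairs that disagree."""
    mismatches = []
    for match in BOOKSOURCES_LINK.finditer(wikitext):
        target = normalize_isbn(match.group(1))
        label = normalize_isbn(match.group(2))
        if target != label:
            mismatches.append((target, label))
    return mismatches
```

Run against the link quoted in the description, this reports the target `9972205762` paired with the label `9972205770`.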

Amire80 renamed this task from CT adds nowiki tags and ISBN in wrong way to CX adds nowiki tags and ISBN in wrong way. Sep 2 2015, 8:11 AM
Amire80 triaged this task as Medium priority. Sep 4 2015, 7:41 AM

I can still reproduce it: https://en.wikipedia.org/wiki/User:Amire80/Luciano_Castillo_Colonna

The nowiki issue is possibly related to the use of machine translation and annotation mapping. When I translate the same article from Spanish to Albanian, <nowiki> is not added, and the most likely difference is that between Spanish and English there is machine translation.

The ISBN issue is either an issue with CX's link adaptation (that it doesn't adapt ISBN links correctly) or with Parsoid.

Worth verifying that CX is generating the right HTML for ISBN links. If it generates plain text for ISBN numbers, Parsoid will correctly add nowikis around that text. Similar to T113565.
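To illustrate why plain-text ISBNs end up wrapped: MediaWiki turns text such as `ISBN 9972-205-76-2` into a link automatically, so a serializer that has to keep the text as plain text must escape it. The sketch below is only an approximation under stated assumptions. The regular expression is a rough rendering of the magic-link pattern, and `serialize_plain_text` is a hypothetical helper, not Parsoid's actual escaping logic.

```python
import re

# Rough approximation of the ISBN magic-link pattern: "ISBN", whitespace,
# an optional 978/979 prefix, then ten digits (the last may be X) with
# optional spaces or hyphens in between. This is an assumption for the
# example, not a copy of the parser's source.
ISBN_MAGIC = re.compile(
    r"\bISBN\s+((?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx])\b"
)


def would_autolink(plain_text: str) -> bool:
    """True if the text contains something that would become an ISBN link."""
    return ISBN_MAGIC.search(plain_text) is not None


def serialize_plain_text(plain_text: str) -> str:
    """Hypothetical serializer step: escape text that must stay unlinked."""
    if would_autolink(plain_text):
        return f"<nowiki>{plain_text}</nowiki>"
    return plain_text
```

If CX emitted an actual link node for the ISBN instead of bare text, a step like this would have nothing to escape, which is why checking the HTML that CX generates comes first.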

In general, CX produces complete garbage with ISBNs. See, for example, Eleanor Dark on frwiki: the ISBN produced is a link to the BookSources special page on enwiki rather than on frwiki: [[:en:Special:BookSources/0732909031|ISBN 0-7329-0903-1]]
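One possible adaptation for this case is to drop the interwiki prefix, so that the link resolves to the target wiki's own `Special:BookSources` page. The snippet below is a minimal sketch of that idea only: the pattern and the function name are hypothetical, and it makes no claim about how CX's link adaptation is or will be implemented.

```python
import re

# Piped BookSources link carrying an interwiki prefix, e.g.
# [[:en:Special:BookSources/0732909031|ISBN 0-7329-0903-1]]
# (Hypothetical pattern written for this example.)
INTERWIKI_BOOKSOURCES = re.compile(
    r"\[\[:[a-z-]+:(Special:BookSources/[0-9Xx-]+)\|([^\]|]+)\]\]"
)


def localize_booksources_links(wikitext: str) -> str:
    """Rewrite interwiki BookSources links to point at the local wiki."""
    return INTERWIKI_BOOKSOURCES.sub(r"[[\1|\2]]", wikitext)
```

This relies on the canonical `Special:` namespace name being accepted on every wiki, so the rewritten link needs no localized namespace.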

You also have constructs like [[International Standard Book Number|ISBN]]&nbsp;978-0-8018-8221-0 (Rat pygmée de rizière à longue queue)

The only time ISBNs seem to come out correct is when they are handled by a template, so CX can't do anything stupid with them.

This is one of the many kinds of damage CX causes in real articles. Is there any plan to fix these bugs in a timely manner (most of them were reported months ago), or is it going to keep producing damage that volunteers have to fix for a long time?

There is also a different kind of ISBN garbage, as in this edit (and the ISBN is only one of its many problems):
<cite class="citation book" contenteditable="false">Burton, Brian K. (2007). ''The Peninsula & Seven Days: A Battlefield Guide''. Lincoln: University of Nebraska Press. </cite><cite class="citation book" contenteditable="false">[[International Standard Book Number|ISBN]]&nbsp;978-0-8032-6246-1.</cite><span class="Z3988" title="ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fen.wikipedia.org%3ABattle+of+Hanover+Court+House&rft.aufirst=Brian+K.&rft.aulast=Burton&rft.btitle=The+Peninsula+%26+Seven+Days%3A+A+Battlefield+Guide&rft.date=2007&rft.genre=book&rft.isbn=978-0-8032-6246-1&rft.place=Lincoln&rft.pub=University+of+Nebraska+Press&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook" contenteditable="false">&nbsp;</span>

This task was reported almost 3 months ago, and still nothing has been done. Meanwhile the damage keeps piling up.

Amire80 renamed this task from CX adds nowiki tags and ISBN in wrong way to CX adds unnecessary nowiki around ISBN. May 25 2016, 7:26 AM

Change 294466 had a related patch set uploaded (by Santhosh):
Support ISBN link adaptation

https://gerrit.wikimedia.org/r/294466

Change 294466 merged by jenkins-bot:
Support ISBN link adaptation

https://gerrit.wikimedia.org/r/294466

Amire80 moved this task from QA to Done on the Language-Q1-2016-17 Sprint 2 board.