CX adds unnecessary nowiki around ISBN
Closed, Resolved · Public

Description

Check this diff:

https://en.wikipedia.org/w/index.php?title=Luciano_Castillo_Colonna&diff=678522659&oldid=601138310

where CX added:

<nowiki> Basadre Grohmann, Jorge: History of the Republic of the Peru (1822 - 1933), Volumes 14 and 15. Edited by the Company Editor The Trade S. To. Lima, 2005. ISBN 9972-205-76-2 (V.14) </nowiki>[[Special:BookSources/9972205762|ISBN 9972-205-77-0]] (V.15)

See also these examples from ptwiki:

Magioladitis updated the task description. (Show Details)
Magioladitis raised the priority of this task from to Needs Triage.
Magioladitis added subscribers: Magioladitis, Bgwhite, NicoV.
Restricted Application added a subscriber: Aklapper. Aug 30 2015, 8:12 AM
NicoV set Security to None.
NicoV added a comment.Sep 2 2015, 7:46 AM

And the ISBN is not even consistent: it links to ISBN 9972205762 but displays ISBN 9972205770...
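The mismatch described above can be detected mechanically. The following is only a minimal sketch, assuming the link always has the `[[Special:BookSources/…|ISBN …]]` shape seen in the description; the function names and the regular expression are hypothetical and are not taken from CX or Parsoid.

```python
import re

# Matches wikitext links of the form [[Special:BookSources/<target>|ISBN <label>]],
# optionally with an interwiki prefix such as ":en:".
BOOKSOURCES_LINK = re.compile(
    r"\[\[(?::?[a-z-]+:)?Special:BookSources/([0-9Xx-]+)\|ISBN[ \u00a0]([0-9Xx -]+)\]\]"
)


def normalize_isbn(raw: str) -> str:
    """Drop hyphens and whitespace and uppercase the check character."""
    return re.sub(r"[\s-]", "", raw).upper()


def find_mismatched_isbn_links(wikitext: str) -> list[tuple[str, str]]:
    """Return (target, label) pairs whose ISBNs differ after normalization."""
    mismatches = []
    for target, label in BOOKSOURCES_LINK.findall(wikitext):
        if normalize_isbn(target) != normalize_isbn(label):
            mismatches.append((target, label))
    return mismatches
```

Run on the link from the task description, this would report the pair `9972205762` / `9972-205-77-0`, because the target and the displayed label normalize to different numbers.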

Amire80 renamed this task from CT adds nowiki tags and ISBN in wrong way to CX adds nowiki tags and ISBN in wrong way.Sep 2 2015, 8:11 AM
Amire80 triaged this task as Normal priority.Sep 4 2015, 7:41 AM
Amire80 moved this task from Backlog to CX6 on the ContentTranslation board.Sep 4 2015, 9:38 AM
Amire80 moved this task from CX6 to CX7 on the ContentTranslation board.Oct 1 2015, 6:06 PM

I can still reproduce it: https://en.wikipedia.org/wiki/User:Amire80/Luciano_Castillo_Colonna

The nowiki issue is possibly related to the use of machine translation and annotation mapping. When I translate the same article from Spanish to Albanian, <nowiki> is not added, and the most likely difference is that the Spanish-to-English pair uses machine translation.

The ISBN issue is either a problem in CX's link adaptation (it doesn't adapt ISBN links correctly) or a problem in Parsoid.

ssastry added a subscriber: ssastry.Nov 2 2015, 3:28 PM

Worth verifying that CX is generating the right HTML for ISBN links. If it generates plain text for ISBN numbers, Parsoid will correctly add nowikis around that text. Similar to T113565.
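To illustrate why plain-text ISBNs end up escaped: wikitext auto-links text such as `ISBN 9972-205-76-2`, so a serializer that receives that string as plain text has to wrap it to keep it plain. The sketch below is an assumption-laden illustration, not Parsoid's code: the pattern is a simplified approximation of the ISBN magic-link syntax, and `escape_plain_isbn` is a hypothetical helper that wraps only the ISBN itself, whereas the real output in this task wrapped a whole run of text.

```python
import re

# Simplified approximation of the ISBN "magic link" pattern: the word ISBN,
# whitespace, then 10 or 13 digits with optional hyphen or space separators.
MAGIC_ISBN = re.compile(
    r"\bISBN[ \u00a0]+((?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx])\b"
)


def escape_plain_isbn(text: str) -> str:
    """Wrap each plain-text ISBN in <nowiki> so it would stay plain text."""
    return MAGIC_ISBN.sub(lambda m: f"<nowiki>{m.group(0)}</nowiki>", text)
```

If CX emitted a proper ISBN link in its HTML instead of plain text, no such escaping would be needed, which is the point of the check suggested above.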

NicoV added a comment.Nov 7 2015, 12:37 PM

> Worth verifying that CX is generating the right HTML for ISBN links. If it generates plain text for ISBN numbers, Parsoid will correctly add nowikis around that text. Similar to T113565.

In general, CX produces complete garbage with ISBNs. See for example Eleanor Dark on frwiki: the ISBN produced is a link to the ISBN special page on enwiki... [[:en:Special:BookSources/0732909031|ISBN 0-7329-0903-1]]

You also have constructs like [[International Standard Book Number|ISBN]]&nbsp;978-0-8018-8221-0 (Rat pygmée de rizière à longue queue)
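As a rough sketch of what cleaning up the two constructs above could look like, the snippet below rewrites both into the plain `ISBN <number>` form. It is not what CX does and not what any later patch implements; the patterns and the `to_plain_isbn` helper are hypothetical and cover only the exact shapes quoted in this comment.

```python
import re

# [[:en:Special:BookSources/<digits>|ISBN <label>]]  ->  ISBN <label>
INTERWIKI_BOOKSOURCES = re.compile(
    r"\[\[:?[a-z-]+:Special:BookSources/[0-9Xx-]+\|(ISBN[ \u00a0][0-9Xx -]+)\]\]"
)

# [[International Standard Book Number|ISBN]]&nbsp;<number>  ->  ISBN <number>
LINKED_ISBN_WORD = re.compile(
    r"\[\[International Standard Book Number\|ISBN\]\](?:&nbsp;|[ \u00a0])([0-9Xx-]+)"
)


def to_plain_isbn(wikitext: str) -> str:
    """Rewrite the two broken ISBN constructs into plain 'ISBN <number>' text."""
    text = INTERWIKI_BOOKSOURCES.sub(lambda m: m.group(1), wikitext)
    return LINKED_ISBN_WORD.sub(lambda m: f"ISBN {m.group(1)}", text)
```

The plain form is chosen here only because the target wiki would then decide how to render the ISBN; whether that is the right target form for every wiki is an open assumption.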

The only time ISBN seems correct is when they are handled by a template, so CX can't do anything stupid with them.

This is one of the many kinds of damage CX does to real articles. Is there any plan to fix these bugs in a timely manner (most of them were reported months ago), or will it keep producing damage that volunteers have to fix for a long time to come?

You also have a different kind of ISBN garbage, as in this edit (and the ISBN is only one of the many problems in this edit):
<cite class="citation book" contenteditable="false">Burton, Brian K. (2007). ''The Peninsula & Seven Days: A Battlefield Guide''. Lincoln: University of Nebraska Press. </cite><cite class="citation book" contenteditable="false">[[International Standard Book Number|ISBN]]&nbsp;978-0-8032-6246-1.</cite><span class="Z3988" title="ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fen.wikipedia.org%3ABattle+of+Hanover+Court+House&rft.aufirst=Brian+K.&rft.aulast=Burton&rft.btitle=The+Peninsula+%26+Seven+Days%3A+A+Battlefield+Guide&rft.date=2007&rft.genre=book&rft.isbn=978-0-8032-6246-1&rft.place=Lincoln&rft.pub=University+of+Nebraska+Press&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook" contenteditable="false">&nbsp;</span>

This task was reported almost 3 months ago, and still nothing has been done... Meanwhile the damage continues.

ssastry moved this task from Backlog to Non-Parsoid Tasks on the Parsoid board.Dec 17 2015, 7:24 PM
Elitre added a subscriber: Elitre.Apr 6 2016, 6:54 PM
He7d3r updated the task description. (Show Details)May 1 2016, 6:11 PM
Amire80 renamed this task from CX adds nowiki tags and ISBN in wrong way to CX adds unnecessary nowiki around ISBN.May 25 2016, 7:26 AM

Change 294466 had a related patch set uploaded (by Santhosh):
Support ISBN link adaptation

https://gerrit.wikimedia.org/r/294466

Change 294466 merged by jenkins-bot:
Support ISBN link adaptation

https://gerrit.wikimedia.org/r/294466

Amire80 closed this task as Resolved.Aug 9 2016, 8:16 PM
Amire80 moved this task from QA to Done on the Language-Q1-2016-17 Sprint 2 board.