
"Error while publishing - parsoidserver" when diacritic in target and references are added
Closed, Resolved
Public

Description

Example URL: http://en.wikipedia.beta.wmflabs.org/wiki/Special:ContentTranslation?page=Santos+Col%C3%B3n&from=es&to=pt&debug=true

Note that this mostly happens when References are added; otherwise the page is published. This seems related to the Parsoid service.


Version: master
Severity: major

Details

Reference
bz73119
Related Gerrit Patches:
mediawiki/extensions/ContentTranslation : master
Keep data-mw attributes for references to avoid parsoid error

Event Timeline

bzimport raised the priority of this task from to High. Nov 22 2014, 3:56 AM
bzimport added a project: ContentTranslation.
bzimport set Reference to bz73119.
bzimport added a subscriber: Unknown Object (MLST).
KartikMistry rescinded a token.

Could you dump the HTML you are sending to Parsoid?

See T75121 for an isolated test case.

Niklas: on T75121, I added a comment indicating that adding the about and data-mw attributes should make this work. The DOM spec has been updated as well to make this clear.

Can someone confirm whether this is still a problem once you fix the HTML that you send to Parsoid?

Arrbee assigned this task to santhosh. Dec 3 2014, 7:28 AM
Arrbee added a project: LE-Sprint-79.

Subbu, Santhosh is taking a look at this later today. He can confirm in some time. Thanks.

Change 177189 had a related patch set uploaded (by Santhosh):
Keep data-mw attributes for references to avoid parsoid error

https://gerrit.wikimedia.org/r/177189

Patch-For-Review

@ssastry, I added data-mw to the references and it fixed the error. I am able to publish Santos_Colón from es after translation.

The data-mw and data-parsoid attributes were removed before sending the HTML to machine translation engines, to give them minimal HTML to work with. After machine translation, we are now restoring data-mw for references.
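For readers following along, the strip-then-restore step described above could be sketched roughly as follows. This is not the code from the Gerrit change: it is a minimal Python illustration using only the standard library, and the function names `strip_for_mt` and `restore_reference_data` are hypothetical. It also assumes that reference nodes can be recognised by `typeof="mw:Extension/ref"`, that each one carries a stable `id`, and that the fragment is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Attributes removed before the markup goes to a machine translation engine.
STRIPPED_ATTRS = ("data-mw", "data-parsoid")
# Assumption: reference nodes are marked with this typeof value.
REF_TYPEOF = "mw:Extension/ref"


def strip_for_mt(root):
    """Strip Parsoid attributes in place.

    Returns the data-mw values of reference nodes, keyed by element id,
    so they can be put back after translation.
    """
    saved = {}
    for el in root.iter():
        ref_id = el.get("id")
        if el.get("typeof") == REF_TYPEOF and ref_id and "data-mw" in el.attrib:
            saved[ref_id] = el.attrib["data-mw"]
        for name in STRIPPED_ATTRS:
            el.attrib.pop(name, None)
    return saved


def restore_reference_data(root, saved):
    """Re-attach the saved data-mw values to the elements with matching ids."""
    for el in root.iter():
        ref_id = el.get("id")
        if ref_id in saved:
            el.set("data-mw", saved[ref_id])


# Hypothetical input fragment, simplified from what Parsoid would emit.
source = ET.fromstring(
    '<p data-parsoid="{}">Santos Colón'
    '<span id="cite_ref-1" about="#mwt2" typeof="mw:Extension/ref" '
    'data-mw="{&quot;name&quot;:&quot;ref&quot;}" data-parsoid="{}">[1]</span></p>'
)
saved = strip_for_mt(source)
# ... machine translation of the stripped markup would happen here ...
restore_reference_data(source, saved)
```

Keying the saved values by `id` assumes the translation engine leaves `id` attributes untouched; if it does not, some other stable marker would be needed.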

santhosh moved this task from Backlog to In Review on the LE-Sprint-79 board. Dec 3 2014, 9:48 AM

Great! Glad that fixes it.

ssastry moved this task from Backlog to In Progress on the Parsoid board. Dec 4 2014, 8:22 AM

Change 177189 merged by jenkins-bot:
Keep data-mw attributes for references to avoid parsoid error

https://gerrit.wikimedia.org/r/177189

santhosh closed this task as Resolved. Dec 4 2014, 9:16 AM
santhosh moved this task from In Review to Done on the LE-Sprint-79 board.