This is my interpretation of this bad rendering:
http://parsoid.wmflabs.org/frwikisource/Auteur:Abb%C3%A9%20Pierre
versus the good one:
http://fr.wikisource.org/wiki/Auteur:Abb%C3%A9_Pierre
Version: unspecified
Severity: normal
It seems that <time> tags are not recognized as HTML tags and are instead parsed as text (encoded with HTML entities).
Here is the interesting part of the output for the example above (with Parsoid-specific attributes stripped, sorry for that):
Henri Grouès, dit l’abbé Pierre, était un prêtre catholique français, résistant puis député, fondateur du Mouvement Emmaüs</span> (<time class,<span>,=,</span>,"bday"='' datetime,<span>,=,</span>,"1912"=''>1912</time><link href=./Catégorie:Naissance_en_1912><link href=./Catégorie:Auteurs_du_XXe_siècle>– <time class,<span>,=,</span>,"dday"='' datetime,<span>,=,</span>,"2007"=''>2007</time><link href=./Catégorie:Décès_en_2007><link href=./Catégorie:Auteurs_du_XXIe_siècle>)
Can be verified on a simple test case:
[subbu@earth lib] echo "<time>foo</time>" | node parse --fetchConfig false
<body data-parsoid='{"dsr":[0,17,0,0]}'><p data-parsoid='{"dsr":[0,16,0,0]}'><time>foo</time></p>
</body>
The sanitizer in Parsoid is the culprit -- it uses a whitelist of HTML tags to accept in wikitext, and <time> is not one of them. Either our port of the PHP sanitizer has a bug, or the port needs to be updated. To be investigated.
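To illustrate the failure mode described above, here is a minimal sketch of a whitelist-based tag sanitizer. It is not Parsoid's actual sanitizer code: the names ALLOWED_TAGS, escapeText and sanitizeTags and the short tag list are hypothetical, chosen only to show how a tag missing from the whitelist ends up entity-escaped and rendered as text.

```javascript
// Hypothetical sketch, NOT Parsoid's real sanitizer.
// ALLOWED_TAGS is an illustrative subset, not the real whitelist.
const ALLOWED_TAGS = new Set(['b', 'i', 'span', 'p']);

// Encode the characters that would otherwise be read as markup.
function escapeText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Pass whitelisted tags through unchanged; escape every other tag
// so that it shows up as literal text in the output.
function sanitizeTags(html, allowed) {
  return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g, (tag, name) =>
    allowed.has(name.toLowerCase()) ? tag : escapeText(tag));
}

// Without <time> in the whitelist, the tag is escaped.
const before = sanitizeTags('<time>foo</time>', ALLOWED_TAGS);

// Adding time/data/mark to the whitelist lets the tag through.
const extended = new Set([...ALLOWED_TAGS, 'time', 'data', 'mark']);
const after = sanitizeTags('<time>foo</time>', extended);

console.log(before); // &lt;time&gt;foo&lt;/time&gt;
console.log(after);  // <time>foo</time>
```

Under these assumptions, the fix amounts to extending the whitelist; the real sanitizer also has to handle attributes, which this sketch ignores.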
Change 99011 had a related patch set uploaded by GWicke:
Bug 54438: First part of core change 97caae596: support time/data/mark elements
Change 99011 merged by jenkins-bot:
Bug 54438: First part of core change 97caae596: support time/data/mark elements
It seems to me (though I am not sure this is related to this bug fix) that Parsoid generates an additional, unnecessary " " character after the closing time tag.
Examples:
(In reply to comment #7)
It seems to me (though I am not sure this is related to this bug fix) that Parsoid generates an additional, unnecessary " " character after the closing time tag.
Examples:
This seems to work fine when testing with master:
echo '<time>1900</time>foo' | node parse
<body data-parsoid='{"dsr":[0,21,0,0]}'><p data-parsoid='{"dsr":[0,20,0,0]}'><time data-parsoid='{"stx":"html","dsr":[0,17,6,7]}'>1900</time>foo</p>
</body>
Can you try to find a minimal test case at http://parsoid.wmflabs.org/_wikitext/ ?
This patch was also deployed on Wednesday (see https://www.mediawiki.org/wiki/Parsoid/Deployments#Wednesday.2C_December_4.2C_13:00-14:00_PST_Y_Deployed_0ac82a28), so these tags are now supported in production.
The smallest example I was able to find that shows a difference is:
http://parsoid-lb.eqiad.wikimedia.org/frwikisource/Utilisateur%3AKelson%2Ftest
Your test probably proves that this "problem" has nothing to do with the original bug. Should I open a new ticket?
(In reply to comment #9)
The smallest example I was able to find that shows a difference is:
http://parsoid-lb.eqiad.wikimedia.org/frwikisource/Utilisateur%3AKelson%2Ftest
Your test probably proves that this "problem" has nothing to do with the original bug. Should I open a new ticket?
Yes, that would be great. This looks more like a template whitespace folding issue.
Here it is:
https://bugzilla.wikimedia.org/show_bug.cgi?id=58289
Certainly not the most exciting bug to investigate...
I am closing the ticket, as this bug seems to be fixed now. Thank you very much.
Reopened this bug, as there are still HTML5-by-default changes from 97caae596 to port.
Change 101277 had a related patch set uploaded by GWicke:
Merge "Bug 54438: First part of core change 97caae596: support time/data/mark elements"
Change 101329 had a related patch set uploaded by GWicke:
Bug 54438: First part of core change 97caae596: support time/data/mark elements
Change 101277 merged by GWicke:
Merge "Bug 54438: First part of core change 97caae596: support time/data/mark elements"
Change 101329 merged by GWicke:
Bug 54438: First part of core change 97caae596: support time/data/mark elements