
Differences in Parsoid HTML compared to PHP parser output break JavaScript that is tailored to the PHP parser output
Closed, Resolved (Public)


If I compare the html from
There are differences.
For example, look for the file Metropole_of_Lyon_map-locator-blank-2015.svg and its surrounding HTML.

The differences in my example make GeoBox_Init() from common.js fail in the version from the API.

Event Timeline

To add a bit of context information: We need that information to get an offline version of the page in which the javascript works too.

ssastry renamed this task from "differences in html served by wikipedia API" to "Parsoid doesn't add JS modules to <head> in its output". Apr 6 2017, 2:11 PM
ssastry triaged this task as Medium priority.
ssastry subscribed.
This comment was removed by ssastry.

About the new title, I would like to point out that my issue is that this span (for example):

<span typeof="mw:Image" data-mw='{"caption":"Voir sur la carte administrative de la&lt;span typeof=\"mw:Entity\" data-parsoid=&#39;{\"src\":\"&amp;amp;nbsp;\",\"srcContent\":\" \",\"dsr\":[5765,5771,null,null]}&#39;> &lt;/span>Métropole de Lyon"}'>

is returned by the api but is not in the source of

ssastry renamed this task from "Parsoid doesn't add JS modules to <head> in its output" to "Differences in Parsoid HTML compared to PHP parser output break JavaScript that is tailored to the PHP parser output". Apr 6 2017, 7:26 PM

Oops .. sorry, I was hasty there and moving too fast. I renamed the title to reflect what I understand now.

But this is a known difference between Parsoid and PHP parser output. We do plan to adapt the PHP parser's image output to be similar to Parsoid's. For now, I don't have an immediate solution for you other than changing the JS code to handle Parsoid image markup properly.

@Skylsmoi Could you please tell us what exactly breaks (or degrades) the JavaScript running on the French article about "Lyon"? I heard from you that the problem was a missing "alt" attribute, but here you seem to be talking about a superfluous "span" node. Could you confirm this here and maybe share the problematic DOM node (both API and Parsoid)?

Here are the two HTML versions of the maps in the "location" part of the top-right .infobox_v2:

<div style="position:relative;;">
    <a href="/wiki/Fichier:Metropole_of_Lyon_map-locator-blank-2015.svg" class="image" title="Voir sur la carte administrative de la&#160;Métropole de Lyon">
        <img alt="Voir sur la carte administrative de la&#160;Métropole de Lyon" src="//" width="280" height="371" srcset="// 1.5x, // 2x" data-file-width="966" data-file-height="1280" />
    <div style="position:absolute;top:47.437844611528%;left:41.486805555555%;width:0px;height:0px;margin:0;padding:0;line-height:0px;background-color:transparent;">
        <div style="position:relative;top:-8px;left:-8px;width:16px;height:16px;background-color:transparent;">
            <a href="/wiki/Fichier:City_locator_14.svg" class="image">
                <img alt="City locator 14.svg" src="//" width="16" height="16" srcset="// 1.5x, // 2x" data-file-width="16" data-file-height="16" />
        <div style="position:relative;top:-16px;">
            <div style="font-size:90%;position:relative;top:-0.65em;left:-12.6em;text-align:right;width:12em;line-height:1.2em;">
                <span class="toponyme">Lyon</span>

In this first version (the regular page output), the <img> tag has an 'alt' attribute, which the JS uses to generate the links to switch maps.

<div style="position:relative;;">
    <span typeof="mw:Image" data-mw='{"caption":"Voir sur la carte administrative de la&lt;span typeof=\"mw:Entity\" data-parsoid=&#39;{\"src\":\"&amp;amp;nbsp;\",\"srcContent\":\" \",\"dsr\":[5765,5771,null,null]}&#39;> &lt;/span>Métropole de Lyon"}'>
        <a href="./Fichier:Metropole_of_Lyon_map-locator-blank-2015.svg">
            <img resource="./Fichier:Metropole_of_Lyon_map-locator-blank-2015.svg" src="//" data-file-width="966" data-file-height="1280" data-file-type="drawing" srcset="// 2x, // 1.5x" height="371" width="280"/>
    <div style="position:absolute;top:47.437844611528%;left:41.486805555555%;width:0px;height:0px;margin:0;padding:0;line-height:0px;background-color:transparent;">
        <div style="position:relative;top:-8px;left:-8px;width:16px;height:16px;background-color:transparent;">
            <span class="mw-default-size" typeof="mw:Image">
                <a href="./Fichier:City_locator_14.svg">
                    <img resource="./Fichier:City_locator_14.svg" src="//" data-file-width="16" data-file-height="16" data-file-type="drawing" srcset="// 2x, // 1.5x" height="16" width="16"/>
        <div style="position:relative;top:-16px;">
            <div style="font-size:90%;position:relative;top:-0.65em;left:-12.6em;text-align:right;width:12em;line-height:1.2em;">
                <span class="toponyme">Lyon</span>

As you can see, in the API version the <a> tags are wrapped in <span> tags that carry data, but the <img> tag does not contain the 'alt' attribute.
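For anyone adapting their JS in the meantime, here is a minimal sketch of the kind of change suggested earlier in this thread: read 'alt' when it is there, otherwise fall back to the caption stored in the wrapping span's data-mw attribute. The name imageLabel is hypothetical (it is not part of common.js, MediaWiki, or Parsoid), and stripping tags with a regular expression is a simplification that happens to be enough for the caption quoted above.

```javascript
// Hypothetical helper, not taken from common.js or MediaWiki.
// Returns a text label for an <img>, whichever parser produced the markup.
function imageLabel(img) {
  // PHP parser output carries the label directly on the <img>.
  const alt = img.getAttribute('alt');
  if (alt) {
    return alt;
  }

  // Parsoid output without alt: look for the wrapping <span> whose typeof
  // marks it as an image wrapper (mw:Image and mw:File both appear in this thread).
  const wrapper = img.closest('[typeof~="mw:Image"], [typeof~="mw:File"]');
  const dataMw = wrapper ? wrapper.getAttribute('data-mw') : null;
  if (!dataMw) {
    return '';
  }

  // data-mw is a JSON object; its "caption" member is an HTML string.
  let caption;
  try {
    caption = JSON.parse(dataMw).caption;
  } catch (e) {
    return '';
  }
  if (typeof caption !== 'string') {
    return '';
  }

  // Simplification: drop tags with a regex and keep the text between them.
  // Character references inside the caption text are not decoded here; in a
  // browser, parsing the caption with DOMParser and reading textContent
  // would be more robust.
  return caption.replace(/<[^>]*>/g, '');
}
```

In a page script this could be called on each <img> inside the map container before building the map-switching links; which container selector to use depends on the template and is not shown here.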

I mentioned this in the other ticket, but check out the Android app for JS that works with Parsoid.

Change 802643 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Use caption as alt on imgs when not present and caption isn't visible

Change 803606 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/core@master] Set alt in galleries, despite caption being visible

Change 803606 merged by jenkins-bot:

[mediawiki/core@master] Set alt in galleries, despite caption being visible

Change 804404 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Add forward compatibility to avoid serializing alt from caption

Change 804404 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Add forward compatibility to avoid serializing alt from caption

Change 805225 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a12

Change 805225 merged by jenkins-bot:

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a12

Change 802643 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Use caption as alt on imgs when not present and caption isn't visible

Change 808051 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a14

Change 808051 merged by jenkins-bot:

[mediawiki/vendor@master] Bump parsoid to 0.16.0-a14

Arlolra claimed this task.

In version 2.5.0 of Parsoid's output, the alt attribute is now on the img, <img alt="Voir sur la carte administrative de la métropole de Lyon" ...


<div class="geobox">
<div><small>Géolocalisation sur la carte<span typeof="mw:DisplaySpace"> </span>: <a rel="mw:WikiLink" href="./Métropole_de_Lyon" title="Métropole de Lyon">métropole de Lyon</a></small></div>
<table class="DebutCarte" style="width:auto; margin:0; border:none; border-spacing:0; padding:0; text-align:center; ">
<tbody><tr><td style="border:none; padding:0"><div style="position:relative;;"><span class="noviewer" typeof="mw:File" data-mw='{"caption":"Voir sur la carte administrative de la&lt;span typeof=\"mw:Entity\" data-parsoid=&apos;{\"src\":\"&amp;amp;#32;\",\"srcContent\":\" \"}&apos;> &lt;/span>métropole de Lyon"}'><a href="./Fichier:Metropole_of_Lyon_map-locator-blank-2015.svg" class="mw-file-description" title="Voir sur la carte administrative de la métropole de Lyon"><img alt="Voir sur la carte administrative de la métropole de Lyon" resource="./Fichier:Metropole_of_Lyon_map-locator-blank-2015.svg" src="//" decoding="async" data-file-width="966" data-file-height="1280" data-file-type="drawing" height="371" width="280" srcset="// 1.5x, // 2x"/></a></span>
<div style="position:absolute;top:calc(47.437844611528% - 8px);left:calc(41.486805555555% - 8px);line-height:0;background-color:transparent;"><span class="noviewer" typeof="mw:File"><a href="./Fichier:City_locator_14.svg" class="mw-file-description"><img resource="./Fichier:City_locator_14.svg" src="//" decoding="async" data-file-width="16" data-file-height="16" data-file-type="drawing" height="16" width="16" srcset="// 1.5x, // 2x"/></a></span></div><div style="position:absolute;font-size:90%;top:47.437844611528%;transform:translateY(-50%);right:calc(100% - 41.486805555555% + 0.6em);text-align:right; "><span class="toponyme">Lyon</span></div>