
Parsoid does not apply "format" from TemplateData if the template is transcluded via a redirect
Open, Normal, Public

Description

Identified by @ssastry while debugging T213922:

So I patched html2wt as follows (since this is triggered only on edited HTML and I wanted to test it with regular HTML):

diff --git a/lib/html2wt/WikitextSerializer.js b/lib/html2wt/WikitextSerializer.js
index d052a47e0..9470070fa 100644
--- a/lib/html2wt/WikitextSerializer.js
+++ b/lib/html2wt/WikitextSerializer.js
@@ -782,11 +782,12 @@ WikitextSerializer.prototype.serializeFromParts = Promise.async(function *(state
                var fetched = false;
                try {
                        var apiResp = null;
-                       if (isTpl && useTplData) {
+                       if (isTpl && true) {
                                var href = tplHref.replace(/^\.\//, '');
                                apiResp = yield TemplateDataRequest.promise(env, href, Util.makeHash(["templatedata", href]));
                        }
                        tplData = apiResp && apiResp[Object.keys(apiResp)[0]];
+                       console.log("TPLDATA: " + JSON.stringify(tplData));

Now, when I run it with the HTML of fawiki's infobox, I see this output:

[subbu@earth:~/work/wmf/parsoid] parse.js --html2wt --prefix fawiki  --trace apirequest< /tmp/fa.html 
....
[ApiRequest]   | #3 Starting HTTP request:  {"method":"GET","qs":{"format":"json","action":"templatedata","includeMissingTitles":"1","titles":"الگو:Infobox_Officeholder"},"followRedirect":true,"uri":"https://fa.wikipedia.org/w/api.php","timeout":30000,"headers":{"X-Request-ID":null,"User-Agent":"Parsoid/0.10.0+git","Connection":"close"},"strictSSL":true}
....
TPLDATA: {"title":"الگو:Infobox Officeholder","notemplatedata":"","params":[]}

@ssastry Note that the Parsoid query is for "Infobox Officeholder" (with uppercase "O"), which is a redirect to "Infobox officeholder" (lowercase "o"). Only the latter page has templatedata. Parsoid should probably ask the MediaWiki API to follow redirects (&redirects=1). But that seems like a separate issue.

The API trace I pasted above shows that we already have "followRedirect":true. Is that the wrong param?

I'm not familiar with that library, but it looks like this is the library's parameter for following HTTP redirects, rather than a query parameter for following MediaWiki redirects.
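
To make the distinction above concrete, here is a minimal sketch of how the query string from the trace would change if the request also asked MediaWiki to resolve wiki redirects. The helper names `buildTemplateDataQuery` and `toApiUrl` are hypothetical and are not part of Parsoid; the sketch also assumes that `action=templatedata` accepts the standard `redirects` parameter suggested earlier in this thread. The HTTP-level "followRedirect" option is a separate setting on the HTTP client and is not touched here.

```javascript
// Hypothetical helper, not Parsoid's actual code: builds the query-string
// object for a templatedata request like the one shown in the API trace.
function buildTemplateDataQuery(title, resolveWikiRedirects) {
  const qs = {
    format: 'json',
    action: 'templatedata',
    includeMissingTitles: '1',
    titles: title,
  };
  if (resolveWikiRedirects) {
    // Assumption: asks the MediaWiki API to resolve page redirects
    // (e.g. "Infobox Officeholder" -> "Infobox officeholder").
    // This is unrelated to following HTTP 3xx redirects.
    qs.redirects = '1';
  }
  return qs;
}

// Hypothetical helper: serializes the query object onto an api.php URL.
function toApiUrl(apiBase, qs) {
  return apiBase + '?' + new URLSearchParams(qs).toString();
}

const url = toApiUrl(
  'https://fa.wikipedia.org/w/api.php',
  buildTemplateDataQuery('الگو:Infobox_Officeholder', true)
);
console.log(url);
```

With the second argument set to `false`, the query matches the one in the trace; with `true`, the only difference is the added `redirects=1`.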

Event Timeline

matmarex created this task. Jan 17 2019, 1:48 AM
Restricted Application added a subscriber: Aklapper. Jan 17 2019, 1:48 AM
ssastry triaged this task as Normal priority. Jan 17 2019, 4:44 AM
ssastry edited projects, added Parsoid-Edit-Support; removed Parsoid.
Restricted Application added a project: VisualEditor. Mar 7 2019, 1:19 PM

@matmarex is this a Parsoid issue?

stjn added a subscriber: stjn. Jun 13 2019, 6:59 PM

We have discovered a similar problem in Russian Wikipedia and think it might be an instance of this bug. A user edited an infobox, {{Карточка ФК}}, which redirects to {{Футбольный клуб}}. Although TemplateData shows up fine in the VisualEditor frontend, the backend clearly isn’t reading it when generating the wikitext (the parameter order gets messed up). Can you tell us if it’s the same problem?

Relevant edits:

  1. https://ru.wikipedia.org/?diff=100384644 (edit with {{Карточка ФК}})
  2. https://ru.wikipedia.org/wiki/Дрёбак-Фрогн_(футбольный_клуб)?action=history (3 tests showcasing the difference)
Restricted Application added a subscriber: Liuxinyu970226. Jun 13 2019, 6:59 PM

Yes, probably the same thing.