Parsoid does not apply "format" from TemplateData if the template is transcluded via a redirect
Closed, Resolved; Public

Description

Identified by @ssastry while debugging T213922:

So I patched html2wt as follows (since this is triggered only on edited HTML and I wanted to test it with regular HTML):

diff --git a/lib/html2wt/WikitextSerializer.js b/lib/html2wt/WikitextSerializer.js
index d052a47e0..9470070fa 100644
--- a/lib/html2wt/WikitextSerializer.js
+++ b/lib/html2wt/WikitextSerializer.js
@@ -782,11 +782,12 @@ WikitextSerializer.prototype.serializeFromParts = Promise.async(function *(state
                var fetched = false;
                try {
                        var apiResp = null;
-                       if (isTpl && useTplData) {
+                       if (isTpl && true) {
                                var href = tplHref.replace(/^\.\//, '');
                                apiResp = yield TemplateDataRequest.promise(env, href, Util.makeHash(["templatedata", href]));
                        }
                        tplData = apiResp && apiResp[Object.keys(apiResp)[0]];
+                       console.log("TPLDATA: " + JSON.stringify(tplData));

Now, when I run it with the HTML of fawiki's infobox, see the output:

[subbu@earth:~/work/wmf/parsoid] parse.js --html2wt --prefix fawiki  --trace apirequest< /tmp/fa.html 
....
[ApiRequest]   | #3 Starting HTTP request:  {"method":"GET","qs":{"format":"json","action":"templatedata","includeMissingTitles":"1","titles":"الگو:Infobox_Officeholder"},"followRedirect":true,"uri":"https://fa.wikipedia.org/w/api.php","timeout":30000,"headers":{"X-Request-ID":null,"User-Agent":"Parsoid/0.10.0+git","Connection":"close"},"strictSSL":true}
....
TPLDATA: {"title":"الگو:Infobox Officeholder","notemplatedata":"","params":[]}

@ssastry Note that the Parsoid query is for "Infobox Officeholder" (with uppercase "O"), which is a redirect to "Infobox officeholder" (lowercase "o"). Only the latter page has templatedata. Parsoid should probably ask the MediaWiki API to follow redirects (&redirects=1). But that seems like a separate issue.

In the snippet with the API trace that I pasted above, note that we already have "followRedirect":true. Is that the wrong param?

I'm not familiar with that library, but it looks like this is the library's parameter for following HTTP redirects, rather than a query parameter for following MediaWiki redirects.
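
To make the distinction above concrete, here is a minimal sketch of how the two settings sit at different levels of a request. The helper names `buildTemplateDataQuery` and `buildRequestOptions` are hypothetical and are not Parsoid's actual code; the option layout simply mirrors the trace above, and `redirects` is the MediaWiki API query parameter suggested earlier in this discussion.

```javascript
// Hypothetical helper (not from the Parsoid source): builds the query string
// object for an action=templatedata request.
function buildTemplateDataQuery(title, followWikiRedirects) {
    var qs = {
        format: 'json',
        action: 'templatedata',
        includeMissingTitles: '1',
        titles: title,
    };
    if (followWikiRedirects) {
        // MediaWiki-level redirect resolution: asks the API to resolve
        // "Infobox Officeholder" to its redirect target before looking up
        // templatedata.
        qs.redirects = '1';
    }
    return qs;
}

// Hypothetical helper: assembles request options shaped like the trace above.
function buildRequestOptions(apiUri, title) {
    return {
        method: 'GET',
        uri: apiUri,
        // HTTP-level only: follows 3xx responses from the web server.
        // It has no effect on wiki page redirects.
        followRedirect: true,
        qs: buildTemplateDataQuery(title, true),
    };
}
```

Calling `buildRequestOptions('https://fa.wikipedia.org/w/api.php', 'الگو:Infobox_Officeholder')` yields the same options as in the trace plus `redirects: '1'` in `qs`; whether the merged fix is structured this way is an assumption.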

Event Timeline

ssastry triaged this task as Medium priority. Jan 17 2019, 4:44 AM
ssastry edited projects, added Parsoid-Edit-Support; removed Parsoid.

We have discovered a similar problem on Russian Wikipedia and think it might be an instance of this bug. A user edited an infobox {{Карточка ФК}} that redirects to {{Футбольный клуб}}, and although the TemplateData shows up fine in the VisualEditor frontend, it clearly isn’t being read by the backend when generating the wikitext (the parameter order gets messed up). Can you tell us if it’s the same problem?

Relevant edits:

  1. https://ru.wikipedia.org/?diff=100384644 (edit with {{Карточка ФК}})
  2. https://ru.wikipedia.org/wiki/Дрёбак-Фрогн_(футбольный_клуб)?action=history (3 tests showcasing the difference)

Change 553129 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Follow redirects when fetching template data

https://gerrit.wikimedia.org/r/553129

Change 553129 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Follow redirects when fetching template data

https://gerrit.wikimedia.org/r/553129

Arlolra assigned this task to ssastry.