[BUG] Citations not being parsed correctly
Closed, Resolved · Public

Description

Hi. All of the inline references that use Wikipedia citation templates are displaying code text on my Samsung Tab S3, Android version 8.0.0, Samsung version 9.0.

Attached are two instances as examples.

Leslie Molson (User: Matuko)


Event Timeline

Restricted Application added a subscriber: Aklapper. · Oct 9 2018, 11:22 AM

I don't currently see a problem at https://en.wikipedia.org/api/rest_v1/page/html/Nadia_Murad. Maybe some template was temporarily edited incorrectly? Can you confirm whether this is still a problem?

The problem seems to be that each reference in the list of References now includes a <style> tag which contains a large block of inline styles, which seem to be identical from one reference to the next. This <style> tag doesn't seem to be present in the Web version of the article. Do we know its origin and/or purpose?

That is from template styles being used for citation templates. So the problem is likely MCS mangling the style tags. @Mholloway @bearND may be able to investigate.

ssastry moved this task from Backlog to Non-Parsoid Tasks on the Parsoid board. · Oct 11 2018, 7:31 PM

Parsing of the text displayed in this dialog comes from wikimedia-page-library.

Mholloway triaged this task as High priority.
bearND added a comment. · Edited · Oct 15 2018, 2:20 AM

@ssastry: @Dbrant makes a good point in T206527#4659479. The Parsoid output of a page with cite tags has unnecessarily duplicated content, while in the PHP parser version deduplication works.
Compare running document.querySelectorAll('style[data-mw-deduplicate="TemplateStyles:r861714446"]').length on https://en.wikipedia.org/wiki/Cat#References vs. https://en.wikipedia.org/api/rest_v1/page/html/Cat#References. While the former has only one occurrence, the latter has 278.

This causes the Parsoid HTML to be bloated and results in clients taking longer to download and parse the payload. I cast my vote for Parsoid fixing T187142.
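
The count comparison above can also be reproduced outside a browser console. The following is only a minimal sketch: `countDeduplicateKeys` is a hypothetical helper, the sample HTML is made up for illustration, and a regular expression stands in for a real DOM parser to keep the example dependency-free.

```javascript
// Hypothetical helper (not part of any Wikimedia codebase): counts how many
// <style> tags carry each data-mw-deduplicate key in an HTML string.
// A regex is used only to keep the sketch self-contained; a real client
// would use a DOM parser instead.
function countDeduplicateKeys(html) {
  const counts = new Map();
  const styleTag = /<style\b[^>]*\bdata-mw-deduplicate="([^"]*)"[^>]*>/gi;
  for (const match of html.matchAll(styleTag)) {
    const key = match[1];
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Made-up sample: two references, each repeating the same style block.
const sampleRefList =
  '<ol>' +
  '<li><style data-mw-deduplicate="TemplateStyles:r861714446">.a{}</style><cite>One</cite></li>' +
  '<li><style data-mw-deduplicate="TemplateStyles:r861714446">.a{}</style><cite>Two</cite></li>' +
  '</ol>';

console.log(countDeduplicateKeys(sampleRefList).get('TemplateStyles:r861714446')); // prints 2
```

A count greater than one for a single key is the duplication described above.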

Still, I agree that the clients trying to parse references need to be updated to handle not only <style> tags but also <link> tags, assuming Parsoid will fix that in the future.

Possibly we could have MCS mobile-sections remove cite style and cite link tags to help out older client versions. (Maybe the same for the new /page/references endpoint as well, which has gotten quite bloated, too).
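
As a rough illustration of what removing style and link tags from a reference could look like, here is a minimal sketch. `stripStyleAndLinkTags` is a hypothetical name, the sample markup (including the `rel` value on the `<link>` tag) is an assumption, and this is not the actual MCS or page library code; a real implementation would work on a parsed DOM rather than with regular expressions.

```javascript
// Hypothetical helper: removes <style>...</style> blocks and <link> tags
// from a fragment of reference HTML so that their contents are not shown
// as text. Regex-based only to keep the sketch dependency-free.
function stripStyleAndLinkTags(referenceHtml) {
  return referenceHtml
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, '') // drop whole style blocks
    .replace(/<link\b[^>]*>/gi, '');                  // drop link tags
}

// Made-up sample fragments for illustration.
const refWithStyle =
  '<style data-mw-deduplicate="TemplateStyles:r861714446">cite{font-style:inherit}</style>' +
  '<cite class="citation">Example source</cite>';
const refWithLink =
  '<link rel="mw-deduplicated-inline-style" href="mw-data:TemplateStyles:r861714446"/>' +
  '<cite class="citation">Other source</cite>';

console.log(stripStyleAndLinkTags(refWithStyle)); // prints <cite class="citation">Example source</cite>
console.log(stripStyleAndLinkTags(refWithLink));  // prints <cite class="citation">Other source</cite>
```

The trade-off is that any styling carried by those tags is lost for the stripped fragment.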

Yes, the WIP patch for parsoid has been there for a while, and we'll sync with VE about this and figure out how to handle this.

Change 467401 had a related patch set uploaded (by BearND; owner: BearND):
[mediawiki/services/mobileapps@master] mobile-sections: rm style tags in ref lists

https://gerrit.wikimedia.org/r/467401

Anomie moved this task from Up next to External on the TemplateStyles board. · Oct 15 2018, 3:29 PM

Change 467401 abandoned by BearND:
mobile-sections: rm style tags in ref lists

Reason:
This would strip styling in the reference lists when expanded. If the Android app has issues getting the page library PR 161 released, we could bring this patch back temporarily.

https://gerrit.wikimedia.org/r/467401

bearND claimed this task. · Oct 15 2018, 5:23 PM

As mentioned during the apps standup, I recommend that the Android app at least incorporate the PR for the page library. In the meantime I'll also look into moving the style tags inside reference lists towards the end of the document (since the reference lists tend to come near the end of a page and are collapsed by default anyway).
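
The idea of moving style tags to the end can be sketched roughly as follows. This is an assumption-laden illustration, not the merged mobileapps change: `moveStyleTagsToEnd` is a hypothetical name, it treats its input as a single HTML fragment rather than restricting itself to reference lists, and it uses a regular expression where a real service would operate on a DOM.

```javascript
// Hypothetical helper: pulls every <style>...</style> block out of an HTML
// fragment and appends them, in their original order, after the remaining
// markup. Nothing is deduplicated; the blocks are only relocated.
function moveStyleTagsToEnd(html) {
  const styleBlock = /<style\b[^>]*>[\s\S]*?<\/style>/gi;
  const styles = html.match(styleBlock) || [];
  const withoutStyles = html.replace(styleBlock, '');
  return withoutStyles + styles.join('');
}

// Made-up sample reference list for illustration.
const refListFragment =
  '<ol><li><style>.a{}</style><cite>One</cite></li>' +
  '<li><style>.b{}</style><cite>Two</cite></li></ol>';

console.log(moveStyleTagsToEnd(refListFragment));
// prints <ol><li><cite>One</cite></li><li><cite>Two</cite></li></ol><style>.a{}</style><style>.b{}</style>
```

Unlike stripping, this keeps the styles in the output, so expanded reference lists would still be styled while the individual references no longer contain style tags.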

Change 467468 had a related patch set uploaded (by BearND; owner: BearND):
[mediawiki/services/mobileapps@master] mobile-sections: move style tags in ref lists to end

https://gerrit.wikimedia.org/r/467468

Change 467468 merged by jenkins-bot:
[mediawiki/services/mobileapps@master] mobile-sections: move style tags in ref lists to end

https://gerrit.wikimedia.org/r/467468

Mentioned in SAL (#wikimedia-operations) [2018-10-25T17:20:32Z] <bsitzmann@deploy1001> Started deploy [mobileapps/deploy@95452cf]: Update mobileapps to 58cbdff (T206527)

Mentioned in SAL (#wikimedia-operations) [2018-10-25T17:24:22Z] <bsitzmann@deploy1001> Finished deploy [mobileapps/deploy@95452cf]: Update mobileapps to 58cbdff (T206527) (duration: 03m 50s)

Jhernandez closed this task as Resolved. · Nov 19 2018, 4:02 PM