Page MenuHomePhabricator

Clicking on references is broken.
Closed, ResolvedPublic1 Estimated Story Points

Description

At the moment, clicking on any reference in any article causes the app to incorrectly launch an external browser with an invalid-formatted URL, instead of the expected behavior of showing a reference popup.

A quick look at the underlying HTML reveals that the reference anchors have changed to <article name>#<reference> instead of the expected #<reference>. Tagging with MCS, since this is likely due to a change in the service's output.

Event Timeline

Change 331824 had a related patch set uploaded (by Mholloway):
Handle new Parsoid reference links

https://gerrit.wikimedia.org/r/331824

Change 332423 had a related patch set uploaded (by BearND):
Remove title from reference links

https://gerrit.wikimedia.org/r/332423

I'm surprised that this change is visible on older Parsoid content formats, too:

MCS is using 1.2.0 instead of 1.3.0, and we're still getting the change in the href attribute:

curl -X GET --header 'Accept: text/html; charset=utf-8; profile="https://www.mediawiki.org/wiki/Specs/HTML/1.2.0"' 'https://en.wikipedia.org/api/rest_v1/page/html/Hulk_Hogan' | grep '<a href="Hulk_Hogan#'

<a href="Hulk_Hogan#cite_note-james-1"
...

@ssastry Another thing I noticed is that the most of the new reference link hrefs start with the title while some of the reference links hrefs start with './' (as most of the regular intrawiki links do) and then the title.

I'm surprised that this change is visible on older Parsoid content formats, too:

MCS is using 1.2.0 instead of 1.3.0, and we're still getting the change in the href attribute:

RESTBase (/cc @mobrovac @GWicke) doesn't yet enforce content format versions.

But when it starts doing that, note that older formats are meant to be used temporarily while clients migrate to new formats. It is expensive to support multiple formats for posterity (storage / caches are split OR if not cached, it would be expensive to regenerate the same page everytime OR it would add conversion latencies to auto-migrate formats in Parsoid).

@ssastry Another thing I noticed is that the most of the new reference link hrefs start with the title while some of the reference links hrefs start with './' (as most of the regular intrawiki links do) and then the title.

Can you provide me an example? I wonder if it is just a case of storage not having turned over. @GWicke was considering regenerating all stored content to use the latest format, but right now, they are in the middle of the node v6 migration project.

@ssastry The current version of https://en.wikipedia.org/api/rest_v1/page/html/Seven_Years%27_War has most of the references starting without ./, like this:

<li about="#cite_note-71" id="cite_note-71">
<a href="Seven_Years'_War#cite_ref-71" rel="mw:referencedBy">

but the one before has a link that starts with ./:

<span id="mw-reference-text-cite_note-Fish2003p2-70" class="mw-reference-text">
<a rel="mw:WikiLink" href="./Seven_Years'_War#CITEREFFish2003" about="#mwt215" typeof="mw:Transclusion"

Let's fix this in MCS for now. @Mholloway keep your Android patch around, though. It will be useful once the Android app switches to the new formatted endpoint.

Hm, dropping the leading ./ isn't correct, since then titles like "File:Foo" get (mis)interpreted as the file URL scheme. So there might be two problems here?

@ssastry The current version of https://en.wikipedia.org/api/rest_v1/page/html/Seven_Years%27_War has most of the references starting without ./, like this:

<li about="#cite_note-71" id="cite_note-71">
<a href="Seven_Years'_War#cite_ref-71" rel="mw:referencedBy">

but the one before has a link that starts with ./:

<span id="mw-reference-text-cite_note-Fish2003p2-70" class="mw-reference-text">
<a rel="mw:WikiLink" href="./Seven_Years'_War#CITEREFFish2003" about="#mwt215" typeof="mw:Transclusion"

@bearND, good catch! Yes, this is an inconsistency in my patch that fixed cite ids. @cscott: there isn't a problem with File:Foo links .. those are wikilinks and they continue to have the leading "./" prefix. This is more about reference internal links which should have that prefix as well for consistency reasons. I am fixing that now.

Change 332423 merged by jenkins-bot:
Make page internal links start with #

https://gerrit.wikimedia.org/r/332423

Device details
Model : Samsung-SM-J120A (Galaxy Express 3)
Android : 6.0.1
App version : 2.4.184-alpha-2017-01-24

Tapping on any reference in any article causes the app right now to launch links correctly through Chrome, is that valid behavior? I'm wondering. Also, if there are particular devices/articles this is still failing on please let me know.

Hey @Nicholas.tsg, just to be clear what we're referring to here: when you click a superscript reference link in an article (such as the [1] or [2] near the beginning of the article Barack Obama), a popup at the bottom of the screen should show the reference for the statement. If that's happening, it's fixed. (The links in the reference popup should usually offer to launch Chrome or another browser.)

If just tapping the [1] or [2] directly fires Chrome or an external app chooser, it's still broken.

Note that this still appears incorrectly for pages between sometime around Jan 5, 2017 and Jan 23, 2017 (but not since then). If this is deemed a big enough issue then we need to force a rerender of pages that have not been updated with the mobile-sections version 0.9.0 yet.

Device details
Model : Samsung-SM-J120A (Galaxy Express 3)
Android : 6.0.1
App version : 2.4.184-alpha-2017-01-24

Using the @Mholloway instructions I've found through screencaps that this is fixed

T155070 Reference 1.png (800×480 px, 181 KB)
T155070 Reference 2.png (800×480 px, 182 KB)

@ssastry The current version of https://en.wikipedia.org/api/rest_v1/page/html/Seven_Years%27_War has most of the references starting without ./, like this:

<li about="#cite_note-71" id="cite_note-71">
<a href="Seven_Years'_War#cite_ref-71" rel="mw:referencedBy">

but the one before has a link that starts with ./:

<span id="mw-reference-text-cite_note-Fish2003p2-70" class="mw-reference-text">
<a rel="mw:WikiLink" href="./Seven_Years'_War#CITEREFFish2003" about="#mwt215" typeof="mw:Transclusion"

@bearND, good catch! Yes, this is an inconsistency in my patch that fixed cite ids. @cscott: there isn't a problem with File:Foo links .. those are wikilinks and they continue to have the leading "./" prefix. This is more about reference internal links which should have that prefix as well for consistency reasons. I am fixing that now.

This fix has been now deployed to production today. So, as pages turn over in RESTBase, this inconsistency should get fixed up.

@ssastry Thanks. FYI, the DOM transformations in the mobile-sections endpoint handle both cases.

I believe this is still an issue today even with popular content like https://en.wikipedia.org/api/rest_v1/page/html/Mohamed_Abdullahi_Mohamed (front page, edited an hour ago)

We could always put the client-side patch in.

That particular page should be working now. References were showing up in the title-prefixed style in the /mobile-sections* copy, but then I manually requested the latest revision in the browser at https://en.wikipedia.org/api/rest_v1/page/mobile-sections/Mohamed_Abdullahi_Mohamed/764781919 and the cached copies were updated.

N.B. MCS is working as expected. This appears to be a matter of cached content not being updated.

Why did the page have to be manually purged when it was edited so recently?

Seems there's stale content issues again.

Em, sorry, not quite clear to me what happened here from the comments. Do I understand correctly that a fix to MCS was deployed, but after that even a very popular-emitted page wasn't updated to the newer content?

@Pchelolo Sorry for the incoherence. Yep, that's correct.

Hm... I'm gonna create a subtask and assign to myself to investigate, but won't test it today, since there's a pretty big change coming to RESTBase regarding MCS content early next week (https://github.com/wikimedia/restbase/pull/753), so even if there's a bug, we need to test after that's merged and deployed.