Page MenuHomePhabricator

[BUG] When using RestBase, links to pages with pluses in their title are broken
Closed, ResolvedPublic3 Estimated Story Points

Description

Steps to reproduce

  1. Search for sort.
  2. Tap Sort (disambiguation page).
  3. Tap Sort (C++).

Expected results

The Sort (C++) link preview is shown. When read article is tapped, the page is shown.

Actual results

The Sort (C++) link preview spins forever and the title says "Sort (C )". When read article is tapped, a 404 is shown.

screenshot-2016-05-25-12-18-43-906216102.png (800×480 px, 29 KB)

Environments observed

App version: 178e466e3d3a538326cdd2314a466848cfa11bc5
Android OS versions: API 23
Device model: AOSP Nexus S emulator
Device language: English

Event Timeline

Change 290960 had a related patch set uploaded (by Dbrant):
Fix links to page titles with plus signs.

https://gerrit.wikimedia.org/r/290960

Change 290960 abandoned by Dbrant:
Fix links to page titles with plus signs.

Reason:
Scratch that; this actually appears to be a RB vs. MW issue.

https://gerrit.wikimedia.org/r/290960

Dbrant renamed this task from [BUG] Links to pages with pluses in their title are broken to [BUG] When using RestBase, links to pages with pluses in their title are broken.May 26 2016, 6:10 PM
Dbrant subscribed.

I filed a task against Parsoid for this: T136346

Change 299872 had a related patch set uploaded (by BearND):
Don't decode hrefs from Parsoid content

https://gerrit.wikimedia.org/r/299872

Change 302745 had a related patch set uploaded (by BearND):
Change hrefs so they are the same as Mobileview API emits them

https://gerrit.wikimedia.org/r/302745

Change 299872 abandoned by BearND:
Don't decode hrefs from Parsoid content

Reason:
abandoned since I'm favoring the MCS fix I0e8e4fe011826fc4b007ed9d45b71b866dee7363

https://gerrit.wikimedia.org/r/299872

Could you summarize why the more minimal Parsoid encoding is creating issues for the app?

Change 303942 had a related patch set uploaded (by BearND):
WIP: Change way to get title from links

https://gerrit.wikimedia.org/r/303942

My MCS patch solves multiple problems:

#1 The main problem is:

The problem for the app is that since it can also handle content from MW API it assumes that the links are encoded, and therefore will decode the links, which makes the links invalid. + becomes a space after the decoding step. The same problem probably exist for any client moving from MW API to Parsoid content.

To clarify this a bit more and to provide some context, the app actually takes the wiki internal links and decodes them so that it can derive the page titles from them. That's why the app adds the (intermediate) decoding step.

It's possible to work around this in the app. The main disadvantage of solving this in the app, though, is that older version of the app don't get this fix. Also other clients (e.g. the iOS app or external clients) when they transition from using MW API to MCS would have to do the same changes.

These are the alternative app patches to fix problem 1:

#2 Another problem is that the anchors in links are not anchor-encoded. So, links with anchors from the web site or other external sources don't scroll to the correct position.

#3 A third problem is that page internal links look like links to other pages because they repeat the page title before the anchor. MW API does not do that. The effect of this is that the app thinks it's loading a new page or tries to load a link preview. Ok, this is quite rare, but an example which comes to mind is that [[Cation]] is a redirect to [[Ion#Anions_and_cations]]. So, if you go to the [[Ion]] page then scroll down a bit and click on the cation link then you'll get a link preview in the app.

Here's an overview of the which link causes which issue:

linkapp masterapp patch 1app patch 2app patch 3app master && MCS patch
C++"C "okokokok
100%okok100%25okok
#simpleAnchor (same as Ion -> cation)previewpreviewpreviewpreviewok
#specialCharsInAnchornothing(*)nothing(*)previewnothing(*)nothing (*)

(*) The ToC entry for this has
"Special_chars_.24.25.26amp.3B" instead of "Special_chars_.24.25.26". There's a mismatch between link and TOC entry (-> separate MCS patch)

Change 304785 had a related patch set uploaded (by BearND):
Preserve title attribute in links

https://gerrit.wikimedia.org/r/304785

Change 302745 abandoned by BearND:
Change hrefs so they are the same as Mobileview API emits them

Reason:
in favor of I0e8e4fe011826fc4b007ed9d45b71b866dee7363

https://gerrit.wikimedia.org/r/302745

During the reading/services sync meeting last Thursday @Pchelolo mentioned that Parsoid actually already produces a title attribute for internal links. The MCS transforms had previously removed this attribute. The new MCS patch 2 preserves the title attributes. In combination with app patch 3 this should be good, at least for the titles.
Let's address anchor-encoding issues in a separate ticket later to keep these patches simple.

the app actually takes the wiki internal links and decodes them so that it can derive the page titles from them

It sounds like the main use case for those titles is to then construct URLs. If you kept title & URL separate, then you wouldn't need to mess with the encoding of the title portion of the URL at all.

Change 304785 merged by jenkins-bot:
Preserve title attribute in links

https://gerrit.wikimedia.org/r/304785

the app actually takes the wiki internal links and decodes them so that it can derive the page titles from them

It sounds like the main use case for those titles is to then construct URLs. If you kept title & URL separate, then you wouldn't need to mess with the encoding of the title portion of the URL at all.

The app displays the title of the page, too. So, it was for both display and for constructing the URL. App patch 3 separates that.

bearND changed the point value for this task from 1 to 3.Sep 2 2016, 5:26 PM

Marked it with Regression tag to better track changes other clients of MCS need to be aware of when transitioning from mobileview action to MCS.

Change 303942 merged by jenkins-bot:
Change way to get title from links

https://gerrit.wikimedia.org/r/303942

Tested on Galaxy Express 3 (Android 6.0.1) and Wikipedia app 2.4.180-alpha-2016-10-27. According to the attached screencap, the page is shown when read article is tapped on the Sort (C++) link preview.

T136223 Galaxy Express 3 (Android 6.0.1).png (800×480 px, 78 KB)