Page MenuHomePhabricator

Bug: HTMLFormatter References drawer behaviour broken in stable and beta (closes instantly for some references)
Closed, ResolvedPublic3 Story Points

Description

There is a bug with lazy loaded references that leads to references not showing up and the drawer being closed.

Steps to reproduce

  1. Opt into beta
  2. Visit https://en.m.wikipedia.org/wiki/Barack_Obama
  3. Search for "[11]"
  4. Tap "[11]"

Expected results

The reference drawer is shown and stays open until dismissed manually

Actual results

Sometimes nothing happens, other times the drawer opens and immediately slams shut!
note: this is happening on stable as well

https://drive.google.com/file/d/1qEDv9fU89pAyl7StrHpu7uwoasB9bEis/view?usp=sharing

Environments observed

Browser Version:

  • Chromium v64.0.3282.167 (Official Build) Built on Ubuntu , running on Ubuntu 17.10 (64-bit)
  • Mobile web BETA mode

OS Version:

  • Ubuntu v17.10
  • Android v8.1.0

Device Model:

  • Desktop
  • Pixel XL

Device Language:

  • English

Developer notes

The bug is inside ReferencesHtmlScraperGateway.getReferenceFromContainer
I started by investigating beta. Inside ReferencesDrawer.showReference
gateway is an instance of ReferencesMobileViewGateway
The call to getReference throws an error
ReferencesGateway.ERROR_NOT_EXIST

This bug, as reported by @Mhurd in T183049, is caused because of unexpected encoding occurring on the HTML.

Mobile:

<sup id="cite_ref-Obama_1995,_2004,_pp._9–10_11-0" class="reference"><a href="#cite_note-Obama_1995,_2004,_pp._9%E2%80%9310-11">[11]</a></sup>

Desktop:

<sup id="cite_ref-Obama_1995,_2004,_pp._9–10_11-0" class="reference"><a href="#cite_note-Obama_1995,_2004,_pp._9–10-11">[11]</a></sup>

The problem lies somewhere inside the HTMLFormatter. After 20 mins debugging it seems that the problem relies inside MobileFormatter::filterContent or HtmlFormatter::getDoc

A client side fix can be made but it feels a little hacky. It would look like this:

  • If the reference cannot be found, attempt to call decodeURIComponent on the href and try again.

A server side fix will be trickier as it's likely to be related to the loadHTML method of DOMDocument. We do not directly modify the href attributes of the link.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptFeb 28 2018, 8:54 PM
Niedzielski updated the task description. (Show Details)Feb 28 2018, 9:01 PM
Jdlrobson renamed this task from [Bug] References drawer closes instantly for some references to [Bug/Beta] References drawer closes instantly for some references.Feb 28 2018, 11:05 PM
Jdlrobson updated the task description. (Show Details)
ovasileva triaged this task as Normal priority.Mar 5 2018, 5:13 PM
Jdlrobson added a subscriber: Jdlrobson.EditedMar 6 2018, 5:25 PM

There are 2 issues here - in stable clicking the reference jumps to the bottom of the page rather than opening the drawer and in beta it opens and closes the drawer.

ovasileva raised the priority of this task from Normal to High.Mar 6 2018, 5:26 PM
ovasileva updated the task description. (Show Details)
ovasileva set the point value for this task to 3.Mar 6 2018, 5:28 PM
ovasileva added a subscriber: ovasileva.

Currently consensus is around 3, but might be higher pending analysis

I was a 5, needs re-estimation once we know more. Jon has signed up to look into it a bit.

Jdlrobson renamed this task from [Bug/Beta] References drawer closes instantly for some references to Bug: References drawer behaviour broken in stable and beta (closes instantly for some references).Mar 6 2018, 5:28 PM
Jdlrobson moved this task from Inbox to Next up on the User-Jdlrobson board.Mar 6 2018, 5:56 PM
Jdlrobson renamed this task from Bug: References drawer behaviour broken in stable and beta (closes instantly for some references) to Bug: HTMLFormatter References drawer behaviour broken in stable and beta (closes instantly for some references).Mar 6 2018, 10:43 PM
Jdlrobson removed a project: User-Jdlrobson.
Jdlrobson updated the task description. (Show Details)
Jdlrobson added a subscriber: Mhurd.
Jdlrobson updated the task description. (Show Details)Mar 6 2018, 10:50 PM
Jdlrobson removed the point value for this task.

Given my analysis I suggest there is enough significant information to re-estimate this...

MBinder_WMF updated the task description. (Show Details)Mar 7 2018, 6:28 PM

Sounds like a good chance to add a failing test and then get it to pass 😄

ovasileva set the point value for this task to 3.Mar 7 2018, 6:32 PM
pmiazga claimed this task.Mar 15 2018, 4:45 PM
pmiazga moved this task from To Do to Doing on the Readers-Web-Kanbanana-Board-Old board.

I don't think it's easily fixable by PHP as this is default behaviour of libxml2 library. The node attributes are escaped on HTML save.

Flow:

a) HtmlFormatter:getText calls saveHTML
b) PHP DomDocument on saveHTML (internal method dom_document_save_html) calls htmlNodeDump from libxml2 library
c) htmlNodeDump in libxml2 library iterates over children and for each child calls htmlAttrDumpOutput which returns escaped content

Probably changing the JS behaviour is the only correct way as this is default libxml2 behavior.

Change 420750 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/skins/MinervaNeue@master] References should decode the href attribute before references lookup

https://gerrit.wikimedia.org/r/420750

pmiazga removed pmiazga as the assignee of this task.
pmiazga added a subscriber: pmiazga.

Change 420812 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] ReferenceGateway can handle encoded components

https://gerrit.wikimedia.org/r/420812

https://gerrit.wikimedia.org/r/#/c/420812/ fixes the problem

The issue was , and . characters as well as encoding...

Change 420750 abandoned by Jdlrobson:
References should decode the href attribute before references lookup

Reason:
We've decided to do this in MobileFrontend to keep all the processing together.

https://gerrit.wikimedia.org/r/420750

Change 420812 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] ReferenceGateway can handle encoded components

https://gerrit.wikimedia.org/r/420812

ovasileva added a subscriber: ABorbaWMF.

over to you @ABorbaWMF

Jdlrobson reassigned this task from ABorbaWMF to ovasileva.Mar 26 2018, 11:06 PM

I don't think this needs design sign off or technical sign off.

ovasileva closed this task as Resolved.Mar 27 2018, 9:28 AM

Confirmed on beta for the Barack Obama article as well. Resolving