Page MenuHomePhabricator

Regression: Increase in time to first interactive; lead paragraph doesn't move on Barack Obama due to mw-empty-elt
Closed, ResolvedPublic3 Story Points

Description

tldr: for about 30% of pages the infobox doesn't get moved below the lead paragraph

While doing chores, I noticed a spike in first interactive time for the Barack Obama page
Investigating it in wpt (there were no active campaigns) it became quite obvious
http://wpt.wmftest.org/video/compare.php?tests=180710_NX_JK,180704_JD_HN

As you can see the image showsin one of the runs but not the other.
Visiting https://en.m.wikipedia.org/wiki/Barack_Obama it's clear the first paragraph is not getting moved.

Inspecting the HTML there are two instances of .mw-empty-elt preventing this move

<p class="mw-empty-elt">
</p>

There are no visible spikes in logstash relating to this error as the error logging incorrectly assumes that these empty paragraphs are the lead paragraph but found 3 in 10 articles had the problem.

QA steps

Visit the following pages on mobile and ensure the lead paragraph shows above the infobox

Visit some random articles and check they also position infobox below first paragraph. Report any violations.

Developer notes

Any "p" with the class mw-empty-elt should not be considered the lead paragraph
Possibly related to some Tidy changes in https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/295613/

Details

Related Gerrit Patches:
mediawiki/extensions/MobileFrontend : masterTreat p nodes with mw-empty-elt as empty paragraphs

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 10 2018, 9:34 PM
ovasileva triaged this task as High priority.Jul 11 2018, 9:39 AM
ovasileva moved this task from Triaged but Future to Upcoming on the Readers-Web-Backlog board.
ovasileva added a subscriber: ovasileva.

Let's try to estimate this during kickoff

phuedx added subscribers: pmiazga, phuedx.EditedJul 11 2018, 9:44 AM

Not sure if this is a regression or not. Unless the XPath queries have changed dramatically during refactoring, I don't think that we've been normalising whitespace when searching for paragraphs with content before the infobox, i.e. it's the "\n" character and not the class that's breaking the transform /cc @pmiazga.

AIUI the following HTML should also break the transform:

<p> </p><p></p><table class="infobox"></table><p>foo</p>

I agree that this is an edge case has been lurking for some time. It's a regression in the sense that something has changed to make this a much more prominent edge case than before.

ovasileva set the point value for this task to 3.Jul 17 2018, 4:10 PM
pmiazga claimed this task.Jul 25 2018, 5:23 PM

Change 448060 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/MobileFrontend@master] Threat p nodes with mw-empty-elt as empty paragraphs

https://gerrit.wikimedia.org/r/448060

pmiazga reassigned this task from pmiazga to Jdlrobson.Jul 26 2018, 3:44 PM

@Jdlrobson could you review this change?

Added Technical-Debt as this patches FIXME: Make Special:MobileContributions use ContribsPager ASAP. that was added two years ago

pmiazga edited projects, added Technical-Debt (RW-Tech-Debt); removed Technical-Debt.

Ok, reverting the tech-deb label, wrong task :(

Jdlrobson updated the task description. (Show Details)

@alexhollender over to you for a design review.

Change 448060 merged by Pmiazga:
[mediawiki/extensions/MobileFrontend@master] Treat p nodes with mw-empty-elt as empty paragraphs

https://gerrit.wikimedia.org/r/448060

alexhollender added a project: Product-QA.
alexhollender updated the task description. (Show Details)
alexhollender added a subscriber: alexhollender.

@Jdlrobson: Tested the fix on the articles mentioned under QA steps in staging environment. As for checking it on some random articles, on staging I am finding it difficult to find one with lead paragraph/infobox in it. Is it on Beta cluster yet? I could then search for some more random articles and check on those.

ABorbaWMF moved this task from Backlog to Done on the Product-QA board.Aug 7 2018, 4:51 PM
ABorbaWMF added a subscriber: ABorbaWMF.

Looks good on staging across a bunch of browsers

Chrome

Firefox

Safari

Opera

@ABorbaWMF: Just wanted to mention that this seems to be not working on Beta cluster yet, some of the articles on Beta still has infobox as the first thing instead of the lead paragraph. So, I don't know if the fix has landed on Beta yet, but I was going to mark it as "Done" after verifying it on all other environments. Do you move tasks to Done column as soon as it's working on staging or when it is verified on all environments? Also, if/when you do verify tasks in other environments, how do you usually keep track of which environments it's been tested on?

@Ryasmeen, @ABorbaWMF - thank you for testing! I also looked on production and it seems to be working there. @Ryasmeen - do you have a link to one of the articles that wasn't working on beta? Maybe we've stumbled across a separate bug...

Jdlrobson added a subscriber: Ryasmeen.

@ovasileva: Sure, check these two articles for instance: https://en.m.wikipedia.beta.wmflabs.org/wiki/Bear or https://en.m.wikipedia.beta.wmflabs.org/wiki/Bearded_seal

Interestingly, on production the same articles have lead paragraph preceding the infobox but not on Beta cluster.

On Beta:

On en.wiki:

Jdlrobson reassigned this task from Jdlrobson to ovasileva.EditedAug 8 2018, 8:53 PM

Interestingly, on production the same articles have lead paragraph preceding the infobox but not on Beta cluster.

This is not a bug. These relate to problems with the user generated content and are picked up in chores

I've fixed the issue on beta cluster with this edit (which moves the wrapping container div) so these work again to avoid future confusion.

Looks done to me, so over to you @ovasileva to sign off.

ovasileva closed this task as Resolved.Aug 9 2018, 9:58 AM

Thanks for double checking. Looks good!

Restricted Application added a project: User-Ryasmeen. · View Herald TranscriptAug 9 2018, 9:58 AM

Sorry if we have and I've missed the comment but have we determined that the spike that @Jdlrobson previously observed (see below) has flattened out?

While doing chores, I noticed a spike in first interactive time for the Barack Obama page
Investigating it in wpt (there were no active campaigns) it became quite obvious
http://wpt.wmftest.org/video/compare.php?tests=180710_NX_JK,180704_JD_HN

This change would have gone live on the 2nd August (1.32.0-wmf.15) on English Wikipedia

Sure enough, the graph tells the story that this was fixed:

Thanks for checking!