MobileFormatter#moveFirstParagraphBeforeInfobox should also move infoboxes wrapped in mw-stack element
Closed, ResolvedPublic5 Story Points

Description

Certain pages wrap the .infobox element inside a .mw-stack element courtesy of the Stack template
As a result, infobox will appear first on these pages.

Examples:
https://en.m.wikipedia.org/wiki/Christopher_II_of_Denmark
https://en.m.wikipedia.org/wiki/O.%20J.%20Simpson
https://en.m.wikipedia.org/wiki/Beijing#/editor/0 uses {{stack begin}}
https://en.m.wikipedia.org/wiki/Xi'an uses {{stack begin}}

This impacts around 20k pages a day on English Wikipedia (note that it's hard to get an idea of unique pages due to the nature of the logging):
https://logstash.wikimedia.org/app/kibana#/discover?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now%2Fw,mode:quick,to:now%2Fw))&_a=(columns:!(_source),index:'logstash-*',interval:d,query:(query_string:(analyze_wildcard:!t,query:'message:%22Found%20infobox%20wrapped%20with%20container%20on%22')),sort:!('@timestamp',desc)))

Changes

  • If an infobox is contained inside an mw-stack element, move the lead paragraph above the first mw-stack element. Do not log inside logInfoboxesWrappedInContainers
  • Currently we are logging events on cached HTML. Restrict the logging to only occur for logged in users (this will reduce a lot of the noise in the logs)
  • Add a test case for an .infobox element wrapped in a .mw-stack element
  • Add a test case for an .infobox element wrapped in an element which does not have the mw-stack class.

Testing criteria

Check the following pages on mobile and verify that the lead paragraph shows above the infobox.
http://reading-web-staging.wmflabs.org/wiki/Christopher_II_of_Denmark
http://reading-web-staging.wmflabs.org/wiki/O.%20J.%20Simpson
http://reading-web-staging.wmflabs.org/wiki/Beijing#/editor/0 uses {{stack begin}}
http://reading-web-staging.wmflabs.org/wiki/Xi'an uses {{stack begin}}
http://reading-web-staging.wmflabs.org/wiki/Albert Einstein (no stack)
http://reading-web-staging.wmflabs.org/wiki/Planet (no stack + list should be shifted alongside lead paragraph)

Pre sign off

Create new tasks if necessary...

  • Check how this impacts logged events
  • investigate remaining problematic templates
  • If logged events > 1000 switch on for German Wikipedia
  • In new task ask the question can we/should we turn it off. Note, an acceptable outcome might be to keep the logging and check it via chores. A new template introduction might break this functionality so we may want to monitor it as part of maintenance.

Per T170006#3910910, we're going to discuss this as part of chores.

Done in https://www.mediawiki.org/w/index.php?title=Extension%3AMobileFrontend&type=revision&diff=2694315&oldid=2686353.

  • Document any caveats around moving the first paragraph behaviour here.

@Jdlrobson should have an exhaustive list.

There are a very large number of changes, so older changes are hidden. Show Older Changes
phuedx updated the task description. (Show Details)Dec 6 2017, 5:39 PM
Kaartic removed a subscriber: Kaartic.Dec 8 2017, 5:10 PM

Change 397568 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/MobileFrontend@master] Definition list should be also moved with first paragraph

https://gerrit.wikimedia.org/r/397568

pmiazga added a subscriber: pmiazga.EditedDec 11 2017, 6:24 PM

I found out that the move first paragraph logic will move leading paragraph with unordered/unordered lists but the definition list would stay not attached to the paragraph. I pushed a small patch to fix this behavior and move any type of list (ordered/unordered/definition) with the first paragraph.

Change 399562 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Treat mw-stack elements as if they are infoboxes

https://gerrit.wikimedia.org/r/399562

@pmiazga in a standup thread you mentioned

I had an idea to split MobileFormatter to many small "MobileFormatterSteps" and leave MobileFormatter as a manager that iterates over steps and applies one by one. Now everything is in one huge class that become very difficult to manage. I'll use my 10% time to work on it

Any progress there? I would be interested to split out the code into several parts that pass $doc by reference and allow better testability

e.g. filterContent would look like

public function filterContent(
		$removeDefaults = true, $removeReferences = false, $removeImages = false,
		$showFirstParagraphBeforeInfobox = false
	) {
		$ctx = MobileContext::singleton();
		$config = $ctx->getMFConfig();
		$doc = $this->getDoc();
               if ( $removeReferences ) {
                    new MobileReferenceFormatter( $doc );
               }
               if ( $showFirstParagraphBeforeInfobox ) {
                    new MobileLeadSectionFormatter( $doc );
               }
               return parent::filterContent();

Change 401776 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Hygiene: Begin refactoring of MobileFormatter mega class

https://gerrit.wikimedia.org/r/401776

Jdlrobson claimed this task.

@pmiazga over to you.
https://gerrit.wikimedia.org/r/401776 Hygiene: Begin refactoring of MobileFormatter mega class
https://gerrit.wikimedia.org/r/399562 Treat mw-stack elements as if they are infoboxes

Change 401776 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Hygiene: Begin refactoring of MobileFormatter mega class

https://gerrit.wikimedia.org/r/401776

Change 399562 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Treat mw-stack elements as if they are infoboxes

https://gerrit.wikimedia.org/r/399562

Jdlrobson updated the task description. (Show Details)Jan 5 2018, 6:06 PM
Jdlrobson reassigned this task from Jdlrobson to ABorbaWMF.Jan 5 2018, 6:08 PM
Jdlrobson removed a project: Patch-For-Review.

@Aborba this will go out on the train next week (Wed/Fri). We'd appreciate some more confidence for that! I've added testing instructions.

phuedx updated the task description. (Show Details)Jan 6 2018, 3:08 PM

Change 403161 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/MobileFrontend@master] Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403161

@Jdlrobson - I am having some trouble getting the test URLs to work on Staging. They are bringing up pages with no text. Is there a way to copy the article source(s) from prod?

Jdlrobson added a comment.EditedJan 9 2018, 6:51 PM

@ABorbaWMF are you using the mobile skin ? http://reading-web-staging.wmflabs.org/wiki/Christopher_II_of_Denmark?useformat=mobile (click mobile view at the bottom of the page if the link exists)

Change 403223 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/MobileFrontend@specialpages] Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403223

Change 403223 abandoned by Pmiazga:
Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403223

Change 403161 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403161

@Jdlrobson, Yes that did the trick. Testing now. So far no problems on chrome. More results to come.

Looks good so far. There not much visual difference on phone type devices, but looks good on tablets

ProductionStaging

Tested and looks good from my side, but @ABorbaWMF - just to double check, what did you mean by "there's not much visual difference on phone devices? The differences here that we're looking for are that the lead paragraph appears before the infobox on mobile devices on staging.

Sorry, it looks ok on mobile too.

ProdStaging
ovasileva closed this task as Resolved.Jan 10 2018, 9:16 PM

Great, all done then!

Jdlrobson reopened this task as Open.Jan 16 2018, 9:09 PM
Jdlrobson claimed this task.
Jdlrobson added a subscriber: ABorbaWMF.

I need to go through the Pre sign off steps today.

phuedx updated the task description. (Show Details)Jan 17 2018, 11:13 AM
Jdlrobson removed Jdlrobson as the assignee of this task.Jan 18 2018, 1:25 AM

Ack.
Now the logs are running a little clearer, I can see some genuine bugs - the majority relating to military articles using the stack template:

We currently only allow div.mw-stack but we should allow table.mw-stack too.
In fact we should probably remove all tagName checking - I think it's overzealous...

The template has been edited per my request on wiki

Remaining issues seem to be limited to inconsistently built pages eg.
https://en.wikipedia.org/w/index.php?title=MasterCard_Lola&type=revision&diff=821137852&oldid=810539463 and another template which I've flagged on wiki.
I'm seeing under 100 per hour, down from 440.

I'll keep an eye on the logs and see if this can be resolved on wiki... if not there's little we can do here other than feedback to community "inconsistent" pages which the logging is doing a great job of identifying.

phuedx updated the task description. (Show Details)Jan 18 2018, 6:07 PM
phuedx updated the task description. (Show Details)
Niedzielski updated the task description. (Show Details)Jan 18 2018, 7:51 PM
Niedzielski removed Niedzielski as the assignee of this task.
  • The logging is regular but far less often, about 50 per hour, and we think the remaining fixes lie in templates.
  • All the examples including the list one check out:
  • Tests were added for infoboxes within and without .mw-stack.

However, I believe the logging is still enabled for anons so regrettably this must move back to needs work.

In new task ask the question can we/should we turn it off. Note, an acceptable outcome might be to keep the logging and check it via chores. A new template introduction might break this functionality so we may want to monitor it as part of maintenance.

Do we really want to create a new task for this? We're already limiting reports to logged in users. If it's problem, it will show up in the chores. I'd advise against creating another task and would prefer to discuss it during chores as needed.

Document any caveats around moving the first paragraph behaviour here.

@Jdlrobson, would you mind filling this out?

Jdlrobson updated the task description. (Show Details)Jan 18 2018, 10:24 PM
Jdlrobson claimed this task.

A user I'm collaborating with is proving super responsive. I'm hoping to eradicate the remaining issues on enwiki with her help. I'd like to leave this open till the end of the week to do so!

Behaviour now documented: https://www.mediawiki.org/w/index.php?title=Reading%2FWeb%2FProjects%2FLead_Paragraph_Move&type=revision&diff=2695640&oldid=2475658

Update: down to 13ish an hour:

Change 405812 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Other less common lead paragraph edge cases

https://gerrit.wikimedia.org/r/405812

Patch was merged but nothing from gerritbot. Weird.

phuedx updated the task description. (Show Details)Jan 31 2018, 11:09 AM
phuedx claimed this task.Jan 31 2018, 11:15 AM
phuedx closed this task as Resolved.
phuedx added a subscriber: phuedx.

We're still seeing ~40 log entries per day and that seems acceptable.

Currently we are logging events on cached HTML. Restrict the logging to only occur for logged in users (this will reduce a lot of the noise in the logs)

I'm not sure that I understand this point. AIUI anonymous users are primarily served responses from the edge caches, meaning that the application server never runs the mobile formatting code (and associated logging code), i.e. this is satisfied by construction. Feel free to reopen this task if I've missed the point :]

Change 397568 abandoned by Jdlrobson:
Definition list should be also moved with first paragraph

Reason:
This work is now tracked in https://phabricator.wikimedia.org/T186136 given it needs a rebase anyway.

https://gerrit.wikimedia.org/r/397568

I've added this here to chores. I think it's important to keep an eye on this number as it gives us clues that editors have changed template markup: https://www.mediawiki.org/wiki/Reading/Web/Chore_list#Review_the_MobileFormatter_lead_paragraph_transform_on_Wikimedia_content