Page MenuHomePhabricator

MobileFormatter#moveFirstParagraphBeforeInfobox should also move infoboxes wrapped in mw-stack element
Closed, ResolvedPublic5 Estimated Story Points

Description

Certain pages wrap the .infobox element inside a .mw-stack element courtesy of the Stack template
As a result, infobox will appear first on these pages.

Examples:
https://en.m.wikipedia.org/wiki/Christopher_II_of_Denmark
https://en.m.wikipedia.org/wiki/O.%20J.%20Simpson
https://en.m.wikipedia.org/wiki/Beijing#/editor/0 uses {{stack begin}}
https://en.m.wikipedia.org/wiki/Xi'an uses {{stack begin}}

This impacts around 20k pages a day on English Wikipedia (note that it's hard to get an idea of unique pages due to the nature of the logging):
https://logstash.wikimedia.org/app/kibana#/discover?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now%2Fw,mode:quick,to:now%2Fw))&_a=(columns:!(_source),index:'logstash-*',interval:d,query:(query_string:(analyze_wildcard:!t,query:'message:%22Found%20infobox%20wrapped%20with%20container%20on%22')),sort:!('@timestamp',desc)))

Changes

  • If an infobox is contained inside an mw-stack element, move the lead paragraph above the first mw-stack element. Do not log inside logInfoboxesWrappedInContainers
  • Currently we are logging events on cached HTML. Restrict the logging to only occur for logged in users (this will reduce a lot of the noise in the logs)
  • Add a test case for an .infobox element wrapped in a .mw-stack element
  • Add a test case for an .infobox element wrapped in an element which does not have the mw-stack class.

Testing criteria

Check the following pages on mobile and verify that the lead paragraph shows above the infobox.
http://reading-web-staging.wmflabs.org/wiki/Christopher_II_of_Denmark
http://reading-web-staging.wmflabs.org/wiki/O.%20J.%20Simpson
http://reading-web-staging.wmflabs.org/wiki/Beijing#/editor/0 uses {{stack begin}}
http://reading-web-staging.wmflabs.org/wiki/Xi'an uses {{stack begin}}
http://reading-web-staging.wmflabs.org/wiki/Albert Einstein (no stack)
http://reading-web-staging.wmflabs.org/wiki/Planet (no stack + list should be shifted alongside lead paragraph)

Pre sign off

Create new tasks if necessary...

  • Check how this impacts logged events
  • investigate remaining problematic templates
  • If logged events > 1000 switch on for German Wikipedia
  • In new task ask the question can we/should we turn it off. Note, an acceptable outcome might be to keep the logging and check it via chores. A new template introduction might break this functionality so we may want to monitor it as part of maintenance.

Per T170006#3910910, we're going to discuss this as part of chores.

Done in https://www.mediawiki.org/w/index.php?title=Extension%3AMobileFrontend&type=revision&diff=2694315&oldid=2686353.

  • Document any caveats around moving the first paragraph behaviour here.

@Jdlrobson should have an exhaustive list.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 397568 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/MobileFrontend@master] Definition list should be also moved with first paragraph

https://gerrit.wikimedia.org/r/397568

I found out that the move first paragraph logic will move leading paragraph with unordered/unordered lists but the definition list would stay not attached to the paragraph. I pushed a small patch to fix this behavior and move any type of list (ordered/unordered/definition) with the first paragraph.

Change 399562 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Treat mw-stack elements as if they are infoboxes

https://gerrit.wikimedia.org/r/399562

@pmiazga in a standup thread you mentioned

I had an idea to split MobileFormatter to many small "MobileFormatterSteps" and leave MobileFormatter as a manager that iterates over steps and applies one by one. Now everything is in one huge class that become very difficult to manage. I'll use my 10% time to work on it

Any progress there? I would be interested to split out the code into several parts that pass $doc by reference and allow better testability

e.g. filterContent would look like

public function filterContent(
		$removeDefaults = true, $removeReferences = false, $removeImages = false,
		$showFirstParagraphBeforeInfobox = false
	) {
		$ctx = MobileContext::singleton();
		$config = $ctx->getMFConfig();
		$doc = $this->getDoc();
               if ( $removeReferences ) {
                    new MobileReferenceFormatter( $doc );
               }
               if ( $showFirstParagraphBeforeInfobox ) {
                    new MobileLeadSectionFormatter( $doc );
               }
               return parent::filterContent();

Change 401776 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Hygiene: Begin refactoring of MobileFormatter mega class

https://gerrit.wikimedia.org/r/401776

@pmiazga over to you.
https://gerrit.wikimedia.org/r/401776 Hygiene: Begin refactoring of MobileFormatter mega class
https://gerrit.wikimedia.org/r/399562 Treat mw-stack elements as if they are infoboxes

Change 401776 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Hygiene: Begin refactoring of MobileFormatter mega class

https://gerrit.wikimedia.org/r/401776

Change 399562 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Treat mw-stack elements as if they are infoboxes

https://gerrit.wikimedia.org/r/399562

Jdlrobson removed a project: Patch-For-Review.

@Aborba this will go out on the train next week (Wed/Fri). We'd appreciate some more confidence for that! I've added testing instructions.

Change 403161 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/MobileFrontend@master] Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403161

@Jdlrobson - I am having some trouble getting the test URLs to work on Staging. They are bringing up pages with no text. Is there a way to copy the article source(s) from prod?

@ABorbaWMF are you using the mobile skin ? http://reading-web-staging.wmflabs.org/wiki/Christopher_II_of_Denmark?useformat=mobile (click mobile view at the bottom of the page if the link exists)

Change 403223 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[mediawiki/extensions/MobileFrontend@specialpages] Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403223

Change 403223 abandoned by Pmiazga:
Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403223

Change 403161 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Hygiene: Improve infobox detection in MoveLeadParagraphTransform

https://gerrit.wikimedia.org/r/403161

@Jdlrobson, Yes that did the trick. Testing now. So far no problems on chrome. More results to come.

Looks good so far. There not much visual difference on phone type devices, but looks good on tablets

ProductionStaging
image.png (1×768 px, 240 KB)
image.png (1×768 px, 300 KB)
image.png (1×768 px, 284 KB)
image.png (1×768 px, 464 KB)
image.png (1×720 px, 277 KB)
image.png (1×720 px, 281 KB)
image.png (768×1 px, 193 KB)
image.png (768×1 px, 269 KB)

Tested and looks good from my side, but @ABorbaWMF - just to double check, what did you mean by "there's not much visual difference on phone devices? The differences here that we're looking for are that the lead paragraph appears before the infobox on mobile devices on staging.

Sorry, it looks ok on mobile too.

ProdStaging
image.png (667×376 px, 165 KB)
image.png (667×376 px, 142 KB)
image.png (984×540 px, 193 KB)
image.png (984×540 px, 260 KB)

Great, all done then!

Jdlrobson claimed this task.
Jdlrobson added a subscriber: ABorbaWMF.

I need to go through the Pre sign off steps today.

Ack.
Now the logs are running a little clearer, I can see some genuine bugs - the majority relating to military articles using the stack template:

We currently only allow div.mw-stack but we should allow table.mw-stack too.
In fact we should probably remove all tagName checking - I think it's overzealous...

The template has been edited per my request on wiki

Remaining issues seem to be limited to inconsistently built pages eg.
https://en.wikipedia.org/w/index.php?title=MasterCard_Lola&type=revision&diff=821137852&oldid=810539463 and another template which I've flagged on wiki.
I'm seeing under 100 per hour, down from 440.

I'll keep an eye on the logs and see if this can be resolved on wiki... if not there's little we can do here other than feedback to community "inconsistent" pages which the logging is doing a great job of identifying.

phuedx updated the task description. (Show Details)
Niedzielski updated the task description. (Show Details)
  • The logging is regular but far less often, about 50 per hour, and we think the remaining fixes lie in templates.
  • All the examples including the list one check out:
    reading-web-staging.wmflabs.org_wiki_Planet(Nexus 5X).png (1×1 px, 381 KB)
  • Tests were added for infoboxes within and without .mw-stack.

However, I believe the logging is still enabled for anons so regrettably this must move back to needs work.

In new task ask the question can we/should we turn it off. Note, an acceptable outcome might be to keep the logging and check it via chores. A new template introduction might break this functionality so we may want to monitor it as part of maintenance.

Do we really want to create a new task for this? We're already limiting reports to logged in users. If it's problem, it will show up in the chores. I'd advise against creating another task and would prefer to discuss it during chores as needed.

Document any caveats around moving the first paragraph behaviour here.

@Jdlrobson, would you mind filling this out?

Jdlrobson updated the task description. (Show Details)

A user I'm collaborating with is proving super responsive. I'm hoping to eradicate the remaining issues on enwiki with her help. I'd like to leave this open till the end of the week to do so!

Behaviour now documented: https://www.mediawiki.org/w/index.php?title=Reading%2FWeb%2FProjects%2FLead_Paragraph_Move&type=revision&diff=2695640&oldid=2475658

Update: down to 13ish an hour:

Screen Shot 2018-01-22 at 10.53.36 AM.png (232×669 px, 29 KB)

Change 405812 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Other less common lead paragraph edge cases

https://gerrit.wikimedia.org/r/405812

Patch was merged but nothing from gerritbot. Weird.

phuedx claimed this task.
phuedx subscribed.

We're still seeing ~40 log entries per day and that seems acceptable.

Currently we are logging events on cached HTML. Restrict the logging to only occur for logged in users (this will reduce a lot of the noise in the logs)

I'm not sure that I understand this point. AIUI anonymous users are primarily served responses from the edge caches, meaning that the application server never runs the mobile formatting code (and associated logging code), i.e. this is satisfied by construction. Feel free to reopen this task if I've missed the point :]

Change 397568 abandoned by Jdlrobson:
Definition list should be also moved with first paragraph

Reason:
This work is now tracked in https://phabricator.wikimedia.org/T186136 given it needs a rebase anyway.

https://gerrit.wikimedia.org/r/397568

I've added this here to chores. I think it's important to keep an eye on this number as it gives us clues that editors have changed template markup: https://www.mediawiki.org/wiki/Reading/Web/Chore_list#Review_the_MobileFormatter_lead_paragraph_transform_on_Wikimedia_content