
[Spike, 4hrs] Did new main menu changes degrade performance?
Closed, Resolved · Public · Spike

Description

According to our graphs, performance metrics including first paint and last visual change took a hit around the 9th, coinciding with our main menu rollout.

Screen Shot 2020-01-17 at 1.32.00 PM.png (1×2 px, 291 KB)

On the 14th, a further degradation in performance can be seen on both mobile and desktop.

Take a closer look at the WebPageTest runs for this period and form some theories about what caused the problem. Chat to the performance team to see if they can help identify any further issues and/or confirm any theories we come up with.

Questions to answer

  • Was the first spike likely due to the changes to the main menu in T225213?
  • Did a banner campaign contribute to the second spike?
  • Is there anything we can do to restore the metrics/previous benchmark?

Event Timeline

Restricted Application changed the subtype of this task from "Task" to "Spike". · Jan 17 2020, 9:34 PM
Restricted Application added a subscriber: Aklapper.

See also the primary dashboards for WebPageReplay and WebPageTest:

I can confirm the regression there, in context:

Here you can enable "Show each test" which will provide a popup from which you can see the screenshot for that run, as well as access to other artefacts such as the HAR file for the DevTools.

The per-test screenshots show that both spikes correspond to banner campaigns:

Screenshot 2020-01-17 at 22.31.58.png (1×1 px, 264 KB)
Screenshot 2020-01-17 at 22.32.02.png (1×1 px, 258 KB)
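For anyone who wants to check a single run by hand, the HAR file mentioned above can also be inspected outside DevTools. The following is a minimal sketch, assuming a HAR 1.2 file saved locally. The file name `run.har` and the search string `BannerLoader` are illustrative assumptions, not values taken from this task, so substitute whatever URL fragment identifies the banner requests in your run.

```python
import json


def matching_requests(har: dict, needle: str) -> list:
    """Return the HAR entries whose request URL contains `needle`."""
    entries = har.get("log", {}).get("entries", [])
    return [e for e in entries if needle in e["request"]["url"]]


def summarize(entries: list) -> dict:
    """Count the entries and total their body bytes and elapsed time."""
    # HAR uses -1 for "size unknown", so clamp negatives to zero.
    total_bytes = sum(max(e["response"].get("bodySize", 0), 0) for e in entries)
    total_ms = sum(max(e.get("time", 0), 0) for e in entries)
    return {"count": len(entries), "bytes": total_bytes, "ms": total_ms}


# Hypothetical usage (file name and search string are assumptions):
# with open("run.har", encoding="utf-8") as f:
#     har = json.load(f)
# print(summarize(matching_requests(har, "BannerLoader")))
```

Comparing this summary for a run inside a spike against a run outside it gives a rough sense of how much extra work the banner adds, although the visual metrics on the dashboard remain the primary evidence.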

From that same dashboard, one can also switch to different wiki pages (wiki-Obama, wiki-Sweden) and wikis (enwiki, arwiki, etc.) and see consistently that the metrics are either stable and unchanged when no banner is shown, or equally elevated when a banner is shown. This strongly suggests there is no other regression ongoing at the moment.

The static "Banksy" page (not wiki-Banksy) also remains stable, which further confirms that our test environment and other infrastructure outside MediaWiki (the VM/OS/browser we use for testing, the internet connectivity from there to WMF data centres, etc.) did not experience a regression during this period either:

Screenshot 2020-01-17 at 22.40.09.png (1×1 px, 258 KB)
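The reasoning in the comment above (banner-free runs stable, banner runs elevated, static control page stable) can be written down as a small decision rule. This is only a sketch of that logic, not something the dashboards do: the function names, the 5% threshold, and the idea of comparing medians of "before" and "after" samples are all assumptions made for illustration.

```python
from statistics import median


def relative_change(before: list, after: list) -> float:
    """Relative shift of the median from `before` to `after`."""
    base = median(before)
    return (median(after) - base) / base


def diagnose(no_banner, with_banner, control, threshold: float = 0.05) -> str:
    """Each argument is a (before, after) pair of metric samples in ms.

    `control` is the static page served outside MediaWiki.
    `threshold` (assumed 5%) is the shift treated as a real change.
    """
    if abs(relative_change(*control)) > threshold:
        # Even the static page moved, so suspect the test setup itself.
        return "test environment changed"
    if relative_change(*no_banner) > threshold:
        # Pages without a banner got slower too.
        return "regression independent of banners"
    if relative_change(*with_banner) > threshold:
        # Only the runs that show a banner are elevated.
        return "banner campaign"
    return "no regression"
```

The control check comes first on purpose: if the static page had shifted as well, none of the other comparisons could be trusted.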
Jdlrobson claimed this task.

TIL about being able to overlay visual metrics screenshots (now documented on https://grafana.wikimedia.org/d/000000205/mobile-2g?orgId=1). Really helpful! Thanks @Krinkle

Noted this blip on https://www.mediawiki.org/wiki/Reading/Web/Notable_incidents#2020