How do we want to compare the current search performance to the new one? This task covers defining the proper user journeys, implementing them in the Performance test repo, and setting up a dashboard.
This task does not cover any behavioral regression tests such as "the new UI is confusing."
User journeys
Legacy Vector:
For anon cached and uncached, and logged-in uncached scenarios:
- Visit the Barack Obama page.
- Tap search.
- Type "b".
- Wait for a response.
- Type "ananb" one character at a time.
- Wait for a response.
- Type "<backspace>a" one character at a time.
- Wait for a response.
- Tap the first result.
- Wait for the page to load.
Latest Vector:
For anon cached and uncached, and logged-in uncached scenarios:
- Visit the Barack Obama page.
- Tap search.
- Wait for the form to load.
- Type "b".
- Wait for a response.
- Type "ananb" one character at a time.
- Wait for a response.
- Type "<backspace>a" one character at a time.
- Wait for a response.
- Tap the first result.
- Wait for the page to load.
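To make the typing steps above unambiguous, here is a minimal sketch that expands them into the search input value after each keystroke. It assumes "<backspace>" is a single key press that removes the last character; the helper names (tokenize, input_states) are hypothetical and not taken from the Performance test repo.

```python
BACKSPACE = "<backspace>"  # token used in the journey steps above


def tokenize(step: str) -> list:
    """Split one journey step into keystrokes, keeping <backspace> as one key."""
    keys = []
    i = 0
    while i < len(step):
        if step.startswith(BACKSPACE, i):
            keys.append(BACKSPACE)
            i += len(BACKSPACE)
        else:
            keys.append(step[i])
            i += 1
    return keys


def input_states(steps: list) -> list:
    """Return the value of the search input after every keystroke."""
    value = ""
    states = []
    for step in steps:
        for key in tokenize(step):
            # Assumption: backspace deletes exactly one trailing character.
            value = value[:-1] if key == BACKSPACE else value + key
            states.append(value)
    return states


states = input_states(["b", "ananb", "<backspace>a"])
print(states)
# ['b', 'ba', 'ban', 'bana', 'banan', 'bananb', 'banan', 'banana']
```

Under that assumption the journey ends on the query "banana", after passing through the typo "bananb" along the way.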
Questions
- Confirm with @ovasileva and @alexhollender that these journeys make sense.
- Is the Barack Obama page a good test for the test wikis? The Performance team has also used the Sweden and Facebook pages. Is there a universal page that works on the test wikis (French Wikipedia and Wiktionary, Hebrew Wikipedia, Portuguese Wikiversity, Basque Wikipedia, and Persian Wikipedia)? Answer: there is no universal page. The Obama page is usually a good example of a large page. See T251544#6103760.
- Do we want to also test Special:BlankPage? Answer: no, but we may want to test an empty page in the user namespace. See T251544#6103760.
- Are URL query parameters possible, or do they break caching (see T215088#4993170)? E.g., safemode=1, useskin=vector&useskinversion=2, and banner=null. If they are not possible, how do we set the skin version and enable features?
- Do we need to test loading the largest dependencies (e.g., Vue) without using them (see T250336#6075053)?
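If query parameters turn out to be acceptable, the test URLs for the two modes could be built along these lines. This is only a sketch: the with_params helper is hypothetical, the page URL is the French Wikipedia one used in the QA steps, and it says nothing about whether these parameters bypass the cache, which is still the open question above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url: str, params: dict) -> str:
    """Return url with params merged into its existing query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


page = "https://fr.wikipedia.org/wiki/Barack_Obama"
legacy_url = with_params(page, {"useskin": "vector", "useskinversion": "1"})
latest_url = with_params(page, {"useskin": "vector", "useskinversion": "2"})
print(legacy_url)
print(latest_url)
```

Keeping the parameters in one place makes it easy to add or drop safemode=1 or banner=null once we know how they interact with caching.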
Dashboards
For Legacy and Latest modes, mimic at least the metrics described in the reference dashboard:
- TTFB
- FirstVisualChange
- Heading
- LargestImage
- LastVisualChange
- SpeedIndex
- PerceptualSpeedIndex
- Fully Loaded
- ...
Both reports should be comparable on the same dashboard.
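As a rough illustration of what "comparable" could mean, the sketch below takes per-run samples for each metric in both modes and reports the medians and the relative change. The compare helper and the sample numbers are placeholders invented for illustration; they are not real measurements, and the real dashboard may aggregate differently.

```python
from statistics import median


def compare(legacy: dict, latest: dict) -> dict:
    """For each metric present in both reports, return medians and % change."""
    rows = {}
    for metric, legacy_samples in legacy.items():
        if metric not in latest:
            continue  # only metrics reported in both modes are comparable
        old = median(legacy_samples)
        new = median(latest[metric])
        # Assumes the legacy median is non-zero.
        change_pct = round((new - old) / old * 100, 1)
        rows[metric] = {"legacy": old, "latest": new, "change_pct": change_pct}
    return rows


# Placeholder samples in milliseconds, made up for this example only.
legacy_runs = {"TTFB": [100, 120, 110], "SpeedIndex": [800, 900, 1000]}
latest_runs = {"TTFB": [110, 121, 130], "SpeedIndex": [810, 990, 900]}

result = compare(legacy_runs, latest_runs)
for metric, row in result.items():
    print(metric, row)
```

Medians are used here because individual runs tend to be noisy; any other aggregate the Performance team prefers would slot in the same way.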
Acceptance criteria
- Legacy and latest reports are comparable on the test wikis.
- Chores are updated to include the new dashboards.
- The wiki (either a page linked from the Vue.js page or one of the Performance pages) is updated with recommendations for future Vue.js projects.
- Performance team signs off on tests and dashboards.
QA steps
Measurement definitions:
mwVector{Legacy/Vue}SearchLoadStartToLoadEnd: Intended to measure the time it takes to load and execute the search module. Note: I've noticed this measurement can be the most skewed of the three because the load end time is recorded in the "mw.loader.using" callback, which is placed in a queue and can be delayed by other modules that are loading (e.g., the ULS module). Although not desirable, this problem affects both Vue and legacy search.
mwVector{Legacy/Vue}SearchLoadStartToFirstRender: Intended to measure the time it takes from the start of lazy loading the search module to when the search results are visible to the user.
mwVector{Legacy/Vue}SearchQueryToRender: Intended to measure the time it takes from the start of the ajax request to fetch search results to when the search results are visible to the user.
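The three definitions above can be read as differences between four timestamps. Here is a minimal sketch of that arithmetic, assuming one set of marks for the first query; the mark keys (load_start, load_end, query_start, first_render) and the example timestamps are hypothetical and are not the names used in the actual instrumentation.

```python
def search_measures(marks: dict, mode: str) -> dict:
    """Derive the three durations (ms) from mark timestamps for one mode.

    mode is "Legacy" or "Vue"; marks holds hypothetical timestamp keys.
    """
    prefix = "mwVector" + mode + "Search"
    return {
        # Time to load and execute the search module.
        prefix + "LoadStartToLoadEnd": marks["load_end"] - marks["load_start"],
        # Start of lazy loading until results are visible.
        prefix + "LoadStartToFirstRender": marks["first_render"] - marks["load_start"],
        # Start of the search request until results are visible.
        prefix + "QueryToRender": marks["first_render"] - marks["query_start"],
    }


# Placeholder timestamps in milliseconds, made up for this example only.
marks = {"load_start": 1000, "load_end": 1250, "query_start": 1300, "first_render": 1450}
measures = search_measures(marks, "Vue")
for name, duration in measures.items():
    print(name, duration)
```

With these placeholder marks the durations come out as 250, 450, and 150 ms. For the first query, QueryToRender should not exceed LoadStartToFirstRender, since the request can only start after lazy loading has begun; that is a handy sanity check when reading the timeline.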
Because of its technical depth, QA here is probably best done by a dev/someone familiar with the performance timeline. Here is how I'd suggest QAing it:
Legacy search
- Clear localstorage/all cache and visit https://fr.wikipedia.org/wiki/Barack_Obama?useskinversion=1 with Chrome
- Go to the performance tab and hit the record button
- Focus the search input and type the letter "a" in the input
- Wait for the search results to be shown
- Stop profiling
- Using the "Network", "Timings", and "Main" sections in the performance timeline, ensure the bars in the "Timings" section make sense according to the measurement definitions above. I've attached an annotated screenshot of a timeline I recorded for Vue search, which might be useful as a reference.
Vue search
Repeat the six steps above, but measure Vue search using https://fr.wikipedia.org/wiki/Barack_Obama?useskinversion=2
Dashboard at: https://grafana.wikimedia.org/d/GivPpdsGk/vue-vs-legacy-search?orgId=1
QA Results - Beta
| AC | Status | Details |
|---|---|---|
| 1 | ✅ | T251544#6914967 |