
Revise schema and performance dashboards for Vue.js search
Closed, ResolvedPublic

Description

For the past few years Web has been especially focused on the mobile site and has developed tooling to help monitor its health. A useful team chore has been to track the state of performance across a number of different metrics. Similar to the dashboards in T249826 or the CI tests in T244276, this task is about improving the monitoring parity of desktop (Vector) as needed.

Questions

  • Do we need to add / remove schemas for Vue.js? Is Analytics best positioned to help us ensure our dashboard is relevant?
  • What performance metrics are missing? Parity with mobile would be great, but consulting with the Performance Team would be ideal.
  • What performance and engagement metrics do we need to feel confident with deployments and over the long term?

Acceptance criteria


Event Timeline

Since you're working on a specific feature (search result suggestions), it would be ideal to come up with a user-centric performance metric related to that feature. For instance, time between the user being done typing in the search box and the suggestions for that search term actually being displayed.
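To make the suggested metric concrete, here is a minimal sketch of how the "done typing to suggestions displayed" time could be captured. The class and method names are hypothetical and not taken from Vector or any existing instrumentation; the clock is injected so the logic can be exercised outside a browser.

```typescript
// Hypothetical helper: names are illustrative, not from Vector's codebase.
type Clock = () => number;

class SuggestionLatencyTracker {
  private readonly now: Clock;
  private lastInputAt: number | null = null;
  private pendingQuery: string | null = null;

  constructor(now: Clock) {
    this.now = now;
  }

  // Call on every input event; only the most recent keystroke counts
  // as "done typing".
  recordInput(query: string): void {
    this.lastInputAt = this.now();
    this.pendingQuery = query;
  }

  // Call once suggestions for `query` are on screen.
  // Returns the latency in ms, or null if the suggestions are stale
  // (they answer an older query than the one currently typed).
  recordSuggestionsShown(query: string): number | null {
    if (this.lastInputAt === null || query !== this.pendingQuery) {
      return null;
    }
    const latency = this.now() - this.lastInputAt;
    this.lastInputAt = null;
    this.pendingQuery = null;
    return latency;
  }
}
```

In a browser the clock would presumably be `() => performance.now()`, and the returned value would be sent to whichever schema ends up collecting it.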

As for the code loading strategy, still within the context of the user interaction with the feature, it would be good to know what proportion of users actually wait on client-side code that hadn't been loaded yet by the time they've typed their search terms, and how long things are delayed as a result.
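As a rough illustration of the two numbers asked for here (the proportion of interactions that waited on code, and for how long), the aggregation could look like the sketch below. The `SearchInteraction` shape is an assumption: it presumes each sample records when typing finished and when the lazily loaded search code became ready.

```typescript
// Hypothetical sample shape; both timestamps are in ms on the same clock.
interface SearchInteraction {
  typingDoneAt: number; // when the user stopped typing
  codeReadyAt: number;  // when the client-side search code finished loading
}

interface LazyLoadImpact {
  proportionDelayed: number;    // share of interactions that waited on code
  medianDelayMs: number | null; // median wait among delayed interactions
}

function summarizeLazyLoadImpact(samples: SearchInteraction[]): LazyLoadImpact {
  if (samples.length === 0) {
    return { proportionDelayed: 0, medianDelayMs: null };
  }
  // An interaction is delayed when the code became ready after typing ended.
  const delays = samples
    .map((s) => s.codeReadyAt - s.typingDoneAt)
    .filter((d) => d > 0)
    .sort((a, b) => a - b);

  let medianDelayMs: number | null = null;
  if (delays.length > 0) {
    // Averaging the two middle indexes handles odd and even lengths alike.
    const lower = delays[Math.floor((delays.length - 1) / 2)];
    const upper = delays[Math.floor(delays.length / 2)];
    medianDelayMs = (lower + upper) / 2;
  }
  return { proportionDelayed: delays.length / samples.length, medianDelayMs };
}
```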

@Niedzielski I can sync with you when you have time. I've added some measurement for search before in https://grafana.wikimedia.org/d/7zvDe0JZk/user-journey-drilldown?orgId=1 (switch scenario) and the documentation of what's measured is here: https://wikitech.wikimedia.org/wiki/Performance/User_Journey

At the moment it mostly measures navigating from search to a page, not getting the results. Measuring the results could probably be doable, though; let us sync.

Thank you, @Gilles and @Peter.

Since you're working on a specific feature (search result suggestions), it would be ideal to come up with a user-centric performance metric related to that feature. For instance, time between the user being done typing in the search box and the suggestions for that search term actually being displayed.

👍

My understanding of the search case study is that we're not replacing much existing JavaScript initially. On the contrary, we will be adding a lot with just the Vue.js framework itself and any dependencies. Given all subsequent Vue.js work will require this baseline, it would be useful to characterize its performance as best as we can.

I think the most important high-level question to answer is: what information can we collect that the Performance team would want in order to derisk most other use cases? Or, if we can't answer that question fully, what is the missing information, and what are the risks for other teams that start building all their user interfaces using Vue.js? I hope the answers can inform impact, risks, and what our options are for future work.

You are the experts but I think some useful metrics would be:

  • How long does loading the baseline itself take, when can components be rendered with it, and when can users interact with it?
  • (Search input example already mentioned) We request search results based on user input. How does that workflow perform from initial user input, to a (cached or not) network request, to response, to DOM reconciliation?
  • What is the memory impact of loading the baseline and rendering with it?
  • How do these measurements vary across the diversity of devices on the test wikis (French Wikipedia and Wiktionary, Hebrew Wikipedia, Portuguese Wikiversity, Basque Wikipedia, and Persian Wikipedia)?
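One way to think about the first two bullets is as a sequence of named timestamps whose gaps are the phases of interest (for example input, request, response, DOM update). The sketch below is only an illustration of that breakdown; the mark names are made up, and in a browser the timestamps would presumably come from User Timing marks.

```typescript
// Hypothetical mark shape; `time` is in ms on a single monotonic clock.
interface Mark {
  name: string;
  time: number;
}

// Returns the duration of each phase between consecutive marks,
// keyed as "start -> end". Marks may be passed in any order.
function phaseDurations(marks: Mark[]): Record<string, number> {
  const sorted = [...marks].sort((a, b) => a.time - b.time);
  const durations: Record<string, number> = {};
  let previous: Mark | undefined;
  for (const mark of sorted) {
    if (previous !== undefined) {
      durations[`${previous.name} -> ${mark.name}`] = mark.time - previous.time;
    }
    previous = mark;
  }
  return durations;
}
```

The same function would cover the baseline question if the marks were something like "baseline loaded", "first component rendered", and "interactive".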

I've added some measurement for search before in https://grafana.wikimedia.org/d/7zvDe0JZk/user-journey-drilldown?orgId=1 (switch scenario) and the documentation of what's measured is here: https://wikitech.wikimedia.org/wiki/Performance/User_Journey

@Peter, do you think the user journey adequately covers the baseline characterization? I am pinging my team to see who's interested and will reach out to you.

It's also relevant to add that Vector's "latest mode," which will eventually include the new Vue.js search experience, will be deployed not just for logged in users but anonymous users (T236176).

@Peter, do you think the user journey adequately covers the baseline characterization? I am pinging my team to see who's interested and will reach out to you.

Let us sync in the meeting. It's not much work to add more use cases, or to add "latest mode" to our synthetic testing when that's available. I can do that once we have synced, so that I focus on the right thing.

We should also see if we can add instrumentation to get more RUM metrics, and try to add that to the current site too, so we can compare. One option is to use the Element Timing API for the first rendered image in the search results. It will only work in Chromium-based browsers, but it would at least give us metrics for when things appear on the screen, and that would be cool (we can also pick up the same metric in synthetic tests). We can also try to find some good User Timings to get more metrics on how things work internally (like loading the baseline).
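For reference, a minimal sketch of the Element Timing idea follows. It assumes the first result image carries an `elementtiming` attribute; the identifier passed in is made up for illustration. The observer is looked up through `globalThis` and typed loosely because Element Timing types may not be present in the TypeScript DOM typings, and the function simply reports `false` where the entry type is unsupported.

```typescript
// Assumption: the first suggestion thumbnail is rendered as
// <img elementtiming="search-first-thumbnail" ...>; the name is hypothetical.
function observeElementTiming(
  identifier: string,
  onRender: (timeMs: number) => void
): boolean {
  const ObserverCtor = (globalThis as any).PerformanceObserver;
  const supported: readonly string[] =
    (ObserverCtor && ObserverCtor.supportedEntryTypes) || [];
  if (supported.indexOf("element") === -1) {
    return false; // not a Chromium-based browser (or not a browser at all)
  }
  const observer = new ObserverCtor((list: any) => {
    for (const entry of list.getEntries()) {
      if (entry.identifier === identifier) {
        // renderTime can be 0 for cross-origin images served without
        // Timing-Allow-Origin, so fall back to loadTime.
        onRender(entry.renderTime || entry.loadTime);
        observer.disconnect();
        return;
      }
    }
  });
  // buffered: true also delivers entries recorded before observing began.
  observer.observe({ type: "element", buffered: true });
  return true;
}
```

A caller would do something like `observeElementTiming("search-first-thumbnail", (t) => { /* log t */ })` and treat a `false` return as "no data for this browser".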

@Niedzielski does it mean that Vue.js + all the newly introduced JS will be downloaded for all article pageviews on the wikis where the experimental feature gets deployed?

If so, we will probably need to oversample our general performance measurements for those wikis ahead of time, in order to measure the impact of the extra payload when it gets deployed for the first time.
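As a sketch of what oversampling could mean in practice: keep a default sampling rate and raise it for the wikis receiving the feature. The rates and the override table below are placeholders chosen for illustration only, not actual configuration, and the database names are assumed to be the usual ones for the test wikis.

```typescript
// Placeholder values for illustration; real rates would be chosen
// together with the Performance team.
const DEFAULT_SAMPLING_RATE = 1 / 1000;
const OVERSAMPLED_WIKIS = new Map<string, number>([
  ["frwiki", 1 / 10],
  ["frwiktionary", 1 / 10],
  ["hewiki", 1 / 10],
  ["ptwikiversity", 1 / 10],
  ["euwiki", 1 / 10],
  ["fawiki", 1 / 10],
]);

// Returns the sampling rate for a wiki, falling back to the default.
function samplingRateFor(wiki: string): number {
  const override = OVERSAMPLED_WIKIS.get(wiki);
  return override !== undefined ? override : DEFAULT_SAMPLING_RATE;
}

// Decides whether a given pageview should report performance metrics.
// `random` is injectable so the decision can be tested deterministically.
function isSampled(wiki: string, random: () => number = Math.random): boolean {
  return random() < samplingRateFor(wiki);
}
```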

@Peter, sounds good

@Gilles, yes, assuming that everything goes as planned, the new search feature will be deployed to at least all main namespace pages (/cc @ovasileva). We could do one test wiki* at a time but I'm guessing that latest mode will be the default for anonymous users by the time the new search experience is ready to deploy. This means that search would go from the old experience to the new experience for everyone except logged in opt-outs. Do you think that would be an issue? We might have some flexibility to limit the deployment to logged in users only initially if necessary but we would need to know to build that infrastructure ahead of time.

*Reminder: the test wikis are currently French Wikipedia and Wiktionary, Hebrew Wikipedia, Portuguese Wikiversity, Basque Wikipedia, and Persian Wikipedia.

Not an issue, but I would advise testing the extra payload being loaded without it being used first, just to measure the impact of the extra weight on its own, both on performance metrics and on engagement.

Jdlrobson triaged this task as Medium priority. Jan 13 2021, 6:32 PM
Jdlrobson subscribed.

Blocked for now on T249826, so pulling out of board for now. Added sign off step on that task to make sure this doesn't get lost.

Jdlrobson added a subscriber: nray.

@nray this is done if I'm not mistaken with the dashboard you created... can you confirm? (Also welcome back! :))

It's not clear to me whether this task is focused on real user monitoring (RUM), but synthetic tests to measure search performance were added to the dashboard as part of T251544. I will close this ticket under the assumption that a more focused ticket for RUM can be created in the future if desired.