Create a dashboard of key user-centric metrics
Open, MediumPublic

Description

As mentioned by @ori during our last meeting, we should come up with the main metrics we aim to move the needle on as a team, and decide what form the dashboard should take.

Some ideas we might want to explore further. This is my take on it, and it's open to further debate:

  • Maybe all our core performance metrics should be expressed in terms of end-user experience, i.e. "as a reader, how long does it take for me to be able to view an article?". This maps to firstPaint pretty much 1-to-1. If we go down this road, we should always remember to start the thinking from the UX, and not from the technical data; the vast amount of data we have can trick us into tracking the wrong thing. Another UX metric idea that came up was "as an editor, how long does it take for my contribution to be visible to everyone?". We should probably brainstorm more of those in this task.

Ideas:

Related Objects

Status      Assigned
Open        None
Resolved    Peter
Resolved    Peter
Resolved    Krinkle
Resolved    Krinkle
Resolved    Krinkle
Resolved    Gilles
Resolved    Krinkle
Duplicate   Peter
Resolved    Krinkle
Resolved    Peter
Open        None
Open        None
Declined    ori
Declined    ori
Declined    None
Resolved    Peter
Declined    None
Resolved    Peter
Resolved    Peter
Resolved    Peter
Resolved    Peter
Resolved    Peter
Resolved    Peter

Event Timeline

Gilles raised the priority of this task from to Needs Triage.
Gilles updated the task description. (Show Details)
Gilles added subscribers: Gilles, ori.
Gilles removed a project: Performance Issue.
Gilles set Security to None.
Gilles updated the task description. (Show Details)

Here's a first stab at this.

Time it takes to see an article's lead section

Pretty straightforward, this is firstPaint in practice. Right now the biggest clients (desktop and mobile web) load the whole article, so "seeing the article's lead section" equals "seeing the article's text" and that's ok for the purpose of tracking this metric long term. As we convert clients to only loading the lead section, this metric should be dramatically improved. Filter: HTTPS, US.
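For reference, a minimal sketch of how a client-side beacon could read such a first-paint value, assuming a browser that exposes the Paint Timing API; the beacon endpoint name is purely illustrative and not existing instrumentation:

```
// Minimal sketch: read a first-paint timestamp from the browser, if available.
function getFirstPaintMs(): number | undefined {
  // Paint Timing API (available in most modern browsers)
  const paintEntries = performance.getEntriesByType('paint') as PerformanceEntry[];
  const fp = paintEntries.find((e) => e.name === 'first-paint');
  return fp ? fp.startTime : undefined; // ms since navigation start
}

// Example: send the value to a (hypothetical) beacon endpoint for the dashboard.
const firstPaint = getFirstPaintMs();
if (firstPaint !== undefined) {
  navigator.sendBeacon('/beacon/first-paint', JSON.stringify({ firstPaint }));
}
```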

Time it takes to see an article, media included

I think that this one is interesting because it encompasses a lot of moving parts, including our media thumbnailing. Filter: HTTPS, US.
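One rough way to approximate "media included" on the client would be the Resource Timing API, taking the latest responseEnd among image resources. This is only a sketch under that assumption; it doesn't distinguish above-the-fold from below-the-fold images or handle lazy loading:

```
// Sketch: approximate "article visible, media included" as the moment the last
// image resource finished downloading, using the Resource Timing API.
function getMediaCompleteMs(): number {
  const resources = performance.getEntriesByType('resource') as PerformanceResourceTiming[];
  const imageEnds = resources
    .filter((r) => r.initiatorType === 'img')
    .map((r) => r.responseEnd);

  // If there are no images, fall back to the end of the load event.
  const nav = performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming | undefined;
  const loadEnd = nav ? nav.loadEventEnd : 0;

  return imageEnds.length ? Math.max(...imageEnds) : loadEnd;
}
```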

Time it takes for an edit to be seen by everyone

Filter: HTTPS.

How fast is the wiki compared to the local internet connection's capabilities?

As a first attempt we could use the Catchpoint probes for this, with high-traffic local websites as the baseline. Beyond that, if we want to generate this figure for real users, we would need to start hosting pixels in the areas we want to track, which would be used by JS to benchmark bandwidth and latency. The holy grail is a map showing us where we're doing well and where we aren't. A map of absolute performance built from the NavigationTiming data is not that, because it doesn't tell us how good or bad the internet is at those places in general. What matters is making the best of the available pipe; we can't help it if the pipe is inherently bad. Filter: HTTPS.
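A back-of-the-envelope sketch of the pixel idea: fetch an object of known size from a region-local host and derive latency and throughput from the timings. The URL and payload size below are placeholders, and a real probe (as Boomerang does) would use several object sizes and repeated samples:

```
// Sketch: estimate latency and bandwidth by timing the download of a known-size
// object. The URL is a placeholder; a production probe would repeat the
// measurement and use multiple object sizes.
async function probeConnection(url: string, payloadBytes: number) {
  const start = performance.now();
  const response = await fetch(`${url}?cachebust=${Math.random()}`, { cache: 'no-store' });
  const firstByte = performance.now();   // rough proxy for latency (headers received)
  await response.arrayBuffer();          // force the full body download
  const end = performance.now();

  const latencyMs = firstByte - start;
  const transferMs = Math.max(end - firstByte, 1);
  const bandwidthKbps = (payloadBytes * 8) / transferMs; // bits per ms == kbit/s

  return { latencyMs, bandwidthKbps };
}

// Usage (placeholder URL and size):
// probeConnection('https://probe.example.org/100kb.bin', 100 * 1024).then(console.log);
```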

I didn't include an HTTP vs HTTPS metric, for the following reasons:

  • The average user doesn't understand the difference. Whether or not our HTTPS performance is good is a concern for Ops, not a concern for the end user.
  • Sooner or later we're going to serve all our content securely; I think it's an inevitable evolution of the web, which makes this concern a short-term one.
  • We've always considered the slowness of secure browsing a tax, but HTTP and TCP/IP themselves also had their inefficiencies, and now that SPDY and HTTP 2.0 are around the corner, we remember that. My point is that if insecure browsing had never existed and encryption had always been part of the web, I don't think anyone would have suggested inventing an insecure way of serving things faster.
  • Finally, since I've filtered performance by HTTPS in the metrics above, HTTPS performance is one of the many things where we can move the needle on those user-centric metrics. I.e. make HTTPS faster and "Time it takes to see an article's lead section" improves.

I also didn't include things that are product-specific, because those tend to turn into success metrics rather than performance metrics. "Time to first edit" for example, probably boils down to 90% product. VE needs to track that as one of its success metrics.

An idea from GitHub's engineering blog: use a stacked graph to represent the navigation timing metrics.

http://githubengineering.com/browser-monitoring-for-github-com/
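The stacked-graph idea boils down to splitting each navigation into consecutive phases and plotting each phase as one band. A sketch of one possible phase breakdown, using the standard PerformanceNavigationTiming fields (the grouping itself is just one choice, not the one from GitHub's post):

```
// Sketch: split a navigation into consecutive phases, suitable for a stacked graph.
function navigationPhases() {
  const nav = performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming;
  return {
    dns: nav.domainLookupEnd - nav.domainLookupStart,
    connect: nav.connectEnd - nav.connectStart,
    request: nav.responseStart - nav.requestStart,    // time to first byte
    response: nav.responseEnd - nav.responseStart,
    domProcessing: nav.domComplete - nav.responseEnd,
    load: nav.loadEventEnd - nav.loadEventStart,
  };
}
```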

Another idea for a metric: time-to-revert. How long vandalism is live on the site.
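A minimal sketch of how time-to-revert could be computed, assuming we have revision records that carry a timestamp and a pointer to the revision they revert (the data shape here is an assumption for illustration; the real data would come from the MediaWiki revision history):

```
// Sketch: compute time-to-revert values from revision history.
interface Revision {
  id: number;
  timestampMs: number;
  revertsRevisionId?: number; // assumed field: set when this revision reverts another
}

function timeToRevertMs(revisions: Revision[]): number[] {
  const byId = new Map(revisions.map((r) => [r.id, r]));
  return revisions
    .filter((r) => r.revertsRevisionId !== undefined)
    .map((r) => {
      const reverted = byId.get(r.revertsRevisionId!);
      return reverted ? r.timestampMs - reverted.timestampMs : NaN;
    })
    .filter((ms) => !Number.isNaN(ms) && ms >= 0);
}
```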

Change 214570 had a related patch set uploaded (by Ori.livneh):
Report time to first edit as ttfe

https://gerrit.wikimedia.org/r/214570

Change 214570 merged by jenkins-bot:
Report time to first edit as ttfe

https://gerrit.wikimedia.org/r/214570

Change 214662 had a related patch set uploaded (by Ori.livneh):
Report time to first edit as ttfe

https://gerrit.wikimedia.org/r/214662

Change 214663 had a related patch set uploaded (by Ori.livneh):
Report time to first edit as ttfe

https://gerrit.wikimedia.org/r/214663

Change 214662 merged by jenkins-bot:
Report time to first edit as ttfe

https://gerrit.wikimedia.org/r/214662

Change 214663 merged by jenkins-bot:
Report time to first edit as ttfe

https://gerrit.wikimedia.org/r/214663

Time it takes to see an article, media included
I think that would be really cool. A first step could be to add a User Timing mark on the load event of the first images (do we have a way today to tell images on a page apart, i.e. do we know which ones are displayed earliest / at the top of the page?).
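One way such a mark could look, purely as a sketch; the selector is illustrative, and identifying "the first/topmost image" is exactly the open question above:

```
// Sketch: record a User Timing mark when the first content image has loaded.
const firstImage = document.querySelector<HTMLImageElement>('.mw-parser-output img');
if (firstImage) {
  if (firstImage.complete) {
    performance.mark('first-image-loaded');
  } else {
    firstImage.addEventListener('load', () => performance.mark('first-image-loaded'), { once: true });
  }
}
```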

How fast is the wiki compared to the local internet connection's capabilities
I wonder about the value of this, what values we'd actually get, and how we can act on it? I'm probably missing the background :) I think comparing ourselves with others at different locations won't give us much extra; the numbers we get from these companies (if we are using Catchpoint or others) are less reliable (they test too few times, etc.), but I'm not 100% sure what we have access to, so show me :) What is good, however, is accessing the site from different locations to hit different data centers/PoPs. Maybe we can use our WebPageTest instances for that (the different locations provided by AWS should be quite OK for us). But that should maybe be another task.

Oh, forgot one thing: about measuring bandwidth and latency, we should use the opportunity to talk to Philip Tellis at Velocity. He's the author of https://github.com/lognormal/boomerang/ and he's a really nice guy, and he could tell us more about how they do it in Boomerang and what kinds of things we should watch out for and think extra about.

I actually saw Philip present exactly that information at a small conference I attended in 2012. He went through the mechanics of Boomerang and what to watch out for. I believe those are the slides he used: http://www.slideshare.net/bluesmoon/messing-with-javascript-and-the-dom-to-measure-network-characteristics

But we could just use boomerang, honestly.

Cool, yep, I agree on using Boomerang. I think it would also be good when we take the next step and start using the Resource Timing API; if we use the Boomerang plugin, we don't need to actively take care of the glitches that happen in different browsers.

Krinkle updated the task description. (Show Details)
Krinkle added a subscriber: Krinkle.
aaron renamed this task from Performance key metrics dashboard(s) to Create a dashboard of key user-centric performance metrics. Sep 13 2017, 12:25 PM
aaron renamed this task from Create a dashboard of key user-centric performance metrics to Create a dashboard of key user-centric metrics. Jun 6 2019, 10:51 AM
aaron added a project: Product-Analytics.
aaron edited projects, added Performance-Team (Radar); removed Performance-Team.

@Krinkle, @aaron, are you requesting something from Product-Analytics here? We're assuming not, so please let us know if we're missing something.

@Neil_P._Quinn_WMF There are ideas in this task that the Performance Team was thinking would be useful to track and correlate with web performance metrics. For example, correlating web perf metrics like "how fast our login page loads" or "how much data is consumed when reading articles" with quality metrics for products such as "proportion of edits that aren't reverted within a month", "percentage of editors that remain active", or "time between account creation and first edit".

However, we found that most of these metrics are not currently tracked publicly, and we considered it out of scope for us to introduce instrumentation for them. If this is something of interest, we'd love to know. You could re-purpose this task for that, or we can close it and continue the conversation by other means.