
A Large-scale Study of Wikipedia Users' Quality of Experience: data release
Open, NormalPublic

Description

This is a task to coordinate the release of a subset of real user performance data collected during the first round of research for this project: https://meta.wikimedia.org/wiki/Research:Study_of_performance_perception_on_Wikimedia_projects. This research led to a short paper entitled "A Large-scale Study of Wikipedia Users' Quality of Experience", due to be presented and published at The Web Conference 2019.

I can share the camera-ready version of the paper privately with anyone at Wikimedia who might be interested before its publication, as it might help understand why this specific chunk of data is being requested for publication.

I expect that Analytics, Legal and Security will want to review this dataset. Feel free to create dedicated subtasks for each team.

Timespan

2018-05-24 12:55:12 -> 2018-10-15 11:59:52

Wikis

Data was collected on cawiki, frwiki, enwikivoyage and ruwiki. We need at the very least data for ruwiki.

Data fields

The following have all been collected client-side, via the NavigationTiming extension:

  1. wiki Which wiki the request was on (ruwiki, cawiki, eswiki, frwiki or enwikivoyage)
  2. time Timestamp, which can be rounded to the minute or the hour if needed. We don't need second-level accuracy at all, but the timestamp is useful in the study to demonstrate the lack of temporal correlation (time of day, day of week, day of month). Since we don't need the timestamp to be the real one to prove the lack of temporal correlation, the timestamp values should be shifted by an arbitrary value for the entire dataset (see the sketch after this list).
  3. unload [1] The time spent on unload (unloadEventEnd - unloadEventStart).
  4. redirecting [1] Time spent following redirects.
  5. fetchStart [1] The time immediately before the user agent starts checking any relevant application caches.
  6. dnsLookup [1] Time it took to resolve names (domainLookupEnd - domainLookupStart).
  7. secureConnectionStart [1] The time immediately before the user agent starts the handshake process to secure the current connection.
  8. connectStart [1] The time immediately before the user agent starts establishing the connection to the server to retrieve the document.
  9. connectEnd [1] The time immediately after the user agent finishes establishing the connection to the server to retrieve the current document.
  10. requestStart [1] The time immediately before the user agent starts requesting the current document from the server, or from relevant application caches or from local resources.
  11. responseStart [1] The time immediately after the user agent receives the first byte of the response from the server, or from relevant application caches or from local resources.
  12. responseEnd [1] The time immediately after the user agent receives the last byte of the current document or immediately before the transport connection is closed, whichever comes first.
  13. loadEventStart [1] The time immediately before the load event of the current document is fired.
  14. loadEventEnd [1] The time when the load event of the current document is completed.
  15. mediawikiLoadEnd Mediawiki-specific. The time at which all ResourceLoader modules for this page have completed loading and executing.
  16. domComplete [1] The time immediately before the user agent sets the current document readiness to "complete".
  17. domInteractive [1] The time immediately before the user agent sets the current document readiness to "interactive".
  18. gaps [1] The gaps in the Navigation Timing metrics, calculated as the sum of: domainLookupStart - fetchStart, connectStart - domainLookupEnd, requestStart - connectEnd, and loadEventStart - domComplete (see the sketch after this list).
  19. firstPaint [2] The time when something is first displayed on the screen.
  20. rsi [3] RUMSpeedIndex. An estimate of the SpeedIndex value based on ResourceTiming data. It has since moved to the RUMSpeedIndex EventLogging schema, but was collected as part of the NavigationTiming schema at the time of the study.
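
To make the gaps computation and the proposed timestamp shift concrete, here is a minimal Python sketch. It is illustrative only, not the production pipeline; the field names mirror the Navigation Timing API, and SHIFT_SECONDS is a hypothetical offset that would be chosen once, kept private, and applied uniformly to the whole dataset.

```
from datetime import datetime, timedelta

# Hypothetical offset: chosen once, kept private, applied to every row.
SHIFT_SECONDS = 123_456

def gaps(nt: dict) -> float:
    """Sum of the spans between the main Navigation Timing phases that
    the individual metrics above do not cover."""
    return (
        (nt["domainLookupStart"] - nt["fetchStart"])
        + (nt["connectStart"] - nt["domainLookupEnd"])
        + (nt["requestStart"] - nt["connectEnd"])
        + (nt["loadEventStart"] - nt["domComplete"])
    )

def shift_timestamp(ts: datetime) -> datetime:
    """Shift every timestamp by the same constant. Relative intervals
    (and therefore periodic patterns such as time of day or day of week,
    up to a fixed phase) survive, but rows can no longer be joined
    against public datasets keyed on real time."""
    return ts + timedelta(seconds=SHIFT_SECONDS)
```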

And the following metrics, which are derivatives of NavigationTiming metrics, designed to preserve privacy:

  1. speed_quantized The page download speed, evaluated as (transferSize * 8) / (loadEventStart - fetchStart) and quantized into these bins: [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 20000]. The sensitive metric here is transferSize [1], the size of the gzipped HTML of the article. See the sketch after this list.
  2. speed_over_median_per_country The page download speed (evaluated as above) normalized over the median per-country speed observed in the dataset.
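
A minimal sketch of how these two derivatives could be computed. It assumes transferSize is in bytes and the timing marks in milliseconds (so the raw speed comes out in kilobits per second); the bin edges are the ones listed above, and all helper names are illustrative, not taken from the actual pipeline.

```
import bisect
from statistics import median

# Bin edges from the task description.
BINS = [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 20000]

def raw_speed(transfer_size: int, load_event_start: float, fetch_start: float) -> float:
    """Page download speed: bits transferred over total load time."""
    return (transfer_size * 8) / (load_event_start - fetch_start)

def speed_quantized(speed: float) -> int:
    """Report only the lower edge of the bin the speed falls into, so the
    sensitive transferSize value cannot be recovered from the release."""
    i = bisect.bisect_right(BINS, speed) - 1
    return BINS[max(min(i, len(BINS) - 1), 0)]

def speed_over_median_per_country(speed: float, country_speeds: list[float]) -> float:
    """Normalize a row's speed by the median speed observed for its country."""
    return speed / median(country_speeds)
```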

Finally, the response users gave to the perception survey:

  1. surveyResponseValue Can be "yes", "no" or "not sure". The question asked was "Did this page load fast enough?".

[1] metrics coming from the browsers' implementation of the NavigationTiming API (level 1 and level 2).
[2] firstPaint comes from the Paint Timing API or vendor-specific implementations predating the standards.
[3] RUMSpeedIndex is a compound metric combining several NavigationTiming and ResourceTiming (level 1 and level 2) metrics into a single score. It's a 3rd-party FLOSS library found here: https://github.com/WPO-Foundation/RUM-SpeedIndex

EventLogging schemas these fields come from:

Event Timeline

Gilles removed Gilles as the assignee of this task. Feb 28 2019, 11:02 AM
Gilles created this task.
Gilles updated the task description. (Show Details) Feb 28 2019, 3:07 PM
Gilles triaged this task as Normal priority. Feb 28 2019, 3:24 PM
Milimetric raised the priority of this task from Normal to High. Feb 28 2019, 5:51 PM
Milimetric lowered the priority of this task from High to Normal.
Milimetric moved this task from Incoming to Data Quality on the Analytics board.
Milimetric added a subscriber: Milimetric.
Gilles added a subscriber: JBennett. Mar 5 2019, 8:49 AM

@JBennett @JFishback_WMF could we get an update on when this might get looked at?

leila added a comment. Jul 11 2019, 3:55 PM

@JBennett and @JFishback_WMF can you please assign this task to someone on your end so we can make sure it has an owner and will be processed? Also, if you could provide a sense of the timeline for getting back to us, that'd be great.

@leila and @Gilles I'll work on this. I'll get started on it as soon as I can, but is there a particular timeline we're tracking to?

@JFishback_WMF Gilles can speak to the timelines better. From my perspective, the sooner the better so it doesn't fall in the backlog of things to do. :)

Well... the researchers ended up revising their journal submission to reflect the fact that they couldn't release the dataset at this time, but I think they're still very eager to do so. It would be nice to be able to do this before the end of the calendar year.

Nuria added a comment. Aug 20 2019, 3:26 PM

Let's see: this dataset has no page info and no timestamps, is that correct?

My bad. We probably want timestamp as well, but it can be very coarse (rounded to the hour is fine), as well as which wiki we're dealing with (since the study ran on multiple wikis). I'll add that to the task description.

Gilles updated the task description. (Show Details) Aug 20 2019, 3:40 PM
Nuria added a comment. Aug 20 2019, 3:49 PM

@Gilles it would be good to shift the timestamps so this data cannot be linked (or at least not obviously linked) with any existing data (say, pageviews per wiki per hour, which are released hourly). This is a technique we have used in prior data releases. It's also worth considering that if the wiki field does not add much to the dataset (that is, if perception of performance does not depend on the wiki, per your study), it might be better to remove it to make the dataset more opaque.

Sure, we can shift the timestamps by an arbitrary amount. It would still prove lack of temporal correlation.

The satisfaction ratios per wiki are a bit different: https://grafana.wikimedia.org/d/000000551/performance-perception-survey?orgId=1 We had the same findings when looking at each wiki separately, but we can't really mix data between different wikis. Another possibility is to only keep the ruwiki data, which had by far the largest traffic during the study period.

Nuria added a comment. Aug 20 2019, 5:22 PM

> Another possibility is to only keep the ruwiki data, which had by far the largest traffic during the study period.

Sounds good. If you modify the ticket's description of the fields, maybe we can take a look together with @JFishback_WMF later this month?

Nuria added a comment. Aug 20 2019, 5:23 PM

FYI, released one-off datasets get documented on Meta; see for example https://meta.wikimedia.org/wiki/Research:Wikipedia_clickstream

Gilles updated the task description. (Show Details) Aug 20 2019, 5:24 PM