
Retrieve metric "Median time of loading the Wikidata item and property data from Wikidata database"
Closed, ResolvedPublic

Description

Definition:

  • We want to find out how long (median) it takes for a Wikidata item or property to be served to the user, i.e. the time from the user's request until the response is sent to the user, regardless of the device the user is on at the moment.
  • This metric will be "minus cache": we do not want to measure how fast our cache is, but rather how long it takes to load Wikidata items and properties when no cache is involved.

Purpose:

  • Set a baseline of the current state
  • Detect long loading times, then investigate where the issue comes from, e.g. a bad query, a large item, overly complex code, too many redirects... It will also make it easy to catch a newly released feature that has an impact on this metric.
  • If a change in the trend can easily be traced back to a new release, finding the root cause becomes easier because the scope is reduced.
  • It will also give our Product team more data that may be relevant for their short-, mid- and long-term strategy and planning.

Notes:

  • We could already distinguish whether the item is requested by and served to a mobile device or a desktop browser. The team working on this will estimate the added effort that such a distinction would involve. With that information, it will be decided whether the MVP reports a single value or not.
  • So far we have only considered successful responses, assuming that error cases will be covered by another metric (T274420). This will be discussed with the team in case that assumption is false.

AC:

  • The metric is retrieved
  • The metric can be seen on the dashboard

Event Timeline

darthmon_wmde renamed this task from Retrieve metric "Median payload weight (I/O)" to Retrieve metric "Median time of loading the Wikidata item and property data from Wikidata database". Feb 10 2021, 7:35 PM
darthmon_wmde updated the task description.
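-- Median time to first byte for uncached, successful item and property page views
SELECT
  percentile_approx(time_firstbyte, 0.5) as median_time
FROM
  wmf.webrequest
WHERE
  uri_host = 'www.wikidata.org'
  -- namespace 0 holds items, namespace 120 holds properties
  AND namespace_id IN (0,120)
  -- successful responses only
  AND http_status = 200
  AND is_pageview = TRUE
  -- "minus cache": keep only requests that missed the cache
  AND cache_status = 'miss'
  -- one hour of data as a sample
  AND year = 2021
  AND month = 5
  AND day = 3
  AND hour = 13;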
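
For checking the logic offline, the filter in the query above can be mirrored on a handful of rows in Python. This is only a minimal sketch under stated assumptions: the sample rows are made up, the helper names are hypothetical, and the `access_method` field is assumed here to illustrate the desktop/mobile split mentioned in the notes. It also uses an exact median, whereas `percentile_approx` returns an approximation.

```python
from collections import defaultdict
from statistics import median

# Made-up sample rows. Field names mirror the columns used in the query;
# access_method is an assumption, added to illustrate the device split.
SAMPLE_REQUESTS = [
    {"namespace_id": 0, "http_status": 200, "cache_status": "miss",
     "access_method": "desktop", "time_firstbyte": 0.25},
    {"namespace_id": 0, "http_status": 200, "cache_status": "miss",
     "access_method": "desktop", "time_firstbyte": 0.5},
    {"namespace_id": 120, "http_status": 200, "cache_status": "miss",
     "access_method": "desktop", "time_firstbyte": 1.0},
    {"namespace_id": 0, "http_status": 200, "cache_status": "miss",
     "access_method": "mobile web", "time_firstbyte": 0.5},
    {"namespace_id": 120, "http_status": 200, "cache_status": "miss",
     "access_method": "mobile web", "time_firstbyte": 1.0},
    # The rows below must be excluded by the filter.
    {"namespace_id": 0, "http_status": 200, "cache_status": "hit",
     "access_method": "desktop", "time_firstbyte": 0.01},
    {"namespace_id": 0, "http_status": 404, "cache_status": "miss",
     "access_method": "desktop", "time_firstbyte": 2.0},
    {"namespace_id": 1, "http_status": 200, "cache_status": "miss",
     "access_method": "mobile web", "time_firstbyte": 3.0},
]

# Namespace 0 holds items, namespace 120 holds properties.
ENTITY_NAMESPACES = {0, 120}


def is_uncached_entity_view(row):
    """Same conditions as the query: entity namespace, HTTP 200, cache miss."""
    return (
        row["namespace_id"] in ENTITY_NAMESPACES
        and row["http_status"] == 200
        and row["cache_status"] == "miss"
    )


def median_time(rows):
    """Exact median of time_firstbyte over matching rows, or None if none match."""
    times = [r["time_firstbyte"] for r in rows if is_uncached_entity_view(r)]
    return median(times) if times else None


def median_time_by_access_method(rows):
    """Median per access method, i.e. one value per device class."""
    groups = defaultdict(list)
    for r in rows:
        if is_uncached_entity_view(r):
            groups[r["access_method"]].append(r["time_firstbyte"])
    return {method: median(times) for method, times in groups.items()}


if __name__ == "__main__":
    print(median_time(SAMPLE_REQUESTS))
    print(median_time_by_access_method(SAMPLE_REQUESTS))
```

Reporting a single value corresponds to `median_time`; the per-device variant only adds a grouping step, which gives a rough sense of the extra effort the split would involve.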

Change 686066 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[analytics/refinery/source@master] Add scala job for reliability metrics of Wikidata

https://gerrit.wikimedia.org/r/686066

Change 686383 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[analytics/refinery@master] oozie: Add oozie job for gather wikidata reliability metrics

https://gerrit.wikimedia.org/r/686383

Change 686066 merged by jenkins-bot:

[analytics/refinery/source@master] Add scala job for reliability metrics of Wikidata

https://gerrit.wikimedia.org/r/686066

Change 686383 merged by Mforns:

[analytics/refinery@master] oozie: Add oozie job for gather wikidata reliability metrics

https://gerrit.wikimedia.org/r/686383