Ish? Until this is done, we're limited to using Ubuntu for the VMs that host our dashboards. Since the WM Cloud team (formerly WM Labs) is deprecating Ubuntu Trusty in favor of only offering Debian for VMs, we'll have to file a Phab ticket requesting a Trusty instance if we have to shut one down and launch a replacement. I don't think this task should be declined, but I am gonna adjust the priority to reflect where we are on this.
Tue, Oct 17
Fri, Oct 13
@chelsyx do you wanna add your stuff to https://github.com/wikimedia-research/SDoC-Initial-Metrics ?
Queries & data uploaded to https://github.com/wikimedia-research/SDoC-Initial-Metrics
Growth of number of deleters over time:
Total files uploaded to Commons (as of right now) by extension:
Thu, Oct 12
Wed, Oct 11
- Most copyright-related deletions happen within 1 day of upload across almost all media types, with the exception of 'drawing' (SVGs)
- A lot of audio files are deleted within 1 minute or 1 week of upload
- Of the images and PDFs deleted for non-copyright reasons, half were deleted within 1 month of upload
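For anyone who wants to reproduce the time-to-deletion buckets, here's a rough sketch of the bucketing step. This is not the actual query: the cutoffs (1 minute, 1 day, 1 week, 30 days for "1 month") and the name `time_to_deletion_bucket` are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical cutoffs, smallest first; the real analysis may have used different ones.
BUCKETS = [
    (timedelta(minutes=1), "within 1 minute"),
    (timedelta(days=1), "within 1 day"),
    (timedelta(weeks=1), "within 1 week"),
    (timedelta(days=30), "within 1 month"),
]


def time_to_deletion_bucket(uploaded: datetime, deleted: datetime) -> str:
    """Return the label of the smallest bucket containing the upload-to-deletion gap."""
    elapsed = deleted - uploaded
    if elapsed < timedelta(0):
        raise ValueError("deletion timestamp precedes upload timestamp")
    for cutoff, label in BUCKETS:
        if elapsed <= cutoff:
            return label
    return "more than 1 month"
```

Grouping the resulting labels by media type and deletion reason would then give per-bucket counts.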
Reasons for files deleted in 2017:
Tue, Oct 10
Fri, Oct 6
@chelsyx: thanks and good job!
Thu, Oct 5
Tue, Oct 3
It would depend on how often things below the top 20 move into the top 20 in practice, not just in theory. We can use the search logs to find this out, no?
Mon, Oct 2
Thu, Sep 28
Using just the event logging data from 2017-08-01 to today (2017-09-28), here's a glimpse at queries from abandoned searches:
Bootstrapping finally finished -_- second draft up at https://people.wikimedia.org/~bearloga/reports/ltr-test.html
Wed, Sep 27
@mforns: we specify the analytics-store hostname in our R package (the function that makes SQL queries: https://github.com/wikimedia/wikimedia-discovery-wmf/blob/master/R/mysql.R#L39--L76), which is used for querying both the wiki content DBs and the log DB. If we add a type argument that sets the hostname ("db1047" in case of type == "events", for example), what hostname should we use for non-eventlogging queries?
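To make the proposal concrete, here's a rough sketch of the host selection, written in Python rather than R purely for illustration. `resolve_host` and the `DEFAULT_HOST` value are hypothetical placeholders; only "db1047" for events comes from the comment above, and the host for non-eventlogging queries is still the open question.

```python
EVENTS_HOST = "db1047"            # from the comment above
DEFAULT_HOST = "analytics-store"  # placeholder for whatever the package uses today


def resolve_host(query_type: str = "content", default_host: str = DEFAULT_HOST) -> str:
    """Pick a database host based on the kind of query being made."""
    if query_type == "events":
        return EVENTS_HOST
    if query_type == "content":
        # Open question: this may need to change once the answer is known.
        return default_host
    raise ValueError(f"unknown query type: {query_type!r}")
```

Keeping the default as a parameter means the non-eventlogging host can be swapped later without touching callers.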
Deployed at https://discovery.wmflabs.org/forecasts/
Tue, Sep 26
Mon, Sep 25
Sep 22 2017
Deployed to prod. Good job, @chelsyx!
Sep 21 2017
Result of running T170022#3611637:
Sep 20 2017
Still need to add the logic that auto-selects "(None)" in the languages list if the user selects "Commons" in the projects list, but here's what I have so far:
Sep 18 2017
Sep 16 2017
R script & Hive query that finds static map thumbnail requests and then uses those to find the pages that have a mapframe and how many pageviews those pages have and the total pageviews the respective project has:
Sep 15 2017
Haven't seen esclicks yet but the hover-on/offs appear to be working along with the rest of the test:
Sep 14 2017
I don't remember any other changes we wanted to make and since it's deployed to production (https://discovery.wmflabs.org/metrics/#spr_surv), I'm moving this ticket to Done.
First draft up at https://people.wikimedia.org/~bearloga/reports/ltr-test.html
Update: I fixed the query for prevalence stats in https://people.wikimedia.org/~bearloga/reports/maps-usage.html -- specifically, I am now counting only pages that are articles and are not redirects. I also added a "% of sessions that activated mapframe" stat to https://people.wikimedia.org/~bearloga/reports/maps-interactions.html
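A minimal sketch of the article-only filter, assuming "article" means the main namespace (0) and using the MediaWiki `page` table column names `page_namespace` and `page_is_redirect`. The helper names are hypothetical and the real query may define articles differently.

```python
def is_countable_article(page: dict) -> bool:
    """True for main-namespace pages that are not redirects."""
    return page["page_namespace"] == 0 and page["page_is_redirect"] == 0


def count_articles(pages: list) -> int:
    """Count pages that pass the article/non-redirect filter."""
    return sum(1 for page in pages if is_countable_article(page))
```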
Sep 7 2017
Aug 31 2017
Aug 30 2017
Good work, @chelsyx! Minor changes here and there: https://github.com/wikimedia-research/Discovery-Search-Test-Swap2and3/pull/1
- Reads really nice; super easy to follow along
- "not comparing apple to apple" => "not comparing apples to apples"
- "According to our eventlogging schema" => "According to our EL schema" (since you already introduced the term above)
Aug 29 2017
Ah, got it. Cool, that makes things easier! Yeah, adding some kind of a marker to the extra field somewhere in https://github.com/wikimedia/mediawiki-extensions-WikimediaEvents/blob/master/modules/ext.wikimediaEvents.kartographer.js#L141--L232 would be the way to go.
Backfilling data from 2017-04-01 through 2017-08-28. Adding that data to the dashboard should be relatively straightforward.
@TheDJ Event logging would be quite an undertaking, but we can start with tracking tiles requested specifically by the gadget (as opposed to mobile web in general). When the gadget (or Leaflet, maybe?) makes the API calls to Kartotherian for the tiles, do you use a custom user agent or are you able to specify one? If you can specify a custom user agent, then on our side we can look for that UA specifically when we count tiles served. A good UA would include the name of the gadget & a URL, or your name & contact info, for example.
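Roughly what the counting side could look like, as a sketch only: the UA string, the `user_agent` field name, and `count_tiles_for_agent` are made up for illustration and are not how our tile counts are actually computed.

```python
# Hypothetical UA following the "gadget name & URL / contact info" suggestion.
EXAMPLE_GADGET_UA = "ExampleMapsGadget/1.0 (https://example.org/gadget; maintainer@example.org)"


def count_tiles_for_agent(requests: list, token: str) -> int:
    """Count tile requests whose user agent contains `token` (case-insensitive)."""
    needle = token.lower()
    return sum(
        1
        for request in requests
        if needle in (request.get("user_agent") or "").lower()
    )
```

Matching on a distinctive token such as the gadget name, rather than the full string, keeps the count stable if the version number in the UA changes.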
@chelsyx there should also be a tab that shows the total usage (across all APIs) broken down by referrer with the option to switch between raw counts and %s
Aug 28 2017
Final draft up at https://wikimedia-research.github.io/Discovery-Search-Adhoc-SurveyMVP/
Marking search_api_usage for a recount and then recounting using the new UDF so we have referrer breakdown for the past 60 days:
Aug 24 2017
I just checked the refinery commits log and the UDF is available now :) "Add refinery-source jars for v0.0.51 to artifacts" https://github.com/wikimedia/analytics-refinery/commit/712bf13a8689fda40530c072384d355b1dd694d5
P.S. I should also add that we currently have several teams without performance metrics from the past 10 days (and counting), so getting this done is pretty important, hence the high priority. On 13 August 2017 I asked Guillaume to fix the permissions on the datasets so that I could run golden/main.sh as myself, just to backfill metrics that had been missing since 23 July 2017. We can go back to that running-under-staff-account solution, but that's just not sustainable (as discussed at length in T129260), so the switch to a non-person executing these scripts has to be done anyway.