Wed, Aug 16
Mon, Aug 14
Results from the first test:
Code for estimating and visualizing dwell-time on full-text search results pages arrived at from autocomplete search: https://github.com/wikimedia-research/Discovery-Search-Adhoc-SRPDwellTime
Thanks, good job!
Yep, it looks like the patch worked! :D
Sun, Aug 13
Wed, Aug 9
Mon, Aug 7
Status update: JK will look into giving Chelsy and/or me some kind of access so we can look into it. I have previous experience with Google APIs and with building R bindings to web APIs, so that will be helpful here :D
Sat, Aug 5
@EBernhardson: I went through that large-scale article. Right now I'm writing tools for calculating sample size in interleaved experiments and analyzing results.
Thu, Aug 3
Tue, Aug 1
Mon, Jul 31
Based on WikimediaEvents/kartographer.js#L13:
Fri, Jul 28
Thu, Jul 27
Sys.setenv(
  http_proxy = "http://webproxy.eqiad.wmnet:8080",
  https_proxy = "http://webproxy.eqiad.wmnet:8080"
)
install.packages("dplyr", repos = c(CRAN = "https://www.stats.bris.ac.uk/R/"))
Wed, Jul 26
Haven't started properly working on this but I did just upload the auto-generated report to stat1005:/srv/published-datasets/discovery/reports/
Tue, Jul 25
I am not familiar enough with the event object in the context of Maps and had to make a lot of assumptions based on the surrounding code, so I'm not 100% sure that I'm correct in trying to get the name of the layer via event.layer in https://gerrit.wikimedia.org/r/#/c/366183/2/modules/wikivoyage/WVMapLayers.js. I'm hoping @MaxSem can confirm that I did it correctly or provide some advice for how to do it properly.
CRAN submission policy recommends waiting like 6 months before submitting another version and the most recent version available on CRAN went up on 2017-06-14, so the version that we actually want will probably go up in like 5-6 months.
Mon, Jul 24
Great job with this, @chelsyx!!! This is going to be such a useful tool when it's done! (Which it almost is! :P)
Nice! Great job @chelsyx! I was so frustrated when the thing was supposed to work but didn't.
Thu, Jul 20
We're okay with that :)
Wed, Jul 19
I just realized that reworking discovery-stats properly will require R package installation stuff from https://gerrit.wikimedia.org/r/#/c/366170/
@hashar Thank you for making this ticket and emailing the R Foundation/R Development Core Team! Heh, yesterday I emailed @Ottomata & @Gehel asking if setting up our own CRAN mirror would be a reasonable thing.
Jul 19 2017
Found where layer selection events are implemented: https://github.com/wikimedia/mediawiki-extensions-Kartographer/blob/master/modules/wikivoyage/WVMapLayers.js
Jul 18 2017
Jul 14 2017
Update: French and Catalan were the only languages that use a community-developed sister search sidebar in addition to ours. I've separated those two languages out into their own category, but that wasn't it:
Jul 12 2017
@Ottomata: is it OK if we don't get around to this until after stat1005 goes live?
I need to repurpose https://github.com/wikimedia/puppet/blob/production/modules/statistics/manifests/discovery.pp to be the thing that runs https://github.com/wikimedia/wikimedia-discovery-golden instead of https://github.com/wikimedia/analytics-discovery-stats (deprecated). I'll ping you and Guillaume for CR when it's ready.
Tagging DBA here because the geo_tag table grows whenever someone adds coordinates but does not shrink when coordinates are removed on-wiki and that's something they should be aware of.
Jul 11 2017
@debt: so…are we going ahead with the idea to add another language category for languages that already have a sister project search sidebar (e.g. French)?
@MaxSem I'm going through your discovery-stats repo and currently taking a look at the geo_tag table. I'm noticing that sometimes there are geotags in the database that are no longer present on wiki.
Jul 10 2017
@JKatzWMF would @Tbayer be able to take a look at the Hive query that is generating the dataset and confirm it is correctly counting sister search-referred pageviews by platform, wiki, etc.? Just in case Chelsy or I missed some particular detail when writing/reviewing it. The query is at https://github.com/wikimedia/wikimedia-discovery-golden/blob/master/modules/metrics/search/sister_search_traffic
It's because we changed the sampling rates on April 19th, decreasing enwiki and increasing every other wiki. Since enwiki generally has high PaulScore, we effectively lowered the overall PaulScore by decreasing enwiki's contribution.
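To make the arithmetic concrete, here is a minimal sketch of how a sampling change alone can move a pooled average. The scores and session counts below are made up for illustration (they are not the real PaulScore values or sampling rates), and `overall_score` is a hypothetical helper, not part of any dashboard code:

```python
def overall_score(groups):
    """Pooled mean of per-wiki scores, weighted by sampled session count.

    groups: list of (mean_score, sampled_sessions) tuples.
    """
    total_sessions = sum(n for _, n in groups)
    return sum(score * n for score, n in groups) / total_sessions


# Hypothetical numbers: enwiki averages 0.30, all other wikis average 0.20.
before = overall_score([(0.30, 1000), (0.20, 1000)])  # equal sample sizes
after = overall_score([(0.30, 500), (0.20, 2000)])    # enwiki sampled less, others more

print(round(before, 3), round(after, 3))
```

Neither group's own score changes between the two calls. Only the weights do, which is why the pooled number drops.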
Jul 8 2017
Jul 7 2017
Links for future Mikhail:
@debt: latest version up on beta https://discovery-beta.wmflabs.org/metrics/#sister_search_traffic :)
Jul 6 2017
Jul 5 2017
SELECT
  DATE(LEFT(timestamp, 8)) AS `date`,
  COUNT(*) AS ssclicks
FROM (
  SELECT DISTINCT timestamp, event_uniqueId
  FROM TestSearchSatisfaction2_16909631
  WHERE LEFT(timestamp, 6) >= '201707'
    AND event_subTest IS NULL
    AND event_source = 'fulltext'
    AND event_action = 'ssclick'
) deduped
GROUP BY `date`
ORDER BY `date`
LIMIT 10;
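For anyone who wants to sanity-check the query's logic outside of MySQL, this is a rough Python sketch of the same filter, dedupe, and count steps. It assumes the events have already been loaded as dicts with hypothetical keys (`timestamp`, `uniqueId`, `subTest`, `source`, `action`) mirroring the table columns, and it leaves out the `LIMIT 10`:

```python
from collections import Counter


def daily_ssclicks(events):
    """Count distinct (timestamp, uniqueId) sister-search clicks per day."""
    seen = set()
    counts = Counter()
    for e in events:
        # Same filters as the WHERE clause in the query.
        if e["timestamp"][:6] < "201707":
            continue
        if e.get("subTest") is not None:
            continue
        if e["source"] != "fulltext" or e["action"] != "ssclick":
            continue
        # SELECT DISTINCT timestamp, event_uniqueId
        key = (e["timestamp"], e["uniqueId"])
        if key in seen:
            continue
        seen.add(key)
        # GROUP BY the YYYYMMDD prefix of the timestamp.
        counts[e["timestamp"][:8]] += 1
    return dict(sorted(counts.items()))


# Made-up events: the second one is an exact duplicate of the first.
sample = [
    {"timestamp": "20170705120000", "uniqueId": "a", "subTest": None, "source": "fulltext", "action": "ssclick"},
    {"timestamp": "20170705120000", "uniqueId": "a", "subTest": None, "source": "fulltext", "action": "ssclick"},
    {"timestamp": "20170705130000", "uniqueId": "b", "subTest": None, "source": "fulltext", "action": "ssclick"},
    {"timestamp": "20170706090000", "uniqueId": "c", "subTest": None, "source": "fulltext", "action": "click"},
    {"timestamp": "20170630090000", "uniqueId": "d", "subTest": None, "source": "fulltext", "action": "ssclick"},
]
result = daily_ssclicks(sample)
print(result)
```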
@Gehel said the traffic team needs to take a look at this before we can call it done.
Jun 30 2017
Jun 29 2017
Here's the distribution of tile counts per IP address per day: