Another issue, independent of proper randomization, is that for most use cases the data produced by the system cannot be used for statistical testing. Let me give an example:

Jul 29 2016, 8:38 PM · Analytics-Radar, Traffic, SRE

• ellery added a comment to T135762: A/B Testing solid framework.

@Nuria
I'm confused by your statement "a bucket will have control and treatment for 1 experiment". I thought that a bucket represents a group of users who get assigned to either the treatment or the control.

Jul 29 2016, 8:16 PM · Analytics-Radar, Traffic, SRE

• ellery added a comment to T135762: A/B Testing solid framework.

@Nuria, @BBlack
To clarify: in the example I gave above, the experiments were run in sequence, not concurrently.

Jul 29 2016, 8:15 PM · Analytics-Radar, Traffic, SRE

Jul 28 2016

• ellery added a comment to T141019: Decide what license to use.

@Slaporte Do you have any recommendations for the NOTICE file that Apache 2 suggests including in addition to the LICENSE?

Jul 28 2016, 6:40 PM · WMF-Legal, Recommendation-API

Jul 26 2016

• ellery added a comment to T135762: A/B Testing solid framework.

@BBlack, @Nuria
In order to run a randomized controlled experiment, you need to ensure that users are randomly assigned to treatment conditions at the start of every experiment and that they remain in their treatment group for the entire duration of the experiment.
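
One common way to meet both requirements is to derive the group from a hash of the experiment and user identifiers. The following is a minimal sketch, assuming string identifiers; the function name `assign_group` and its parameters are hypothetical and not part of any existing framework discussed in this task. The assignment is pseudo-random rather than truly random.

```python
import hashlib


def assign_group(user_id: str, experiment_id: str,
                 groups=("control", "treatment")) -> str:
    """Hypothetical helper: map a user to one group for one experiment."""
    # Salting the hash with the experiment id reshuffles users
    # independently at the start of every experiment.
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # The same inputs always produce the same digest, so a user stays
    # in the same group for the entire duration of the experiment.
    bucket = int.from_bytes(digest[:8], "big") % len(groups)
    return groups[bucket]
```

Because the group is recomputed from the identifiers, no per-user assignment needs to be stored; running a new experiment only requires a new `experiment_id`.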

Jul 26 2016, 9:47 PM · Analytics-Radar, Traffic, SRE

Jul 25 2016

• ellery added a comment to T141022: Give Gerrit a try and make a decision about whether to switch or not.

I'm fine with using Gerrit. However, Ori should probably complete the request for a new Gerrit repo, since doing so requires choosing a code review model and a location inside MediaWiki, and I'm not sure what is best.

Jul 25 2016, 6:08 PM · Recommendation-API

• ellery added a comment to T141019: Decide what license to use.

@ori Who should I add as the copyright holder? Is it me or WMF?

Jul 25 2016, 5:44 PM · WMF-Legal, Recommendation-API

Jul 22 2016

• ellery reopened T139863: Request creation of readmore labs project, a subtask of T76375: [DO NOT USE] New Labs project requests (tracking) [superseded by #cloud-vps-project-requests], as Open.

Jul 22 2016, 5:25 PM · User-bd808, Tracking-Neverending, Cloud-Services

• ellery reopened T139863: Request creation of readmore labs project as "Open".

Jul 22 2016, 5:25 PM · Cloud-Services

• ellery reopened T139864: Request creation of detox labs project as "Open".

Jul 22 2016, 5:25 PM · Cloud-Services

• ellery reopened T139864: Request creation of detox labs project, a subtask of T76375: [DO NOT USE] New Labs project requests (tracking) [superseded by #cloud-vps-project-requests], as Open.

Jul 22 2016, 5:25 PM · User-bd808, Tracking-Neverending, Cloud-Services

Jul 18 2016

• ellery added a comment to T139700: Research showcase July 2016.

@DarTar Nithum wrote an abstract and it is up on https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#Upcoming_showcases

Jul 18 2016, 9:34 PM · Research-Archive, Discussion-modeling (of Toxicity), Research-outreach

• ellery closed T130932: Use Wikidata Item Embeddings for Recommendation as Invalid.

Jul 18 2016, 5:05 PM · GapFinder

Jul 14 2016

• ellery added a comment to T139700: Research showcase July 2016.

Nithum and I are planning on staying below 40 minutes, so that should work out.

Jul 14 2016, 7:02 PM · Research-Archive, Discussion-modeling (of Toxicity), Research-outreach

Jul 11 2016

• ellery added a comment to T140041: analyze user_talk editing behavior of users blocked for harassment / personal attacks.

See: https://github.com/ewulczyn/wiki-detox/blob/master/src/analysis/Prevalence%20and%20Efficacy%20of%20Moderation.ipynb

Jul 11 2016, 11:52 PM · Discussion-modeling (of Toxicity)

• ellery created T140041: analyze user_talk editing behavior of users blocked for harassment / personal attacks.

Jul 11 2016, 11:52 PM · Discussion-modeling (of Toxicity)

• ellery updated the task description for T133078: Put Talk Diff Datasets on Public Mount.

Jul 11 2016, 11:49 PM · Discussion-modeling (of Toxicity)

• ellery added a comment to T133078: Put Talk Diff Datasets on Public Mount.

The raw diffs are up. See:
https://datasets.wikimedia.org/public-datasets/enwiki/article_talk_diffs_tsv
https://datasets.wikimedia.org/public-datasets/enwiki/user_talk_diffs_tsv/

Jul 11 2016, 11:47 PM · Discussion-modeling (of Toxicity)

• ellery added a comment to T139785: Log whether a CX translation was initiated via GapFinder.

@santhosh Great! Thank you.

Jul 11 2016, 4:48 PM · GapFinder

Jul 9 2016

• ellery closed T139863: Request creation of readmore labs project as Invalid.

Jul 9 2016, 11:30 PM · Cloud-Services

• ellery closed T139863: Request creation of readmore labs project, a subtask of T76375: [DO NOT USE] New Labs project requests (tracking) [superseded by #cloud-vps-project-requests], as Invalid.

Jul 9 2016, 11:30 PM · User-bd808, Tracking-Neverending, Cloud-Services

• ellery closed T139864: Request creation of detox labs project as Invalid.

Jul 9 2016, 11:29 PM · Cloud-Services

• ellery closed T139864: Request creation of detox labs project, a subtask of T76375: [DO NOT USE] New Labs project requests (tracking) [superseded by #cloud-vps-project-requests], as Invalid.

Jul 9 2016, 11:29 PM · User-bd808, Tracking-Neverending, Cloud-Services

• ellery created T139864: Request creation of detox labs project.

Jul 9 2016, 10:50 PM · Cloud-Services

• ellery renamed T139863: Request creation of readmore labs project from Request creation of <Replace Me> labs project to Request creation of readmore labs project.

Jul 9 2016, 10:49 PM · Cloud-Services

• ellery created T139863: Request creation of readmore labs project.

Jul 9 2016, 10:49 PM · Cloud-Services

Jul 8 2016

• ellery moved T139253: Clean up repository from Backlog to Done on the GapFinder board.

Jul 8 2016, 10:15 PM · GapFinder

• ellery added a comment to T139704: Discussion modeling - release notebooks; write up and present results.

Jul 8 2016, 10:15 PM · Epic, Research-and-Data-2016-17-Q1, Discussion-modeling (of Toxicity), Research-Freezer

• ellery added a comment to T139703: Design and evaluate ''attack'' and ''aggressiveness'' models on article talk comments.

In the last quarter, we focused on building machine learning models to detect personal attacks on user talk pages. Now we will extend that work to the article talk namespace.

Jul 8 2016, 10:11 PM · Epic, Research-and-Data-2016-17-Q1, Discussion-modeling (of Toxicity), Research-Freezer

• ellery created T139790: Log the title the user wants to give the new article in the target language in "Create from scratch".

Jul 8 2016, 8:03 PM · GapFinder

• ellery updated subscribers of T139785: Log whether a CX translation was initiated via GapFinder.

Jul 8 2016, 7:56 PM · GapFinder

• ellery updated subscribers of T139785: Log whether a CX translation was initiated via GapFinder.

Hey @santhosh, I just wanted to double-check with you that this is not already happening. If it is not, what process do you suggest for adding this logging?

Jul 8 2016, 7:55 PM · GapFinder

• ellery created T139785: Log whether a CX translation was initiated via GapFinder.

Jul 8 2016, 7:04 PM · GapFinder

• ellery added a comment to T135392: access request for users at fundraising analytics consultant CPS Data Consulting.

Katie is the owner of the private fr GitHub repo.

Jul 8 2016, 4:08 PM · fundraising-tech-ops, Fundraising-Backlog

• ellery updated subscribers of T139710: Add option to cite a revision when blocking a user for personal attacks.

Hey @kaldari, do you know who I should talk to in order to suggest changes to the admin blocking interface?

Jul 8 2016, 1:22 AM · Discussion-modeling (of Toxicity)

• ellery created T139710: Add option to cite a revision when blocking a user for personal attacks.

Jul 8 2016, 1:21 AM · Discussion-modeling (of Toxicity)

• ellery created T139709: Add option to cite a revision in NPA templates.

Jul 8 2016, 1:18 AM · Discussion-modeling (of Toxicity)

Jul 7 2016

• ellery added a comment to T139253: Clean up repository.

For now I am just removing unused files and refactoring the API. I agree that it would make sense to separate the two though.

Jul 7 2016, 10:16 PM · GapFinder

Jul 6 2016

• ellery added a comment to T139007: [Discuss] Detox integration with ORES.

Our current thinking is to make Detox into its own service that exposes a scoring API. This way ORES can just submit revision ids or diffs to the API and get back scores instead of running the models itself. How does that sound?

Jul 6 2016, 5:25 PM · Research, Machine-Learning-Team (Active Tasks), Discussion-modeling (of Toxicity)

• ellery added a project to T139007: [Discuss] Detox integration with ORES: Discussion-modeling (of Toxicity).

Jul 6 2016, 4:32 PM · Research, Machine-Learning-Team (Active Tasks), Discussion-modeling (of Toxicity)

• ellery updated subscribers of T139007: [Discuss] Detox integration with ORES.

Jul 6 2016, 4:32 PM · Research, Machine-Learning-Team (Active Tasks), Discussion-modeling (of Toxicity)