LGTM. I would leave out the mention of signals used to infer importance in the target. Right now we just use pageviews in the source.
Aug 27 2016
Aug 26 2016
Aug 23 2016
Aug 17 2016
Jul 29 2016
Another issue, independent of proper randomization, is that for most use cases the data produced by the system cannot be used for statistical testing. Let me give an example:
@Nuria
I'm confused by your statement "a bucket will have control and treatment for 1 experiment". I thought that a bucket represents a group of users who get assigned to either the treatment or the control.
Jul 28 2016
@Slaporte Do you have any recommendations for the NOTICE file that Apache 2 suggests including in addition to the LICENSE?
Jul 26 2016
Jul 25 2016
I'm fine with using Gerrit. However, Ori should probably complete the request for a new Gerrit repo, since it requires choosing a code review model and a location inside of MediaWiki, and I'm not sure what is best.
@ori Who should I add as the copyright holder? Is it me or WMF?
Jul 22 2016
Jul 18 2016
@DarTar Nithum wrote an abstract and it is up on https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#Upcoming_showcases
Jul 14 2016
Nithum and I are planning on staying below 40 minutes, so that should work out.
Jul 11 2016
@santhosh Great! Thank you.
Jul 9 2016
Jul 8 2016
In the last quarter, we focused on building machine learning models to detect personal attacks on user talk pages. Now we will extend that work to the article talk namespace.
Hey @santhosh, I just wanted to double-check with you that this is not already happening. If it isn't, what process do you suggest for adding this logging?
Katie is the owner of the private fr GitHub repo.
Hey @kaldari, do you know who I should talk to in order to suggest changes to the admin blocking interface?
Jul 7 2016
For now I am just removing unused files and refactoring the API. I agree that it would make sense to separate the two though.
Jul 6 2016
Our current thinking is to make Detox its own service that exposes a scoring API. This way ORES can just submit revision IDs or diffs to the API and get back scores instead of running the models itself. How does that sound?
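To make the proposal concrete, here is a minimal sketch of what such a scoring endpoint could look like. This is an illustration only, not the actual Detox or ORES API: `handle_score_request`, `placeholder_model`, `fetch_diff`, and the JSON field names `rev_ids`, `diffs`, and `scores` are all hypothetical, and the keyword-based scorer stands in for the real trained model.

```python
import json

# Hypothetical stand-in for a trained personal-attack model.
def placeholder_model(text):
    """Return a fake attack score in [0, 1] from a tiny keyword list."""
    flagged = {"idiot", "stupid"}
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.strip(".,!?") in flagged)
    return min(1.0, hits / len(words) * 5)

def handle_score_request(body, model=placeholder_model, fetch_diff=None):
    """Score one JSON request and return a JSON response.

    The request may carry raw diffs ({"diffs": {key: text}}), revision ids
    ({"rev_ids": [...]}), or both. Revision ids are resolved to diff text
    through the caller-supplied fetch_diff callable (an assumption here).
    """
    request = json.loads(body)
    diffs = dict(request.get("diffs", {}))
    for rev_id in request.get("rev_ids", []):
        if fetch_diff is None:
            raise ValueError("rev_ids given but no fetch_diff configured")
        diffs[str(rev_id)] = fetch_diff(rev_id)
    scores = {key: model(text) for key, text in diffs.items()}
    return json.dumps({"scores": scores})

# Example: a client such as ORES would send only the payload and read scores back.
if __name__ == "__main__":
    payload = json.dumps({"diffs": {"a": "thanks for the edit"}})
    print(handle_score_request(payload))
```

The point of the design is that the caller only sees a request/response contract, so the models can be retrained or replaced behind the API without changes on the ORES side.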
Jul 5 2016
Jul 3 2016
Jun 8 2016
Jun 7 2016
Jun 3 2016
There are very few RPA redactions (~400). There are 36k instances of the npa template.
Pinged James about how to access the data.
Jun 2 2016
@Pcoombe Yes, you will need access to Hadoop.
Jun 1 2016
@DarTar He has approval and should sign it soon.