
Productionize and run 2018 job for Global Innovation Index from Hadoop Geowiki data
Closed, Duplicate | Public | 13 Estimated Story Points

Description

Since this data is now in Hadoop, we can have Oozie automatically generate what @leila has been producing manually: edit counts aggregated per country, with some countries that have a lower number of edits truncated. That is the general idea; we should add specifics in the comments below.

The code that Leila ran is referenced in T178183#4079604:

ssh leila@stat1006.eqiad.wmnet
mysql --defaults-extra-file=/etc/mysql/conf.d/research-client.cnf -hanalytics-slave.eqiad.wmnet
use staging;
select country, sum(edits) edits from erosen_geocode_country_edits where start like '2017-%-01' group by country order by edits desc;
select country, sum(edits) edits from erosen_geocode_country_edits where start like '2016-%-01' group by country order by edits desc;
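The two queries above sum monthly edit counts per country for one calendar year and sort the result. The description also mentions truncating countries with few edits, but gives no rule for it, so the following is only a minimal Python sketch under stated assumptions: the function name, the `(country, start, edits)` row shape, the `MIN_EDITS` cutoff, and the choice to drop (rather than bucket) low-edit countries are all hypothetical and not taken from the task.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical cutoff: the task does not say what the real threshold is.
MIN_EDITS = 100


def yearly_edits_by_country(rows, year, min_edits=MIN_EDITS):
    """Sum edits per country for one year, then drop low-edit countries.

    rows: iterable of (country, start, edits) tuples, where start is a
    date string such as '2017-03-01' (assumed shape, mirroring the
    columns used in the SQL queries).
    """
    totals = defaultdict(int)
    pattern = f"{year}-*-01"  # same shape as SQL: start like '2017-%-01'
    for country, start, edits in rows:
        if fnmatchcase(start, pattern):
            totals[country] += edits
    # Assumed truncation rule: keep only countries at or above the cutoff.
    kept = {c: n for c, n in totals.items() if n >= min_edits}
    # Equivalent of: order by edits desc
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


# Made-up rows, for illustration only.
sample = [
    ("US", "2017-01-01", 500),
    ("US", "2017-02-01", 300),
    ("FR", "2017-01-01", 150),
    ("XX", "2017-03-01", 5),
    ("US", "2016-01-01", 999),
]
print(yearly_edits_by_country(sample, 2017))
```

In a productionized job the same logic would more likely live in a Hive or Spark query scheduled by Oozie; the sketch only pins down the aggregate-then-truncate order of operations so the missing specifics (the cutoff, and whether low-edit countries are dropped or grouped) are easy to spot.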

Event Timeline

Milimetric triaged this task as Medium priority. Mar 23 2018, 4:18 PM
Milimetric created this task.
fdans lowered the priority of this task from Medium to Low. Mar 26 2018, 3:59 PM
fdans added a project: good first task.
fdans moved this task from Incoming to Geowiki on the Analytics board.
fdans set the point value for this task to 13. Apr 23 2018, 4:17 PM
fdans raised the priority of this task from Low to Medium. Oct 15 2018, 4:29 PM
Milimetric renamed this task from "Productionize job for Global Innovation Index from Hadoop Geowiki data" to "Productionize and run 2018 job for Global Innovation Index from Hadoop Geowiki data". Jan 22 2019, 10:30 PM
Milimetric claimed this task.
Milimetric raised the priority of this task from Medium to Needs Triage.
Milimetric triaged this task as High priority.
Milimetric moved this task from Geowiki to Incoming on the Analytics board.

Moving this to Kanban and assigning this work to @fdans

mforns lowered the priority of this task from High to Medium. Mar 25 2019, 5:19 PM
mforns raised the priority of this task from Medium to High. Mar 25 2019, 5:19 PM