Problem: nobody knows what aggregate-datasets, limn-public-data, etc. are for.
Requirements
* Dashiki needs a structured folder for structured metrics: *something*/<<metric-name>>/<<submetric-name>>/<<wiki>>.tsv
* Dashiki needs unstructured folders to graph random files (hopefully this doesn't get too crazy; maybe they should all go in a base directory that's specifically for unstructured metrics)
* Researchers on stat1003 output public datasets
* Researchers on stat1002 output public datasets
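
A minimal sketch of the structured layout described above, to make the path convention concrete. The function name `metric_path` and its validation rule are illustrative assumptions, not part of Dashiki; `something` stands in for whatever base directory is chosen.

```python
from pathlib import PurePosixPath


def metric_path(root, metric, submetric, wiki):
    """Build <root>/<metric-name>/<submetric-name>/<wiki>.tsv.

    Hypothetical helper: rejects empty components and components that
    contain a slash, so every file lands exactly three levels below root.
    """
    for part in (metric, submetric, wiki):
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(root) / metric / submetric / f"{wiki}.tsv"


# Example using metric names that appear later in these notes.
print(metric_path("something", "sessions", "visualeditor", "enwiki"))
```

Keeping the convention in one helper means a consumer like Dashiki and the jobs that write the files cannot drift apart on the folder depth.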
Current State
* stat1003 rsyncs to limn-public-data
* stat1002 rsyncs to aggregate-datasets
* stat1002 *now* rsyncs to limn-public-data
* ?? public-datasets (looks like ad-hoc work)
Ideal Solution
stat1001: https://datasets.wikimedia.org
    README.md
    /common
        README.md: this is rsynced from stat1002 and 1003 and wherever with no --delete
    /reports
        README.md
        /per-wiki
            /sessions
                /visualeditor
                    /enwiki.tsv
                    /all.tsv
                /wikitext
                    /enwiki.tsv
                    /all.tsv
        /cross-wiki
            /request-breakdowns (now browser, we should rename)
                /by-os-or-browser.tsv
                /by-os.tsv
stat1003:/srv/reportupdater/output/... -> stat1001:.../reports/
stat1002:/a/reportupdater/output/... -> stat1001:.../reports/
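
A rough sketch of how the sync commands for the mappings above could be assembled, with --delete off by default so files from one source host are never removed by a sync from another. The function name `rsync_command` and the host and path strings in the example are placeholders, not the real locations (which are elided in these notes); only the `rsync`, `-a` and `--delete` names are real.

```python
import shlex


def rsync_command(source, destination, delete=False):
    """Return the argument list for an rsync from source to destination.

    Hypothetical helper. --delete is only added when explicitly asked
    for, because several hosts sync into the same destination tree.
    """
    args = ["rsync", "-a"]
    if delete:
        args.append("--delete")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    args += [source.rstrip("/") + "/", destination.rstrip("/") + "/"]
    return args


# Placeholder hosts and paths, for illustration only.
cmd = rsync_command("sourcehost:/path/to/output", "targethost:/path/to/reports")
print(shlex.join(cmd))
```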
Steps
1. move unstructured stuff from limn-public-data/* to common/legacy/limn-public-data/*
2. symlink limn-public-data to common/legacy/limn-public-data
3. move structured stuff from limn-public-data to reports
4. announce the plan to do the same thing for aggregate-datasets and public-datasets
5. in the distant future delete the symlinks
6. make sure the intentions for each directory are documented in its README
7. send an email to the list
8. wikitech documentation?
9. update dashiki config & code for datasets api root (remove /metrics)
10. update the output paths of reportupdater jobs
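
Steps 1 and 2 could be sketched roughly as follows. This is a simplification under stated assumptions: it moves the whole directory into common/legacy and leaves a symlink behind, so it presumes the structured files have already been moved to reports (step 3) or will be moved out of the legacy copy afterwards. The function name `move_to_legacy` is hypothetical, and the demo runs in a temporary directory rather than on the real dataset root.

```python
import shutil
import tempfile
from pathlib import Path


def move_to_legacy(root, name):
    """Move root/name to root/common/legacy/name and symlink the old path.

    Hypothetical helper. The symlink is relative so the whole tree can
    be relocated without breaking it.
    """
    root = Path(root)
    old = root / name
    legacy = root / "common" / "legacy" / name
    legacy.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(old), str(legacy))
    old.symlink_to(Path("common") / "legacy" / name, target_is_directory=True)
    return legacy


# Demo in a throwaway directory with a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "limn-public-data"
    src.mkdir()
    (src / "random.tsv").write_text("a\tb\n")
    move_to_legacy(tmp, "limn-public-data")
    # Old URLs keep working through the symlink until it is deleted (step 5).
    print((Path(tmp) / "limn-public-data").is_symlink())
    print((Path(tmp) / "limn-public-data" / "random.tsv").exists())
```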