The top1000 files for the mediacounts have not been produced since January 1, 2018. See related issue T122864.
URL: https://dumps.wikimedia.org/other/mediacounts/daily/2018/
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
Chmod yearly mediacounts directory so ezachte's scripts can write top1000 files | analytics/refinery | master | +12 -0 |
Mentioned in SAL (#wikimedia-analytics) [2018-01-23T20:10:04Z] <ottomata> hdfs dfs -chmod 775 /wmf/data/archive/mediacounts/daily/2018 for T185419
Huh! Ok, so when the new 2018 directory was created by the Hadoop jobs that compute the daily mediacount files, the directory was created with a file mode that was not writeable by @ezachte's top1000 scripts.
I've chmod-ed the 2018 directory for now, but we need to make the job that creates this data chmod new directories g+w.
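For illustration, the follow-up described above (making each new yearly directory group-writable) could be sketched roughly as below. This is not the patch in Change 405938; the helper names `group_writable_cmd` and `make_group_writable` are hypothetical, the archive path is taken from the SAL entry, and it assumes the `hdfs` CLI is on the PATH and accepts the symbolic mode `g+w`.

```python
import subprocess

# Archive root as it appears in the SAL entry above.
ARCHIVE_ROOT = "/wmf/data/archive/mediacounts/daily"


def group_writable_cmd(year, root=ARCHIVE_ROOT):
    """Build the `hdfs dfs -chmod g+w` command for one yearly directory (hypothetical helper)."""
    return ["hdfs", "dfs", "-chmod", "g+w", f"{root}/{year}"]


def make_group_writable(year, dry_run=True):
    """Print the command (dry run) or execute it; raises if the command fails."""
    cmd = group_writable_cmd(year)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `make_group_writable(2019)` right after the job creates the 2019 directory would print the command by default; passing `dry_run=False` would run it.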
Change 405938 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery@master] Chmod yearly mediacounts directory so ezachte's scripts can write top1000 files
Fixed as of 2018-01-23.
@ezachte: Could you launch a backfill of 2018-01-01 to 2018-01-22?
Many thanks!
Change 405938 merged by Joal:
[analytics/refinery@master] Chmod yearly mediacounts directory so ezachte's scripts can write top1000 files