
Continue New AQS Loading
Closed, Resolved · Public · 21 Story Points

Description

Load data and ensure proper compaction/compression is used.

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. Jul 20 2016, 8:36 AM
JAllemandou set the point value for this task to 21.

After changing compression from lz4 to deflate and reloading a month of data (January 2016), we are down to about 120 GB per instance, which is way better than 250 GB. Proceeding with loading February 2016.
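For a rough sense of scale, the two per-instance figures quoted above can be turned into a relative saving. This is only a back-of-the-envelope sketch: it assumes both numbers are gigabytes for the same month of data, and the variable names are made up for illustration.

```python
# Per-instance sizes quoted in the comment above, assumed to be gigabytes
# for the same month of data (January 2016).
size_lz4_gb = 250
size_deflate_gb = 120

saved_gb = size_lz4_gb - size_deflate_gb
reduction = saved_gb / size_lz4_gb

print(f"saved {saved_gb} GB per instance ({reduction:.0%} smaller)")
# prints "saved 130 GB per instance (52% smaller)"
```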

Milimetric triaged this task as Normal priority. Aug 8 2016, 4:52 PM

For my fellow an-engineers to replace me while I'm on holidays: https://etherpad.wikimedia.org/p/backfilling_aqs

JAllemandou reassigned this task from JAllemandou to Nuria. Aug 29 2016, 4:07 PM
Nuria added a comment. Sep 1 2016, 6:00 PM

Tested a bit how we are doing consistency-wise, and thus far things check out. I found one issue; see the repro below.

Current API:
http://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/wikidata.org/all-access/user/Q604141/daily/20160601/20160630

In new cluster:
http://localhost:7232/analytics.wikimedia.org/v1/pageviews/per-article/wikidata.org/all-access/user/Q604141/daily/20160601/20160630

In the new cluster we are storing nulls:
{"project":"wikidata","article":"Q604141","granularity":"daily","timestamp":"2016060100","access":"all-access","agent":"user","views":null}

In the old cluster we have zeroes rather than nulls to represent lack of views:
{"project":"wikidata","article":"Q604141","granularity":"daily","timestamp":"2016060100","access":"all-access","agent":"user","views":0}

I think storage-wise the new cluster is correct, but the API should not return null; it should map null to zero.
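A minimal sketch of the null-to-zero mapping suggested here, written in Python purely for illustration. It is not the actual AQS code: the function name `map_null_views_to_zero` is hypothetical, and the record shape is copied from the example above.

```python
from typing import Any


def map_null_views_to_zero(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the items, replacing a null "views" value with 0.

    Hypothetical helper: records without a "views" key are left unchanged.
    """
    result = []
    for item in items:
        record = dict(item)  # shallow copy so the stored record is not mutated
        if record.get("views", 0) is None:
            record["views"] = 0
        result.append(record)
    return result


# Record as stored in the new cluster (JSON null becomes None in Python).
new_cluster_items = [
    {
        "project": "wikidata",
        "article": "Q604141",
        "granularity": "daily",
        "timestamp": "2016060100",
        "access": "all-access",
        "agent": "user",
        "views": None,
    }
]

normalized = map_null_views_to_zero(new_cluster_items)
print(normalized[0]["views"])  # prints 0
```

Doing the mapping at the API layer keeps nulls in storage, as proposed, while clients keep seeing the zeroes that the old cluster returned.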

Nuria added a comment. Sep 8 2016, 3:56 PM

We need to load data for all endpoints, including unique devices and top data.

Change 309602 had a related patch set uploaded (by Nuria):
Change default compression scheme

https://gerrit.wikimedia.org/r/309602

Data is loaded for all endpoints except daily-top, which is currently finishing.

Change 309602 merged by Nuria:
Update per-article compression scheme to default (LCS)

https://gerrit.wikimedia.org/r/309602

Nuria closed this task as Resolved. Sep 19 2016, 3:08 PM
Nuria moved this task from Ready to Deploy to Done on the Analytics-Kanban board.

Change 315283 had a related patch set uploaded (by Elukey):
Update per-article compression scheme to default (LCS)

https://gerrit.wikimedia.org/r/315283

Change 315283 merged by Nuria:
Update per-article compression scheme to default (LCS)

https://gerrit.wikimedia.org/r/315283