
Managing size of page-create and revision-create tables in storage. Aggregation?
Closed, Resolved · Public · 8 Estimated Story Points

Description

It looks like both the page-create and revision-create tables are receiving quite a bit of data; for revision-create we have about 1 million records daily. We need to define an aggregation strategy for this table to keep this data long term; otherwise it will soon be too big, as this data is public by nature and does not need to be deleted.
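As a rough illustration of what an aggregation strategy could look like, the sketch below collapses individual events into one count per day and wiki, so the stored row count grows with the number of (day, wiki) pairs rather than with the roughly 1 million raw records per day. The field names `dt` and `wiki` and the function `aggregate_daily` are hypothetical, chosen for this example and not taken from the actual event schema or from any strategy decided on this task.

```python
from collections import Counter
from datetime import datetime


def aggregate_daily(events):
    """Collapse individual events into counts per (day, wiki).

    `events` is an iterable of dicts with hypothetical keys:
      "dt"   - ISO 8601 timestamp of the event
      "wiki" - name of the wiki the event came from
    """
    counts = Counter()
    for event in events:
        # Keep only the calendar day; the time of day is discarded.
        day = datetime.fromisoformat(event["dt"]).date().isoformat()
        counts[(day, event["wiki"])] += 1
    return counts


# Made-up sample events, for illustration only.
sample = [
    {"dt": "2017-07-06T16:15:00", "wiki": "enwiki"},
    {"dt": "2017-07-06T18:02:11", "wiki": "enwiki"},
    {"dt": "2017-07-06T09:30:00", "wiki": "dewiki"},
    {"dt": "2017-07-07T00:00:01", "wiki": "enwiki"},
]

daily = aggregate_daily(sample)
for (day, wiki), n in sorted(daily.items()):
    print(day, wiki, n)
```

Aggregating this way loses per-revision detail, so it would only suit consumers that need counts rather than individual records.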

Event Timeline

Restricted Application added subscribers: TerraCodes, Aklapper.
Nuria raised the priority of this task from Medium to High. · Jul 6 2017, 4:15 PM
Nuria updated the task description.
Nuria moved this task from Incoming to Operational Excellence Future on the Analytics board.

Hm, I had thought that the revision-create data would be about the same size as the EventLogging analytics Edit schema data. Is this not true? If so, then it may actually be too big for MySQL at all.

I think Kaldari only cares about page-create, so perhaps we should not insert revision-create data into MySQL at all.

Aklapper renamed this task from Managing size of page-create and revison-create tables in storage. Agreggation? to Managing size of page-create and revision-create tables in storage. Agreggation? · Jul 10 2017, 2:40 PM

@kaldari: can you confirm that you only care about page-create data?

Yep, only care about page-create data. I would be fine with not inserting revision-create data into MySQL at all.

Ok, will disable this when I also truncate the table in a bit, possibly tomorrow.

Change 364262 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Stop writing mediawiki.revision-create events to EventLogging analytics MySQL

https://gerrit.wikimedia.org/r/364262

Change 364262 merged by Ottomata:
[operations/puppet@production] Stop writing mediawiki.revision-create events to EventLogging analytics MySQL

https://gerrit.wikimedia.org/r/364262

Nemo_bis renamed this task from Managing size of page-create and revision-create tables in storage. Agreggation? to Managing size of page-create and revision-create tables in storage. Aggregation? · Jul 11 2017, 7:14 AM
nshahquinn-wmf raised the priority of this task from High to Needs Triage. · Mar 30 2018, 10:32 AM
nshahquinn-wmf moved this task from Backlog to Radar on the Contributors-Analysis board.