Long term storage for frack prometheus data
Closed, Declined (Public)

Description

We need to come up with aggregated metrics that we can move to prod collectors for indefinite storage.

Event Timeline

Sounds awesome!

Re: indefinite storage, the global instance of Prometheus now has 1-year retention, which will likely be extended to 2 years.

Jgreen added a subtask: Unknown Object (Task). (Sep 20 2017, 2:17 PM)

We will look into aggregated stats again later, but there were spare 1TB disks on the lvs servers, so I moved the Prometheus backend there and set a 2-year retention. Our rate of collection will probably increase, but at the current rate 1TB would last about 20 years, so we should have plenty of time to figure it out.
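As a rough check on the figures above, here is a minimal sketch of the capacity arithmetic. It assumes 1TB = 1000GB, a constant ingest rate, and no storage overhead beyond the raw data. The function names are hypothetical and not part of any Prometheus tooling.

```python
def implied_ingest_gb_per_year(disk_gb, years_to_fill):
    """Ingest rate implied by 'a disk of this size lasts this many years'."""
    return disk_gb / years_to_fill


def steady_state_usage_gb(ingest_gb_per_year, retention_years):
    """Disk usage once retention starts dropping old data (constant rate assumed)."""
    return ingest_gb_per_year * retention_years


def max_growth_factor(disk_gb, ingest_gb_per_year, retention_years):
    """How many times the ingest rate can grow before the retention window fills the disk."""
    return disk_gb / steady_state_usage_gb(ingest_gb_per_year, retention_years)


DISK_GB = 1000.0       # assumption: 1TB treated as 1000GB
YEARS_TO_FILL = 20     # from the comment: 1TB lasts about 20 years at the current rate
RETENTION_YEARS = 2    # from the comment: 2-year retention

rate = implied_ingest_gb_per_year(DISK_GB, YEARS_TO_FILL)
usage = steady_state_usage_gb(rate, RETENTION_YEARS)
headroom = max_growth_factor(DISK_GB, rate, RETENTION_YEARS)

print(f"implied ingest: {rate} GB/year")
print(f"steady-state usage at {RETENTION_YEARS}y retention: {usage} GB")
print(f"collection rate can grow about {headroom}x before the disk fills")
```

Under these assumptions the 2-year window occupies about a tenth of the disk, so the collection rate could grow roughly tenfold before retention would need to be revisited.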

Re: long-term storage of data in Prometheus, I wanted to expand on it, also wrt hardware requirements in {T175364}. See https://phabricator.wikimedia.org/T180105#3759016 for a longer explanation, but the tl;dr is that the limiting factor for querying metrics in the past is loading all the datapoints for the query into memory. Since a single Prometheus instance doesn't downsample data, queries involving "many" metrics will have trouble looking back e.g. one year due to memory constraints.
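To make the memory constraint concrete, here is a back-of-the-envelope sketch of how many samples a query has to load, with and without downsampling. The 60-second scrape interval, the 16 bytes per in-memory sample (an 8-byte timestamp plus an 8-byte float), and the 1000-series query are all illustrative assumptions, not measurements from this setup. Real per-sample overhead in Prometheus may differ.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def samples_loaded(num_series, lookback_seconds, step_seconds):
    """Samples a range query must load: one per series per step in the window."""
    return num_series * (lookback_seconds // step_seconds)


def approx_bytes(num_samples, bytes_per_sample=16):
    """Rough in-memory size; 16 bytes per sample is an assumption (timestamp + float)."""
    return num_samples * bytes_per_sample


NUM_SERIES = 1000      # hypothetical "many metrics" query
SCRAPE_INTERVAL = 60   # assumption: one sample per minute
DOWNSAMPLED = 3600     # hypothetical hourly aggregate, for comparison

raw = samples_loaded(NUM_SERIES, SECONDS_PER_YEAR, SCRAPE_INTERVAL)
hourly = samples_loaded(NUM_SERIES, SECONDS_PER_YEAR, DOWNSAMPLED)

print(f"raw:    {raw:,} samples, ~{approx_bytes(raw) / 1e9:.1f} GB")
print(f"hourly: {hourly:,} samples, ~{approx_bytes(hourly) / 1e9:.2f} GB")
```

With these numbers, a one-year lookback at full resolution loads 60 times as many samples as the same query against hourly aggregates. This is why the lack of downsampling, rather than disk space, becomes the limiting factor.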

reopening for visibility re: last comment, @cwdent @Jgreen

RobH closed subtask Unknown Object (Task) as Resolved. (May 31 2018, 4:31 PM)

(Resetting assignee as @cwdent has left WMF)

Jgreen moved this task from Backlog to Done on the fundraising-tech-ops board.

Closing this as wontfix because it appears to be a larger project than we want to take on, given Prometheus's design limitations: both the downsampling issue fgiunchedi mentions above and the project's lack of interest in backward compatibility of its storage scheme.