
Track all Wikidata metrics currently gathered in Graphite rather than SQL and TSVs
Closed, Resolved · Public

Description

Using Graphite and statsd would be much simpler.
As far as I understand, it will work for every use case we have so far.

See T117732 regarding an analytics specific instance.
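
To make the proposal concrete, here is a rough sketch of what pushing a single metric value could look like, either straight to Graphite's plaintext port or through a statsd client. The host names, port numbers, and metric path are placeholders for illustration, not the actual production setup.

```python
# Illustrative only: hosts, ports, and metric paths are placeholders.
import socket
import time

def send_to_graphite(path, value, timestamp=None,
                     host="graphite.example.org", port=2003):
    """Send one data point using Graphite's plaintext line protocol."""
    ts = int(timestamp if timestamp is not None else time.time())
    line = "%s %s %d\n" % (path, value, ts)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(line.encode("ascii"))

send_to_graphite("wikidata.site_stats.total_items", 15000000)

# The statsd route (https://pypi.org/project/statsd/) would instead be
# something like:
#   import statsd
#   client = statsd.StatsClient("statsd.example.org", 8125, prefix="wikidata")
#   client.gauge("site_stats.total_items", 15000000)
```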

Event Timeline

Addshore claimed this task.
Addshore raised the priority of this task from to Needs Triage.
Addshore updated the task description.
Addshore subscribed.

Expanding on the use cases for a metrics storage backend seems appropriate here.

I think that Wikidata content metrics favor long-term retention (i.e. forever), because their purpose is to evaluate dynamics over both short and long intervals. Since the content is always changing, a past state cannot be recreated from live data, so the value of these historical measurement "snapshots" is quite high. These old data points are never archived away either, and they must remain retrievable without loading a dump or running some offline process.

In contrast, ops metrics are much more focused on the present and/or recent state.

Thus, two different use cases exist here. If the proposal to use Graphite can provide a long-term (non-decaying) storage method, then it should work for both. If not, then something else (like OpenTSDB/HBase) should be implemented.

> If the proposal to use Graphite can provide a long-term (non-decaying) storage method, then it should work for both.

Retention and resolution changes/decay are both configurable.

Simply setting the retention to 1d:100y would/should keep daily metrics for a period of 100 years; a sketch of such a rule follows.
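
For illustration, such a rule could be expressed in Graphite's storage-schemas.conf roughly like this (the section name and metric pattern below are made up, not the real configuration):

```
# Hypothetical storage-schemas.conf entry: one point per day, kept for
# 100 years, with no downsampling to a coarser archive.
[wikidata_daily]
pattern = ^wikidata\.
retentions = 1d:100y
```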

Thanks for expanding on that. Here's my opinion, as the person who's been looking after our graphite stack:

  • Graphite isn't really a data warehouse, so I wouldn't recommend it as the primary storage for the verbatim/authoritative data.
    • Though saving data in Graphite for graphing etc. while also archiving it elsewhere would, I think, cater for this case.
  • It is possible, as @Addshore suggests, to not downsample daily data for a really long time; e.g. keeping a daily metric for 100y takes 438028 bytes on disk per metric (see the sketch after this comment).
  • An analytics Graphite instance could help, though of course it means maintaining that instance too.
  • If the volume of metrics isn't very high (I have no idea of the order of magnitude, though), then using the main Graphite instance is certainly less overhead. To give an example, 10k distinct metrics would be no problem; 100k would be a problem at the moment.

hope that helps!
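
As a sanity check on that 438028-byte figure, here is a small sketch (an addition for illustration) that reproduces it from the standard whisper on-disk layout: a 16-byte metadata header, 12 bytes of archive info per archive, and 12 bytes per stored point, with 100 years of daily points counted as 100 × 365 = 36,500.

```python
# Reproduce the quoted whisper file size for a single 1d:100y metric.
# Assumes the standard whisper layout; leap days are ignored, which is
# what makes the result land exactly on 438028.
METADATA_SIZE = 16      # aggregation method, max retention, xFilesFactor, archive count
ARCHIVE_INFO_SIZE = 12  # offset, seconds per point, point count
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value

def whisper_file_size(points_per_archive):
    """Bytes on disk for a whisper file with the given archive sizes."""
    return (METADATA_SIZE
            + ARCHIVE_INFO_SIZE * len(points_per_archive)
            + POINT_SIZE * sum(points_per_archive))

print(whisper_file_size([100 * 365]))  # -> 438028
```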

For reference, and regarding our worries about Graphite losing data / data being removed, please see this crude script:

https://github.com/addshore/graphite-backup
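
That repository aside, the general idea of snapshotting a metric's history out of Graphite can be sketched against the render API. This is not the linked script, and the host and metric name are placeholders:

```python
# Sketch: export one metric's full history via the Graphite render API
# as a JSON backup. Host and metric name are placeholders.
import json
import urllib.request

GRAPHITE_URL = "https://graphite.example.org"   # placeholder
METRIC = "wikidata.site_stats.total_items"      # placeholder

url = "%s/render?target=%s&format=json&from=-100y" % (GRAPHITE_URL, METRIC)
with urllib.request.urlopen(url) as resp:
    series = json.loads(resp.read())

with open("backup.json", "w") as out:
    json.dump(series, out)
```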

Change 253571 had a related patch set uploaded (by Addshore):
Social metrics to graphite

https://gerrit.wikimedia.org/r/253571

Change 253572 had a related patch set uploaded (by Addshore):
Convert site_stats to graphite

https://gerrit.wikimedia.org/r/253572

Change 253573 had a related patch set uploaded (by Addshore):
Convert getclaims stats to graphite

https://gerrit.wikimedia.org/r/253573

Change 253571 merged by jenkins-bot:
Social metrics to graphite

https://gerrit.wikimedia.org/r/253571

Change 253572 merged by Addshore:
Convert site_stats to graphite

https://gerrit.wikimedia.org/r/253572

Change 253573 merged by Addshore:
Convert getclaims stats to graphite

https://gerrit.wikimedia.org/r/253573

Resolved using the patch sets linked above.

Now to import the old data into graphite.
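
Backfilling old data is possible because Graphite's plaintext protocol accepts an explicit timestamp per point (unlike statsd, which timestamps on arrival). A rough sketch of what replaying old TSV rows could look like; the file layout, metric path, and host are assumptions for illustration only:

```python
# Hypothetical backfill sketch: replay old (date, value) TSV rows into
# Graphite with their historical timestamps. File format, metric path,
# and host are illustrative assumptions.
import csv
import socket
import time

GRAPHITE_HOST = "graphite.example.org"            # placeholder
GRAPHITE_PORT = 2003
METRIC_PATH = "wikidata.site_stats.total_items"   # placeholder

def backfill(tsv_path):
    lines = []
    with open(tsv_path, newline="") as f:
        for date_str, value in csv.reader(f, delimiter="\t"):
            ts = int(time.mktime(time.strptime(date_str, "%Y-%m-%d")))
            lines.append("%s %s %d\n" % (METRIC_PATH, value, ts))
    with socket.create_connection((GRAPHITE_HOST, GRAPHITE_PORT)) as sock:
        sock.sendall("".join(lines).encode("ascii"))

backfill("old_metrics.tsv")
```

Note that the historical points are only stored if the metric's whisper retention window actually covers those dates, so the storage schema needs to be in place before importing.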