Logging: Introduce ability to sample certain kinds of log messages to reduce overwhelm
Closed, Resolved (Public)

Description

Certain kinds of warnings, such as "dsr/inconsistent", can be so copious that they become too noisy to act on. Many of these warnings are harmless, but we should probably continue to emit them so that we can still pick out the signal from the noise.

One way to do this is to introduce sampling. There are a few ways to do so:

  • Sampling at the log site by adding a specified suffix, e.g. "warning/foobar/sample/3" would sample 3% of the warnings emitted at that site. This may be more granular control than we need.
  • Config-based. We could introduce a configuration like logger.sample["warning/foobar"] = 3; logger.sample["error/xyz"] = 50, which would then apply that sampling rate to all messages of that type at all log sites.
  • Page-based? This is really coarse, and would enable full logging of certain kinds of messages for each page that gets picked.
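
For illustration, the log-site suffix option could be parsed roughly as follows. This is only a sketch: parseSampleSuffix is a hypothetical helper, not an existing Parsoid function, and it assumes the suffix always has the form "/sample/<percent>".

```javascript
// Hypothetical helper (not part of Parsoid): split a log type such as
// "warning/foobar/sample/3" into its base type and a sampling percentage.
function parseSampleSuffix(logType) {
  const match = /^(.*)\/sample\/(\d+(?:\.\d+)?)$/.exec(logType);
  if (!match) {
    // No sampling suffix: treat the message as always emitted.
    return { type: logType, percent: 100 };
  }
  // Clamp to 100 so that a typo like "/sample/300" still means "always".
  return { type: match[1], percent: Math.min(100, Number(match[2])) };
}
```

A log type without the suffix falls through unchanged with a rate of 100%, so existing log sites would not need to be touched.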

I suspect config-based sampling is probably what we want, but I'm including the other options here for the record.
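
A minimal sketch of what config-based sampling might look like, assuming the config maps an exact log type to a percentage as in the example above. LogSampler and shouldEmit are hypothetical names, not Parsoid's actual logger API; the random source is injectable only so the behavior can be tested deterministically.

```javascript
// Hypothetical sampler (illustrative only, not Parsoid's logger API).
// sampleConfig maps a log type to the percentage of messages to keep,
// e.g. { "warning/foobar": 3, "error/xyz": 50 }.
class LogSampler {
  constructor(sampleConfig, random = Math.random) {
    // A Map avoids accidental hits on Object.prototype keys.
    this.rates = new Map(Object.entries(sampleConfig));
    this.random = random;
  }

  // Returns true if a message of this type should be emitted.
  shouldEmit(logType) {
    if (!this.rates.has(logType)) {
      // Assumption: types without a configured rate are never sampled away.
      return true;
    }
    // random() is in [0, 1), so this keeps roughly `percent` out of 100.
    return this.random() * 100 < this.rates.get(logType);
  }
}

// Usage: the check sits in front of the real log call at emit time.
const sampler = new LogSampler({ "warning/foobar": 3, "error/xyz": 50 });
const emitted = sampler.shouldEmit("warning/foobar"); // true about 3% of the time
```

Because the rate is looked up by type rather than at the log site, changing it only requires a config change, which is the main appeal of this option.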

https://gerrit.wikimedia.org/r/#/c/246106/ should probably block on this task since I suspect we'll get overwhelmed otherwise.

Event Timeline

ssastry raised the priority of this task from to Medium.
ssastry updated the task description.
ssastry added a project: Parsoid.
ssastry added subscribers: ssastry, Arlolra, cscott, tstarling.

Change 246416 had a related patch set uploaded (by Arlolra):
T115464: Sample logs

https://gerrit.wikimedia.org/r/246416

Arlolra removed a project: Patch-For-Review.
Arlolra set Security to None.