Page MenuHomePhabricator

Request ID for debug log
Closed, ResolvedPublic

Description

For a high-traffic site, MediaWiki does not provide much help for finding the right records in the debug logs. It would be nice if there would be a request ID which would be recorded and the same in all log lines recorded during that request. That ID would then be included in the HTTP response, making the identification of the right log events trivial. (Ideally, the ID would even be passed along in sub-requests that the main process makes, such as jobs or internal and external API request; see discussion in T113817.)

Due to the obvious privacy concerns the recording/handling of such an ID this should be opt-in and only used for debugging, e.g. activated by the presence of the X-Wikimedia-Debug header.

Event Timeline

Tgr raised the priority of this task from to Needs Triage.
Tgr updated the task description. (Show Details)
Tgr added a project: MediaWiki-Debug-Logger.
Tgr subscribed.

@csteipp was pretty concerned about tying operational logs to webrequest logs in the discussion at T113817#1711231, but in my reading he was ok with a per-request identification token that can be used to thread together operational logs (e.g. MediaWiki's wfDebugLog events and their equivalents). I would like his input however before we actually implement a solution for this task.

When the Monolog logging stack is used, adding an internal value for correlating log events is as easy as adding a Monolog\Processor\UidProcessor to the Monolog configuration and then exposing the value it generates in the logger output. We actually already use this in the WMF production Monolog config, but the value that is generated is only visible in Logstash. Adding %extra.uid% to the log message format we use for udp2log messages would expose it on fluorine as well. Getting this value added to a response header that is sent back to the client is a bit trickier but probably doable.

A more complete solution to debug event correlation would involve something like the [[https://github.com/search?q=%40wikimedia+X-Request-Id&type=Code&utf8=%E2%9C%93|X-Request-Id correlation that RESTBase and other services implement]] being added to MediaWiki:

  1. Add a new $wgRequestId global variable (@Legoktm will hate me for that but maybe he has a better idea) that is initialized very early in the MediaWiki request cycle.
  2. Initialize $wgRequestId to either match an inbound X-Request-Id header (set by RESTBase or some other upstream tier) or a new UUID value.
  3. Add the $wgRequestId data to the logging context automatically using a Monolog processor (with support added to LegacyLogger as well).
  4. Add an X-Request-Id: $wgRequestId header to requests that MediaWiki makes to downstream services.
  5. (optionally) Add an X-Request-Id: $wgRequestId header to responses sent to clients. This would probably be feature flagged with a new DefaultSettings config variable and then WMF and other sites could decide how to set the flag in local configuration.

Add an X-Request-Id: $wgRequestId header to requests that MediaWiki makes to downstream services.

That should IMO be spun into its own task: make MediaWiki and services be able to propagate a configurable set of headers.

Another instance of that is X-Wikimedia-Debug: your code might make HTTP calls to other wikis (e.g. global user pages), to the thumbnail scalers, it might schedule jobs etc. You don't want that process tree to leak into another deployment branch.

Krinkle closed this task as Resolved.EditedSep 14 2018, 6:03 PM
Krinkle claimed this task.
Krinkle subscribed.

The reqId and unique_id property are both present on most MediaWiki channels now, and have been for a year or so (possibly longer?).

This also applies to the debug channel, per https://wikitech.wikimedia.org/wiki/X-Wikimedia-Debug#Debug_logging.

Closing as such. Please re-open if I misunderstood.