Our stack comprises many components/servers that interact with each other in order to fulfil clients' requests (or to prepare data to be served to clients at a later point). Handling any given request may spawn further requests to other components, whose responses are assembled before the result is returned to the client. This creates the need for a kind of distributed stack trace that allows us to pinpoint problematic links in the request chain.
Some degree of request identification already exists in our infrastructure, but only at the sub-system level:
- MediaWiki's WebRequest relies on [the UNIQUE_ID env variable](https://github.com/wikimedia/mediawiki/blob/f6d582a91ee990de3ba04dad67eba055040a0e3f/includes/WebRequest.php#L266-L282) provided by [Apache's mod_unique_id](http://httpd.apache.org/docs/current/mod/mod_unique_id.html)
- RESTBase and the services behind it use and propagate the [X-Request-Id header](https://github.com/wikimedia/hyperswitch/blob/aa78fb649213bb0a4445135338341b71505cbc7d/lib/server.js#L260)
- EventBus relies on the [same x-request-id header](https://github.com/wikimedia/mediawiki-extensions-EventBus/blob/c36754afe5f04e72b799077c965180b96e187747/includes/EventBus.php#L393-L403) when creating events for both asynchronous updates and JobQueue messages
- Thumbor uses a [custom Thumbor-Request-Id header](https://phabricator.wikimedia.org/diffusion/THMBREXT/browse/master/wikimedia_thumbor/logging/__init__.py;f4bfba091899c22ebdad950abded7c869749d1d1$26)
There are probably more such examples.
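To illustrate the fragmentation, here is a minimal sketch of what a consumer would have to do today to find a request identifier regardless of which sub-system produced it. The header and variable names come from the list above; the function name `find_request_id` and the lookup order are assumptions made for illustration only and do not correspond to any existing code.

```python
from typing import Mapping, Optional

# Header names taken from the list above; the lookup order is an assumption.
KNOWN_ID_HEADERS = ("x-request-id", "thumbor-request-id")


def find_request_id(headers: Mapping[str, str],
                    environ: Mapping[str, str]) -> Optional[str]:
    """Return the first request identifier found under any known convention.

    Hypothetical helper: it only demonstrates that three different
    lookups are currently needed to answer one question.
    """
    # HTTP header names are case-insensitive, so normalise before comparing.
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in KNOWN_ID_HEADERS:
        if lowered.get(name):
            return lowered[name]
    # mod_unique_id exposes its identifier as the UNIQUE_ID environment variable.
    return environ.get("UNIQUE_ID") or None
```

Even with such a helper, the identifiers it returns are not comparable across sub-systems, which is the gap this proposal addresses.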
To be able to trace all requests triggered by an initial (external) request, every system in our infrastructure should identify requests in the same way, use this identifier for logging, and propagate it to the other links in the request chain.
Use a UUID (v1 or v4) as the value of an x-request-id header/entity. The Varnish front-end (soon to be replaced by ATS) is the main point of entry for external requests. It can therefore generate the request IDs and attach them to requests in the form of the x-request-id header, which can then be used and propagated by all entities behind it. Furthermore, entities responding to requests must log the received or generated request ID.
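The reuse-or-generate, log, and propagate steps described above could look roughly like the following sketch. It assumes UUID v4 and a plain dictionary of headers; the names `ensure_request_id`, `outgoing_headers`, and `handle` are hypothetical and not part of any existing service.

```python
import logging
import uuid
from typing import Dict, Mapping, Optional

REQUEST_ID_HEADER = "x-request-id"

logger = logging.getLogger("request-id-sketch")


def ensure_request_id(headers: Mapping[str, str]) -> str:
    """Reuse an incoming request ID, or generate one if none was supplied."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == REQUEST_ID_HEADER and value:
            return value
    # Assumption: UUID v4; the proposal allows v1 as well.
    return str(uuid.uuid4())


def outgoing_headers(request_id: str,
                     extra: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build headers for a sub-request that carry the same request ID."""
    # Drop any differently-cased copy of the header before setting ours.
    headers = {
        name: value
        for name, value in (extra or {}).items()
        if name.lower() != REQUEST_ID_HEADER
    }
    headers[REQUEST_ID_HEADER] = request_id
    return headers


def handle(headers: Mapping[str, str]) -> Dict[str, str]:
    """Log the request ID and return the headers to use for sub-requests."""
    request_id = ensure_request_id(headers)
    # Attach the ID to the log record so it can be indexed by log tooling.
    logger.info("handling request", extra={"request_id": request_id})
    return outgoing_headers(request_id)
```

In this sketch only the entry point would normally hit the generation branch; every component behind it receives the header and passes it on unchanged.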
- T89562: RESTBase should set Request-ID and perhaps X-Forwarded-For headers for external requests
- T97226: Include the request ID in API request logs
- T97207: Forward X-Request-ID header in outgoing requests
- T117021: Request ID for debug log
- T113817: Add request_id to webrequest logs as well as other event records ingested into Hadoop
- T200594: Add client identifier to requests sent from Kartotherian to WDQS
- T193050: Include request id (if present) in a comment in DB queries
- T147101: Uniform performance insight for different services (tracking)