Produce monolog messages through kafka+avro

Description

This allows a logging channel to be configured to write
directly to Kafka. Logs can be serialized either to JSON
blobs or to the more compact Apache Avro format.

The Kafka handler for Monolog needs a list of one or more
Kafka servers to query cluster metadata from. It should be
able to use any Monolog formatter, although some, such as
JsonFormatter, require you to disable formatBatch, because
the Kafka protocol prefers each record to be encoded
independently. This requires the nmred/kafka-php library,
version >= 1.3.0.
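
To illustrate why batch formatting gets in the way, here is a minimal
sketch using plain json_encode rather than Monolog's formatter API.
The record fields are hypothetical and only serve the example.

```php
<?php
// Hypothetical log records; the field names are illustrative only.
$records = array(
    array( 'channel' => 'CirrusSearchRequests', 'message' => 'first' ),
    array( 'channel' => 'CirrusSearchRequests', 'message' => 'second' ),
);

// Batch-style formatting: every record collapses into one JSON blob,
// which could only be produced to Kafka as a single message.
$batched = json_encode( $records );

// Per-record formatting: one payload per record, so each record can
// be produced to Kafka as its own message.
$perRecord = array_map( 'json_encode', $records );
```

With per-record payloads a consumer can decode any single message
without knowing about the others in the same request.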

Adds a new formatter which serializes to the Apache Avro
format. This is a compact binary format which uses
predefined schemas. This initial implementation is very
simple and takes the plain schemas as a constructor argument.
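
As a rough illustration of why schema-driven Avro output is compact,
the sketch below hand-encodes a record following the Avro binary
encoding rules (zigzag variable-length longs, length-prefixed strings,
fields written in schema order with no field names). It is not the
AvroFormatter from this change; the schema, field names and functions
are hypothetical, and it assumes 64-bit PHP integers.

```php
<?php
// Avro long: zigzag encoding followed by a base-128 varint.
function exampleAvroEncodeLong( $n ) {
    // Zigzag maps small positive and negative values to small codes.
    $z = ( $n << 1 ) ^ ( $n >> 63 );
    $out = '';
    while ( ( $z & ~0x7F ) !== 0 ) {
        // Low 7 bits with the continuation bit set.
        $out .= chr( ( $z & 0x7F ) | 0x80 );
        // PHP's >> sign-extends, so mask to emulate an unsigned shift.
        $z = ( $z >> 7 ) & ( PHP_INT_MAX >> 6 );
    }
    return $out . chr( $z );
}

// Avro string: byte length as a long, then the raw UTF-8 bytes.
function exampleAvroEncodeString( $s ) {
    return exampleAvroEncodeLong( strlen( $s ) ) . $s;
}

// Avro record: field values concatenated in schema order.
// Only 'string' and 'long' are handled in this sketch.
function exampleAvroEncodeRecord( array $schema, array $datum ) {
    $out = '';
    foreach ( $schema['fields'] as $field ) {
        $value = $datum[$field['name']];
        $out .= $field['type'] === 'string'
            ? exampleAvroEncodeString( $value )
            : exampleAvroEncodeLong( $value );
    }
    return $out;
}

// Hypothetical schema, NOT the real CirrusSearchRequests schema.
$schema = array(
    'type' => 'record',
    'name' => 'ExampleRecord',
    'fields' => array(
        array( 'name' => 'query', 'type' => 'string' ),
        array( 'name' => 'tookMs', 'type' => 'long' ),
    ),
);

$bytes = exampleAvroEncodeRecord(
    $schema,
    array( 'query' => 'foo', 'tookMs' => 64 )
);
// bin2hex( $bytes ) is '06666f6f8001': six bytes, no field names.
```

Because the field names live only in the schema, a reader needs the
same schema to decode the bytes, which is why the schemas are passed
to the formatter up front.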

Adds a new option to MonologSpi to wrap handlers in a
BufferHandler. The buffer is not flushed until the request
shuts down, which prevents any network requests in the
logger from adding latency to web requests.
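
The buffering idea can be sketched with a small stand-in class. This
is not Monolog's BufferHandler; the class name and the send callback
are hypothetical and only show records being held back until a flush
that is registered to run at shutdown.

```php
<?php
// Hypothetical stand-in for a buffering wrapper around a handler.
class ExampleBufferingWrapper {
    /** @var callable Receives the buffered records on flush */
    private $send;
    /** @var array Records collected during the request */
    private $buffer = array();

    public function __construct( $send ) {
        $this->send = $send;
        // Defer delivery until PHP runs its shutdown functions.
        register_shutdown_function( array( $this, 'flush' ) );
    }

    // Logging a record only appends to memory; no network I/O here.
    public function handle( array $record ) {
        $this->buffer[] = $record;
    }

    // Hand all buffered records to the wrapped sender exactly once.
    public function flush() {
        if ( $this->buffer ) {
            call_user_func( $this->send, $this->buffer );
            $this->buffer = array();
        }
    }
}

// Usage: $sent stands in for whatever a Kafka producer would receive.
$sent = array();
$wrapper = new ExampleBufferingWrapper(
    function ( array $records ) use ( &$sent ) {
        $sent = array_merge( $sent, $records );
    }
);
$wrapper->handle( array( 'message' => 'first' ) );
$wrapper->handle( array( 'message' => 'second' ) );
// $sent is still empty here; delivery happens on flush.
```

The cost of the network round trip is paid once at flush time instead
of once per log call.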

Related mediawiki/vendor update: Ibfe4bd2036ae8e998e2973f07bd9a6f057691578

The necessary config is something like:

array(

'loggers' => array(
    'CirrusSearchRequests' => array(
        'handlers' => array( 'kafka' ),
    ),
),
'handlers' => array(
    'kafka' => array(
        'factory' => '\\MediaWiki\\Logger\\Monolog\\KafkaHandler::factory',
        'args' => array( 'localhost:9092' ),
        'formatter' => 'avro',
        'buffer' => true,
    ),
),
'formatters' => array(
    'avro' => array(
        'class' => '\\MediaWiki\\Logger\\Monolog\\AvroFormatter',
        'args' => array(
            array(
                'CirrusSearchRequests' => array(
                    'type' => 'record',
                    'name' => 'CirrusSearchRequests',
                    'fields' => array( ... )
                ),
            ),
        ),
    ),
),

)

Bug: T106256
Change-Id: I6ee744b3e5306af0bed70811b558a543eed22840