Logging directly to logstash via the syslog input is more robust and matches current MediaWiki practice. Scap is one of 3 remaining udp2log input channels.
|Status||Assigned||Task|
|Open||None||T227080 Deprecate all non-Kafka logstash inputs|
|Open||None||T205856 Retire udp2log: onboard its producers and consumers to the logging pipeline|
|Open||None||T86969 Send scap log directly to logstash via syslog input|
Scap already formats its messages for logstash, so the change needed is to mimic the Monolog RedisHandler and rpush the log events onto the "logstash" list in redis on any of the logstash servers. The prod MW config randomly selects one of the 3 possible redis hosts for each web request. In beta there is only a single host to choose from.
Config should allow choosing udp2log and/or redis as output, and should get the list of redis servers from the existing cascading config system. When redis is configured, the list of servers should be shuffled and one entry picked (the first or the last, it doesn't matter which).
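A rough sketch of the shuffle-and-pick plus rpush idea, in Python since that is what Scap is written in. The function names are made up for illustration and are not part of Scap. The `client` argument is assumed to be any object with an `rpush(name, value)` method, such as a redis-py client; nothing here has been tried against the real servers.

```python
import json
import random


def pick_redis_server(servers, rng=random):
    """Shuffle a copy of the configured server list and return the first entry.

    `servers` is assumed to come from the cascading config; `rng` is only
    injectable so the choice can be made deterministic in tests.
    """
    if not servers:
        raise ValueError("no redis servers configured")
    candidates = list(servers)  # do not reorder the caller's config list
    rng.shuffle(candidates)
    return candidates[0]


def push_log_event(client, event, list_name="logstash"):
    """Serialize one log event as JSON and rpush it onto the redis list.

    `client` is an assumption: anything exposing rpush(name, value).
    """
    client.rpush(list_name, json.dumps(event))
```

With a single configured host, as in beta, `pick_redis_server` simply returns that host.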
The notes about redis input are horribly out of date. The redis input queue was killed fairly soon after being deployed.
Syslog is the transport used by MediaWiki these days, but several others are possible. There are two raw JSON inputs: the json codec over UDP on port 11514 and the json_lines codec over TCP on port 11514. These might be the easiest to use with a Python app like Scap. The python-logstash library or custom code should be pretty easy to use with either.
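For the custom-code route, something like the following might be enough for the json_lines TCP input. It is a minimal sketch using only the standard library: the function names and the event fields are illustrative assumptions, and the host would have to come from config.

```python
import json
import socket


def format_json_line(event):
    """Encode one event as a single newline-terminated JSON document.

    json.dumps escapes embedded newlines, so each event stays on one line,
    which is what a json_lines-style input expects.
    """
    return (json.dumps(event) + "\n").encode("utf-8")


def send_json_lines(host, events, port=11514, timeout=5.0):
    """Open one TCP connection and write each event as its own JSON line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for event in events:
            sock.sendall(format_json_line(event))
```

The UDP json input would look much the same, with one datagram per event instead of a newline-delimited stream.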
Update on udp2log deprecation: we've deployed the new logging infrastructure now. In other words, the recommended way is to log to the system's syslog daemon (that is, the unix socket at /dev/log) and opt programs in to having their syslog ingested into logstash via kafka. For JSON logging, the syslog payload should be prefixed with @cee: to signal structured/JSON logging. I'm happy to provide further guidance too!
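On the Python side, the @cee: prefix could be handled with a small logging formatter attached to the standard library's SysLogHandler. This is only a sketch: the class and function names, the "scap" tag, and the JSON field names are assumptions, and the fields the pipeline actually expects should be confirmed before relying on it.

```python
import json
import logging
import logging.handlers


class CeeJsonFormatter(logging.Formatter):
    """Render each record as '@cee: ' followed by a JSON object."""

    def format(self, record):
        payload = {
            # Field names are illustrative, not a confirmed schema.
            "message": record.getMessage(),
            "level": record.levelname,
            "logger": record.name,
        }
        return "@cee: " + json.dumps(payload)


def make_syslog_logger(name="scap"):
    """Return a logger that writes @cee-prefixed JSON to /dev/log."""
    handler = logging.handlers.SysLogHandler(address="/dev/log")
    handler.ident = name + ": "  # syslog tag placed before the payload
    handler.setFormatter(CeeJsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Usage would then be the ordinary logging API, for example `make_syslog_logger().info("sync finished")`, on a host where /dev/log exists.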