
Send scap log directly to logstash via syslog input
Open, Medium, Public

Description

Logging directly to logstash via the syslog input is more robust and matches current MediaWiki practice. Scap is one of 3 remaining udp2log input channels.

Event Timeline

bd808 created this task. Jan 15 2015, 9:13 PM
bd808 raised the priority of this task from to Needs Triage.
bd808 updated the task description.
bd808 added a project: Deployments.
bd808 added a subscriber: bd808.
Restricted Application added a subscriber: Aklapper. Jan 15 2015, 9:13 PM
greg triaged this task as High priority. Jan 15 2015, 9:29 PM
greg lowered the priority of this task from High to Medium.
greg moved this task from To Triage to Backlog (Tech) on the Deployments board.
bd808 added a comment. Edited Jan 15 2015, 10:26 PM

Scap already formats its messages for logstash, so the change needed is to mimic the Monolog RedisHandler and rpush the log events onto the "logstash" list in redis on any of the logstash servers. The prod MW config randomly selects a redis host for each web request from the three available. In beta there is only a single host to choose from.

Config should allow choosing udp2log and/or redis as output and getting a list of redis servers based on the existing cascading config system. When redis is configured, the list of servers should be shuffled and the first or last one picked.
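A minimal sketch of the selection logic described above, assuming a plain dict stands in for Scap's cascading config. The key names `udp2log_host` and `redis_hosts` and both function names are hypothetical, and the actual rpush to redis is left out:

```python
import random


def pick_redis_host(hosts, rng=random):
    """Shuffle a copy of the configured hosts and return the first one."""
    if not hosts:
        raise ValueError("no redis hosts configured")
    shuffled = list(hosts)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    return shuffled[0]


def select_outputs(config):
    """Return (kind, target) pairs for every output enabled in config.

    The config keys used here are illustrative assumptions, not
    Scap's real setting names.
    """
    outputs = []
    if config.get("udp2log_host"):
        outputs.append(("udp2log", config["udp2log_host"]))
    if config.get("redis_hosts"):
        outputs.append(("redis", pick_redis_host(config["redis_hosts"])))
    return outputs
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests; with a single configured host (as in beta) the shuffle is a no-op.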

bd808 renamed this task from [scap] Log directly to logstash via redis input to [scap] Log directly to logstash via syslog input. May 14 2015, 2:51 AM
bd808 updated the task description.
bd808 set Security to None.
greg edited projects, added scap2; removed Deployments. Feb 9 2016, 11:34 PM
mmodell edited projects, added Scap; removed scap2. Feb 10 2017, 6:22 PM
bd808 added a comment. Feb 10 2017, 6:33 PM

The notes about redis input are horribly out of date. The redis input queue was killed fairly soon after being deployed.

Syslog is the transport MediaWiki uses these days, but several others are possible. There are two raw JSON inputs: the json codec over UDP on port 11514 and the json_lines codec over TCP on port 11514. These might be the easiest to use with a Python app like Scap; the python-logstash library or custom code should be pretty easy to use with either.
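As a rough illustration of the custom-code route for the json_lines TCP input, the sketch below uses only the standard library. The function names and the event fields are hypothetical; only the port number comes from the comment above, and the logstash host has to be supplied by the caller:

```python
import json
import socket

LOGSTASH_PORT = 11514  # port named in the comment above


def encode_json_line(event):
    """Serialize one event as a single newline-terminated JSON line."""
    # json.dumps escapes embedded newlines, so each event stays on one line.
    return (json.dumps(event, separators=(",", ":")) + "\n").encode("utf-8")


def send_event(host, event, port=LOGSTASH_PORT, timeout=5.0):
    """Open a TCP connection, send one event, and close the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_json_line(event))
```

Opening a connection per event keeps the sketch short; a real logging handler would more likely keep the socket open and reconnect on failure.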

mmodell moved this task from Needs triage to Debt on the Scap board. Feb 1 2018, 12:18 AM

Update on udp2log deprecation: we've deployed the new logging infrastructure, so the recommended way is now to log to the system's syslog daemon (the unix socket at /dev/log) and opt programs in to having their syslog ingested into logstash via kafka. For JSON logging, the syslog payload should be prefixed with @cee: to signal structured/JSON logging. I'm happy to provide further guidance too!
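One way this could look from Python, as a sketch rather than what Scap actually ships: a `logging.Formatter` that prefixes a JSON payload with `@cee:`, attached to a standard `SysLogHandler` pointed at /dev/log. The class name, the helper name, the payload field names, the `scap: ` ident, and the space after the cookie are all assumptions:

```python
import json
import logging
import logging.handlers

CEE_COOKIE = "@cee: "  # the trailing space is an assumption


class CeeJsonFormatter(logging.Formatter):
    """Render each record as '@cee: ' followed by a JSON object."""

    def format(self, record):
        payload = {
            "message": record.getMessage(),
            "level": record.levelname,
            "logger": record.name,
        }
        return CEE_COOKIE + json.dumps(payload)


def make_syslog_handler(address="/dev/log", ident="scap: "):
    """Build a handler that writes to the local syslog unix socket."""
    handler = logging.handlers.SysLogHandler(address=address)
    handler.ident = ident  # prepended to every message as the program tag
    handler.setFormatter(CeeJsonFormatter())
    return handler
```

Usage would be along the lines of `logging.getLogger("scap").addHandler(make_syslog_handler())` on a host where /dev/log exists.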

greg renamed this task from [scap] Log directly to logstash via syslog input to Log directly to logstash via syslog input. Feb 12 2019, 11:29 PM

Change 493232 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] deployment_server: ship logs through logging pipeline

https://gerrit.wikimedia.org/r/493232

Change 493232 merged by Filippo Giunchedi:
[operations/puppet@production] deployment_server: ship logs through logging pipeline

https://gerrit.wikimedia.org/r/493232

fgiunchedi renamed this task from Log directly to logstash via syslog input to Send scap log directly to logstash via syslog input. Jul 2 2019, 12:42 PM

Change 563468 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[mediawiki/tools/scap@master] Support for logging json to syslog

https://gerrit.wikimedia.org/r/563468

Change 563468 merged by jenkins-bot:
[mediawiki/tools/scap@master] Support for logging json to syslog

https://gerrit.wikimedia.org/r/563468

The patch has been deployed; however, I overlooked adding @cee: and will follow up with a fix.

Change 629342 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[mediawiki/tools/scap@master] log: add @cee: token for structured logging

https://gerrit.wikimedia.org/r/629342

fgiunchedi moved this task from Backlog to Doing on the User-fgiunchedi board. Sep 23 2020, 12:20 PM

Change 629342 merged by jenkins-bot:
[mediawiki/tools/scap@master] log: add @cee: token for structured logging

https://gerrit.wikimedia.org/r/629342

@LarsWirzenius re: the next scap release to get this change included, is there a timeline? Thanks!

ASAP. I'm sorting things out so I can start the process soon.