Page MenuHomePhabricator

MediaWiki log spam during row D blip / rack D2 unavailable
Open, Needs TriagePublic

Description

[Related but different than T233623: MediaWiki log spam during master unavailable / read only, this one is about spam during https://wikitech.wikimedia.org/wiki/Incident_documentation/20190920-d2_switch_failure]

We've observed a large amount of logs sent by MW during the incident above, namely a peak of ~130k msg/s (see below) with the biggest chuck being from memcache channel:

e.g. Memcached error for key "WANCache:t:ptwiki:messages:pt" on server "127.0.0.1:11213": SERVER ERROR (https://logstash.wikimedia.org/goto/8c177e65b10582aaf15a904b91977751)

while the rest was mostly dbconnection / dbreplication channels: https://logstash.wikimedia.org/goto/006c2fa6f31a1d6b7138e673cb077a20

There should be some logging rate limit (per mw channel + level?) so that similar situations are handled more gracefully

2019-09-24-183428_1075x1185_scrot.png (1×1 px, 192 KB)

Event Timeline

Change 538931 had a related patch set uploaded (by Herron; owner: Herron):
[operations/puppet@production] logstash: throttle duplicate normalized_message with level:ERR*

https://gerrit.wikimedia.org/r/538931

Another type of spam that we have observed, php-generated but comes through by means of syslog + apache: https://logstash.wikimedia.org/goto/c5afb82f07a0249524d66b74c82e55c4

2019-09-24T11:35:36	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321
2019-09-24T11:35:35	mw1341	NOTICE	PHP Notice:  Undefined variable: wmgFileExporterBetaFeature in /srv/mediawiki/wmf-config/CommonSettings.php on line 3321

Change 538931 merged by Herron:
[operations/puppet@production] logstash: throttle duplicate normalized_message with level:ERR*

https://gerrit.wikimedia.org/r/538931

Change 540465 had a related patch set uploaded (by Herron; owner: Herron):
[operations/puppet@production] logstash: add id to drop action in filter-throttle-errors

https://gerrit.wikimedia.org/r/540465

Change 540465 merged by Herron:
[operations/puppet@production] logstash: add id to drop action in filter-throttle-errors

https://gerrit.wikimedia.org/r/540465