Page MenuHomePhabricator

MediaWiki logging indexing conflict on 'session' for 'session-ip' channel
Closed, ResolvedPublic

Description

This indexing error is popping up in logstash events, apparently due to session-ip channel logging session as a string while it is normally a JSON object.

I'm not pasting the specific log entries, however this is a log that's being indexed: https://logstash-next.wikimedia.org/app/dashboards#/doc/acba6310-f6d3-11ea-b848-090a7444f26c/logstash-mediawiki-2020.12.08?id=pjXKQnYBjr5R1RLChowm

while for channel session-ip the session field is logged as a string.

Event Timeline

Reedy renamed this task from Mediawiki logging indexing conflict on 'session' for 'session-ip' channel to MediaWiki logging indexing conflict on 'session' for 'session-ip' channel.Dec 8 2020, 5:53 PM
Reedy added a subscriber: Reedy.

Is this a new/recent behaviour/change?

I'm wondering if this is potential fallout from the upgrade to monolog 2.0 in T242751: Update monolog/monolog to 2.1.1 or later, or whether it's another MW change, or something just being surfaced by the logstash et al upgrade

AFAICS for session it started a couple of days ago: https://logstash.wikimedia.org/goto/843581652ed2250401a77167eb49554e . The indexing errors shows up both on ELK 5 and 7 so not related to the upgrade. Looks like sth in php-1.36.0-wmf.20 ? That'd be my best bet

Yeah, it's probably related to the upgrade to monolog 2.0

I don't think it's related to the fix for T269286: Monolog update removes exception object from logstash, which is in the logstash formatter... The formatter doesn't do anything (explicitly) with any keys with 'session' in them

https://github.com/wikimedia/mediawiki/blame/master/includes/session/SessionManager.php#L1073-L1082 is the only usage of 'session-ip' I can find...

And looking Session::getId() it's always been a string (but obviously it could've been json and still been a string)... https://github.com/wikimedia/mediawiki/blame/master/includes/session/Session.php#L76-L82

The 'session' log channel is much more widely used, in various places

This doesn't co-incide with any of the redis -> kask session work does it? As such, T206016: Create a service for session storage and substasks

And looking Session::getId() it's always been a string (but obviously it could've been json and still been a string)...

Ah, I see it is still a string with JSON in it (from the logstash-next link)... Just not being interpretted as a json object by logstash (is that because MW/Monlog isn't annotating it correctly?)...

{
  "MediaWiki\\Session\\SessionInfo": "[50]CentralAuthSessionProvider<+:REDACTED:REDACTED>REDACTED"
}

https://github.com/Seldaek/monolog/releases/tag/2.0.0
https://github.com/Seldaek/monolog/releases/tag/2.0.1
https://github.com/Seldaek/monolog/releases/tag/2.0.2

It is a json object as sent by session channel, however in other cases the field session is sent as a string (i.e. just the session ID) and not an object, and this conflict prevents proper indexing. Moreover, the session field type will flip/flop between object and string every day depending on which type gets indexed first at midnight UTC. IMHO the easiest fix would be to send session always as a string.

Examples:

Reedy added subscribers: Pchelolo, daniel.

Ah. It looks like this might not necessarily be the monolog upgrade...

When it is a json object, is it always (or at least from what you've seen) mentioning CentralAuthSessionProvider?

T268617: Create a variation of the CentralAuthTokenSessionProvider based on an Authorization header/rECAU6c0eb6c92f36: Create CentralAuthHeaderSessionProvider seems potentially suspect (though not necessarily to blame) with other session work in that area...

CC @daniel and @Pchelolo

Grepping over the codebase for 'session' => I can find numerous places where 'session' key is filled with sessionId and a numbers other places where it's filled with session info. For example, SessionManager as a whole logs SessionInfo, while SessionBackend class logs sessionId.

I am not sure what exactly has caused this problem to surface now, that's for sure not @daniel patches since they are not yet deployed anywhere (rolling on wmf.21). The instances where object is provided we are placing instances of SessionInfo object into the log context. I'm not sure how it is turned into array by monolog though - it's not JsonSerializable, it doesn't have public properties. Maybe monolog before 2.0 was just omitting this key since it didn't know how to turn an instance of SessionInfo into an array?

This is obviously a bug in MW - we should have a convention on what the 'session' key stand for. Let's make it always sessionId.

.21 is on group0 and group1 fwiw, so they are deployed on quite a few wikis, that and one of the linked changes are clearly on .21 due to paths - /srv/mediawiki/php-1.36.0-wmf.21/includes/session/SessionManager.php

I was more highlighting that there was work in the area, but did say they weren't necessarily to blame

This is still happening AFAICS, it is essentially a matter of chance whether session as an object or as a string is "valid" for a given day (i.e. depending on which type makes in first to elasticsearch at midnight UTC when the indices roll over)

Change 658433 had a related patch set uploaded (by Cwhite; owner: Cwhite):
[mediawiki/core@master] Update logging to use the concrete session value

https://gerrit.wikimedia.org/r/658433

This seems like a behavior change in Monolog 2, and I think the sane behavior would be to use the object's __toString if it has one, that's kind of the purpose of that method. Monolog 2 instead seems to convert to [ <classname> => <__toString()> ] which I don't think has any benefit.

Change 658698 had a related patch set uploaded (by Cwhite; owner: Cwhite):
[mediawiki/core@master] Add LogstashFormatter pre-formatter step to respect string representations

https://gerrit.wikimedia.org/r/658698

Change 658433 merged by jenkins-bot:
[mediawiki/core@master] Update SessionManager to pass session string representation to logger

https://gerrit.wikimedia.org/r/658433

Change 663890 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/663890

Change 663891 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/CentralAuth@master] Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/663891

Change 658698 abandoned by Cwhite:
[mediawiki/core@master] Add LogstashFormatter pre-formatter step to respect string representations

Reason:
in favor of the approach applied in If92d60289fa925d19261ef912e2f2a5d31db31dc, I7a213417c6ce9741f190beb4bf626059a97a2f4f, Ia7875346cec95fbc30bc4579ec8b7fb60e911d12, and I8d442a7bc066bc0f2a1b69ceb1a5e1a863102685

https://gerrit.wikimedia.org/r/658698

Change 663891 merged by jenkins-bot:
[mediawiki/extensions/CentralAuth@master] session: Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/663891

Change 663890 merged by jenkins-bot:
[mediawiki/core@master] Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/663890

Change 664653 had a related patch set uploaded (by Reedy; owner: Gergő Tisza):
[mediawiki/extensions/CentralAuth@REL1_35] session: Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/664653

Change 664705 had a related patch set uploaded (by Reedy; owner: Cwhite):
[mediawiki/core@REL1_35] Update SessionManager to pass session string representation to logger

https://gerrit.wikimedia.org/r/664705

Change 664706 had a related patch set uploaded (by Reedy; owner: Gergő Tisza):
[mediawiki/core@REL1_35] Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/664706

Change 664653 merged by jenkins-bot:
[mediawiki/extensions/CentralAuth@REL1_35] session: Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/664653

Change 664705 merged by jenkins-bot:
[mediawiki/core@REL1_35] Update SessionManager to pass session string representation to logger

https://gerrit.wikimedia.org/r/664705

Change 664706 merged by jenkins-bot:
[mediawiki/core@REL1_35] Call __toString explicitly when passing an object to the logger

https://gerrit.wikimedia.org/r/664706

colewhite claimed this task.

Since the latest MediaWiki deploy, these indexing errors have ceased.

Thanks, all!

Confirmed by looking at https://logstash.wikimedia.org/app/dashboards#/view/session over 4 weeks with either channel.keyword: session or channel.keyword: session-ip.