
Activity session ID seems to persist too long in some cases
Open, Needs Triage, Public

Description

The wikistories_consumption_event stream contains an activity_session_id field populated using mw.eventLog.id.getSessionId. This session ID is supposed to expire after 30 minutes of inactivity on the project.

While doing a quality check on the stream (T312262), I noticed that some sessions seem to persist much too long. Here are the ten longest in my dataset (where duration is measured by time between the first and last event with the same session ID).

approx_duration            event_count
2 days 07:20:38.820000     5
1 days 05:31:53.321000     2
1 days 00:59:31.688000     13
0 days 21:30:48.003000     34
0 days 09:03:21.214000     5
0 days 08:37:44.958000     18
0 days 06:59:10.050000     2
0 days 05:47:38.344000     11
0 days 05:20:11.344000     2
0 days 05:13:03.420000     27
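For anyone who wants to reproduce this kind of check, here is a minimal sketch of the duration measurement (time between the first and last event sharing a session ID). It assumes the events are available as (session ID, timestamp) pairs; the function name and the sample data are made up for illustration and are not taken from the real stream.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def session_durations(events):
    """Map each session ID to (duration, event_count).

    `events` is an iterable of (session_id, timestamp) pairs. Duration is the
    time between the first and last event that share a session ID.
    """
    by_session = defaultdict(list)
    for session_id, timestamp in events:
        by_session[session_id].append(timestamp)
    return {
        session_id: (max(stamps) - min(stamps), len(stamps))
        for session_id, stamps in by_session.items()
    }


# Made-up sample events (hypothetical, not from wikistories_consumption_event).
sample_events = [
    ("a", datetime(2022, 11, 1, 10, 0)),
    ("a", datetime(2022, 11, 1, 10, 20)),
    ("a", datetime(2022, 11, 2, 9, 0)),
    ("b", datetime(2022, 11, 1, 12, 0)),
]

durations = session_durations(sample_events)

# Ten longest sessions, longest first.
longest = sorted(durations.items(), key=lambda item: item[1][0], reverse=True)[:10]
```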

Looking at the individual events, these all seem to follow the same pattern: all but one event fall within a very plausible span of time (about 30 minutes or less), and then a final event arrives many hours later. The page of the final event is usually, but not always, a page that has appeared previously in the session. This stream only includes events relating to Wikistories, a new beta feature on the Indonesian Wikipedia, so I expect to see breaks of longer than 30 minutes between events without a new session ID. However, this pattern involves breaks so long they're completely implausible.

After some digging, I found that 2 hr, 30 min is a rough dividing line between long sessions that seem correct (with lots of events spread evenly across the duration) and long sessions that seem incorrect (following the pattern of a plausible session followed by a single extremely delayed event). In my dataset of 942 sessions, 3.1% were longer than this cutoff, suggesting that the incidence of this issue is small but not negligible.
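One possible way to flag the suspicious pattern automatically is sketched below. The heuristic is my own simplification, not an established method: a session is flagged when all events except the last fit within about 30 minutes and the final event arrives more than 2 hr, 30 min after the one before it. Both thresholds are the rough values mentioned above, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Rough thresholds taken from the description; adjust as needed.
PLAUSIBLE_SPAN = timedelta(minutes=30)
LONG_CUTOFF = timedelta(hours=2, minutes=30)


def has_delayed_final_event(timestamps,
                            plausible_span=PLAUSIBLE_SPAN,
                            long_cutoff=LONG_CUTOFF):
    """Return True if a session looks like a plausible burst of events
    followed by a single, extremely delayed final event."""
    stamps = sorted(timestamps)
    if len(stamps) < 2:
        return False
    body_span = stamps[-2] - stamps[0]    # span of all events except the last
    final_gap = stamps[-1] - stamps[-2]   # delay before the final event
    return body_span <= plausible_span and final_gap > long_cutoff


# Made-up examples (hypothetical data).
start = datetime(2022, 11, 1, 10, 0)
suspicious_session = [
    start,
    start + timedelta(minutes=10),
    start + timedelta(minutes=25),
    start + timedelta(hours=9),
]
evenly_spread_session = [start + timedelta(minutes=20 * i) for i in range(12)]
```

With these inputs, `has_delayed_final_event(suspicious_session)` is true and `has_delayed_final_event(evenly_spread_session)` is false, since the second session has no single outlying gap.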

The pattern makes it seem like the session ID is getting reset after a long interval of inactivity, but not immediately, so there's one event with the old ID before a new one is generated.
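To make that hypothesis concrete, here is a toy model of how such an off-by-one could arise. This is purely illustrative and is not the actual mw.eventLog code: the class, its method, and the ID format are all invented, and I have not confirmed that the real implementation works this way. In the toy, the ID for an event is read before the expiry check runs, so the first event after a long gap still carries the stale ID.

```python
class ToySessionTracker:
    """Hypothetical session tracker with a 'late reset' bug (not real MediaWiki code)."""

    def __init__(self, timeout_seconds=30 * 60):
        self.timeout_seconds = timeout_seconds
        self.session_id = None
        self.last_activity = None
        self._counter = 0

    def _new_id(self):
        self._counter += 1
        return f"session-{self._counter}"

    def id_for_event(self, now):
        """Return the session ID attached to an event at time `now` (seconds)."""
        if self.session_id is None:
            self.session_id = self._new_id()
        # Bug being modelled: the ID is read BEFORE the expiry check...
        used_id = self.session_id
        # ...so the reset only takes effect for the NEXT event.
        expired = (
            self.last_activity is not None
            and now - self.last_activity > self.timeout_seconds
        )
        if expired:
            self.session_id = self._new_id()
        self.last_activity = now
        return used_id


tracker = ToySessionTracker()
# Three events within 20 minutes, then one about 11 hours later, then another.
event_times = [0, 600, 1200, 40000, 40060]
ids = [tracker.id_for_event(t) for t in event_times]
```

In this toy run the event at 40000 seconds still gets the first session's ID and only the following event gets a new one, which matches the shape of the pattern in the data.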

For context, the wikistories_consumption_event stream is the first to use this session ID (the ID is generated but not actually logged by the session_tick instrumentation). This previously led to the discovery of one major bug with this session ID: T314622.

Event Timeline

@phuedx interested in your thoughts on this 😊

nshahquinn-wmf renamed this task from "Session tick session ID seems to persist too long in some cases" to "Activity session ID seems to persist too long in some cases". Nov 3 2022, 6:35 PM