
Add guards for session stitching
Closed, Resolved (Public)


As a software engineer instrumenting complex session-based workflows, I want to be able to set different sampling ratios per schema and be assured that stitching will work predictably without knowing the internals, so that later per-session data analysis is simplified.

As described in T205569#4726503 and T205569#4727982, it's possible for an Event Logging client to apply session-based sampling to two or more schemas in a way that leaves too few sessions sampled in common for session stitching. Specifically, if the populationSize argument to the function sessionInSample is set differently between schemas, the likelihood of finding events to stitch is diminished.
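
To make the effect concrete, here is a minimal sketch in JavaScript. It assumes that a session is in the sample when a number derived from its session ID is divisible by populationSize; that assumption, along with the helper names inSample and overlapFraction, is illustrative and not taken from the EventLogging source.

```javascript
// Hypothetical stand-in for sessionInSample. ASSUMPTION: a session is
// sampled when its numeric session token is divisible by populationSize.
function inSample(sessionNumber, populationSize) {
    return sessionNumber % populationSize === 0;
}

// Fraction of simulated sessions that are sampled for BOTH schemas,
// i.e. the sessions whose events could be stitched together.
function overlapFraction(sizeA, sizeB, sessions) {
    let both = 0;
    for (let s = 0; s < sessions; s++) {
        if (inSample(s, sizeA) && inSample(s, sizeB)) {
            both++;
        }
    }
    return both / sessions;
}

// Same populationSize on both schemas: every sampled session is stitchable.
const matched = overlapFraction(10, 10, 3000);

// Different populationSize: only sessions divisible by both 10 and 15
// (that is, by 30) appear in both samples.
const mismatched = overlapFraction(10, 15, 3000);

console.log(matched, mismatched);
```

Under this assumption, matching sizes of 10 leave 1 in 10 sessions stitchable, while sizes of 10 and 15 leave only 1 in 30, so two thirds of the sessions sampled for the first schema have no counterpart in the second.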

Users of the JavaScript and PHP APIs should have prescriptive guidance for session stitching. For example, inline documentation could be enhanced, and the JavaScript and PHP functions could be updated to emit warnings. This might be complemented by convenience methods that disallow problematic populationSize arguments and make the context explicit (e.g., logging in a session-stitching manner). These are only examples of how the API might be improved; there are several potential options, some of them complementary.
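
As one possible shape for such a guard, the sketch below records the populationSize first used within a named stitching group and warns when another schema in the same group uses a different value. Everything here (createStitchingGuard, the group name, the schema names, and the warning text) is hypothetical and not part of the existing API.

```javascript
// Hypothetical guard, not part of EventLogging. `warn` is any function
// that accepts a message string (for example, console.warn).
function createStitchingGuard(warn) {
    // Maps a stitching group name to the first schema and size seen for it.
    const firstSeenByGroup = new Map();

    // Returns true when populationSize is consistent within the group,
    // false (after warning) when it conflicts with an earlier schema.
    return function checkPopulationSize(group, schema, populationSize) {
        if (!Number.isInteger(populationSize) || populationSize < 1) {
            throw new RangeError("populationSize must be a positive integer");
        }
        const first = firstSeenByGroup.get(group);
        if (first === undefined) {
            firstSeenByGroup.set(group, { schema: schema, populationSize: populationSize });
            return true;
        }
        if (first.populationSize !== populationSize) {
            warn(
                "Schema " + schema + " uses populationSize " + populationSize +
                " but " + first.schema + " uses " + first.populationSize +
                " in stitching group " + group + "; stitching may find few matching sessions."
            );
            return false;
        }
        return true;
    };
}

// Usage with made-up schema names.
const warnings = [];
const check = createStitchingGuard(function (message) {
    warnings.push(message);
});
const firstOk = check("search-flow", "SchemaA", 10);
const secondOk = check("search-flow", "SchemaB", 10);
const thirdOk = check("search-flow", "SchemaC", 15);
```

A warning rather than an exception keeps existing instrumentation working while still surfacing the mismatch; a stricter variant could throw instead.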