Understanding first day: aggregate and store EditorJourney data
We decided in T209721 that we will not retain any EditorJourney data beyond 90 days. We do, however, want to store data in aggregate. The main objective of the schema is to determine the most common workflows newcomers go through on their first day, and @nettrom_WMF is moving toward that level of understanding as he continues to analyze the dataset. The number of newcomers who go through each workflow is one example of something we would want to aggregate. Other examples might include:

  • Weekly or monthly counts of newcomers who do any of a certain set of activities on their first day.
  • Weekly or monthly counts of how many newcomers create their accounts from each context.

We can probably start aggregating and storing metrics as soon as we identify ones we'll want to have later -- before we come up with the final newcomer workflows or anything more sophisticated.
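
As a rough illustration of the kind of aggregate described in the list above, here is a minimal Python sketch that counts distinct newcomers per week and account-creation context. The tuple layout (user ID, registration date, context), the function names, and the context labels "edit" and "read" are all assumptions made up for this example; they are not the actual EditorJourney schema, and the sample rows are invented.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Monday of the week containing `day`."""
    return day - timedelta(days=day.weekday())


def weekly_counts_by_context(events):
    """Count distinct newcomers per (week start, context).

    `events` is an iterable of (user_id, registration_date, context)
    tuples. This layout is hypothetical, not the real schema.
    """
    users = defaultdict(set)
    for user_id, registered, context in events:
        # A set per bucket means a user logged several times counts once.
        users[(week_start(registered), context)].add(user_id)
    return {key: len(ids) for key, ids in users.items()}


# Invented sample rows, for illustration only.
sample = [
    (1, date(2019, 1, 7), "edit"),
    (1, date(2019, 1, 7), "edit"),   # same user logged twice
    (2, date(2019, 1, 9), "edit"),
    (3, date(2019, 1, 10), "read"),
    (4, date(2019, 1, 15), "read"),
]
counts = weekly_counts_by_context(sample)
```

Only the per-bucket totals in `counts` would need to be stored; the per-user rows can then be dropped once they reach the 90-day limit. Switching to monthly buckets would only require replacing `week_start` with a function that returns the first day of the month.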