
Homepage: EventLogging schema(s)
Closed, Resolved · Public

Description

This task is to develop the EventLogging schema(s) to be used for instrumenting the newcomer homepage, and to put the schema(s) on Meta Wiki. We should keep in mind that multiple schemas are an option.

This is the measurement specifications document, which should list everything needed to develop the schemas.

After the schema(s) are specified, we will be able to send them for approval to the Legal and Analytics teams.

Event Timeline

I think these are done:

We have also extended the help panel schema to accept "homepage_help" and "homepage_mentor" as editor interface values.

FYI we are very close to being able to support 'map types' in JSONschemas + Hive: T215442: Make Refine use JSONSchemas of event data to support Map types and proper types for integers vs decimals

This might make it easier for fields like action_data where you want to encode slightly more free-form information about the action taken.
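To illustrate the difference, here is a minimal sketch of how a map-typed action_data field could be declared and checked. It assumes that a map is expressed in JSON Schema as an object whose additionalProperties are all strings. The "action" field, the sample keys, and both helper functions are hypothetical and are not taken from the actual homepage schemas.

```python
# Hypothetical schema fragment: only the name "action_data" comes from this task.
# Assumption: a map type is an object whose additional properties are all strings.
SCHEMA_FRAGMENT = {
    "type": "object",
    "properties": {
        "action": {"type": "string"},
        "action_data": {
            "type": "object",
            "additionalProperties": {"type": "string"},
        },
    },
    "required": ["action"],
}


def is_string_map(value):
    """Return True if value is a dict whose keys and values are all strings."""
    return isinstance(value, dict) and all(
        isinstance(k, str) and isinstance(v, str) for k, v in value.items()
    )


def encode_action_data_flat(data):
    """Fallback without map support: pack key/value pairs into one string."""
    return ";".join(f"{key}={value}" for key, value in sorted(data.items()))


# Illustrative event; the keys inside action_data are made up.
event = {"action": "click", "action_data": {"target": "help", "position": "1"}}

if is_string_map(event["action_data"]):
    # With map support the structure can be stored and queried by key.
    print(event["action_data"]["target"])
else:
    print("action_data is not a string-to-string map")

# Without map support the same information ends up in a single opaque string.
print(encode_action_data_flat(event["action_data"]))
```

With a map type, each key can be queried directly in Hive instead of being parsed out of an encoded string.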

We will use this task to document our DACI Approvals.

Approval from Leighanna: You have Legal's approval for the DACI.
Based on your plan, it looks like you'll need another extension under the data retention guidelines. That will follow the same process we used for the HelpPanel. We don't need to sort that out right away, only before you reach the 90-day mark on the data you collect through this project, though obviously sooner is better than later.

Approval from Security: https://phabricator.wikimedia.org/T219289#5094347

Retention Guidelines were revised by Legal to read: New editor research: There is an additional short-term extension for data collected as part of the Personalized first day project. Most EventLogging data collected for this project will be retained for up to nine months, beginning on 6 May 2019; page IDs, page titles, and longer-term session IDs will still be deleted, aggregated, or de-identified after at most 90 days. This extension is necessary in order to gain a statistically significant sample from a small pool of users and analyze the information collected. The retained data will be deleted, aggregated, or de-identified no later than 6 February 2020.

Our initial EventLogging has been working well on desktop and we have our purging strategy under control.