
MobileWebSectionUsage schema is whitelisting both session ids and page ids
Closed, Resolved (Public)

Description

Hi @Jdlrobson and @JKatzWMF! The whitelist for the MobileWebSectionUsage EventLogging schema, which you maintain, permanently retains fields that contain both unique session IDs and page IDs. Per our data retention guidelines, we cannot keep those two items together for more than 90 days, so one of them must be removed from the whitelist.

Please let me know which one of the following two you'd like to keep:

  • Session IDs => keep sessionId
  • Page IDs => keep pageId
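The rule described above (session IDs and page IDs may not be retained together) can be illustrated with a small check. This is only a sketch under stated assumptions: the dictionary layout, the `FORBIDDEN_TOGETHER` list, the `conflicting_pairs` helper, and the `eventName` field are all hypothetical and do not reflect the actual whitelist format in analytics/refinery.

```python
# Hypothetical sketch: NOT the real analytics/refinery whitelist format.
# Pairs of fields that may not be retained together past the retention window.
FORBIDDEN_TOGETHER = [("sessionId", "pageId")]


def conflicting_pairs(whitelisted_fields):
    """Return every forbidden pair that the whitelist would retain together."""
    fields = set(whitelisted_fields)
    return [pair for pair in FORBIDDEN_TOGETHER if fields.issuperset(pair)]


# Illustrative whitelist; "eventName" is a made-up field for the example.
whitelist = {"MobileWebSectionUsage": ["sessionId", "pageId", "eventName"]}

for schema, fields in whitelist.items():
    for pair in conflicting_pairs(fields):
        print(f"{schema}: remove one of {pair} from the whitelist")
```

Removing either field from the list makes `conflicting_pairs` return an empty list, which is the outcome this task asks for.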

Event Timeline

fdans created this task.

This schema is marked as inactive and we're not currently using it. What am I missing?
cc @Tbayer

@Jdlrobson there's data from this schema persisted beyond the 90-day window, and it contains the fields mentioned above. Even though the schema is no longer active, we need to remove one of the two fields from the whitelist so that the purging script deletes the corresponding data and we comply with our retention guidelines.
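As a rough illustration of why the whitelist matters even for an inactive schema, here is a minimal sketch of whitelist-driven purging. It assumes that fields not on the whitelist are dropped once an event is older than 90 days; the `sanitize` function, the event layout, and the sample values are hypothetical and are not the actual purging script.

```python
from datetime import datetime, timedelta, timezone

# Retention window taken from the discussion; everything else is assumed.
RETENTION = timedelta(days=90)


def sanitize(event, whitelist, now):
    """Return recent events unchanged; keep only whitelisted fields of old ones."""
    if now - event["timestamp"] <= RETENTION:
        return dict(event)
    return {key: value for key, value in event.items() if key in whitelist}


# Made-up dates and values, chosen only so the event is older than 90 days.
now = datetime(2019, 1, 1, tzinfo=timezone.utc)
old_event = {
    "timestamp": datetime(2018, 6, 1, tzinfo=timezone.utc),
    "sessionId": "abc123",
    "pageId": 42,
}

# With sessionId removed from the whitelist, only timestamp and pageId survive.
print(sanitize(old_event, {"timestamp", "pageId"}, now))
```

While both `sessionId` and `pageId` stay on the whitelist, an event of any age keeps both fields, which is the situation this task reports.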

Got it. I guess @Tbayer is the person to talk to here.

Yes, I can take care of this.

Discussed with @ovasileva today - we are going to remove the session IDs and keep the page names. I will submit a patch soon.

Change 480470 had a related patch set uploaded (by Fdans; owner: Fdans):
[analytics/refinery@master] Remove sessionID from whitelist for MobileWebSectionUsage schema

https://gerrit.wikimedia.org/r/480470

Change 480470 merged by Nuria:
[analytics/refinery@master] Remove sessionID from whitelist for MobileWebSectionUsage schema

https://gerrit.wikimedia.org/r/480470