
Update purging settings for Schema:Popups
Closed, Resolved | Public | 3 Estimated Story Points

Description

(This schema has seen several updates in 2016 and 2017 since it was first entered into the whitelist.)
As per the talk page documentation: auto-purge only the eventCapsule PII after 90 days and keep the rest indefinitely. This means the whitelist needs to be updated to cover the following fields:
event_namespaceIdSource
event_pageIdSource (this is basically redundant to the already whitelisted event_pageTitleSource, but makes some queries easier)
event_namespaceIdHover
event_isAnon
event_totalInteractionTime
event_hovercardsSuppressedByGadget
event_perceivedWait
event_editCountBucket
event_previewCountBucket
event_linkInteractionToken (unique to this schema)
event_pageToken (unique to this schema)
event_previewType
event_version

Also, "event_sessionId" in the current whitelist looks like a typo (AFAICS there was never a field with that name in this schema) and should read "event_sessionToken" instead.
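
A check along the following lines would catch a whitelist entry that names no real field. This is only a minimal sketch, not the Analytics team's tooling: `SCHEMA_FIELDS` is a hand-copied, partial list of field names mentioned in this task (not the authoritative schema definition), the whitelist excerpt is illustrative, and `find_unknown_fields` is a hypothetical helper.

```python
# Partial list of field names mentioned in this task. This is NOT the
# authoritative Schema:Popups definition; it is assumed for illustration.
SCHEMA_FIELDS = {
    "event_action",
    "event_sessionToken",
    "event_pageTitleSource",
    "event_pageIdSource",
    "event_version",
}


def find_unknown_fields(whitelisted, schema_fields):
    """Return whitelisted names that the schema does not define, sorted."""
    return sorted(set(whitelisted) - set(schema_fields))


# Hypothetical excerpt of the current whitelist entries for this schema.
current_whitelist = ["event_action", "event_sessionId", "event_pageTitleSource"]

unknown = find_unknown_fields(current_whitelist, SCHEMA_FIELDS)
print(unknown)
```

Any name reported here is either a typo in the whitelist or a field that was dropped from the schema, so each one still needs a manual look.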

Event Timeline

Jdlrobson added a subscriber: Jdlrobson.

@Tbayer or @mforns, are you planning to do this, or do you need web team assistance (i.e., do we need to plan it for an upcoming sprint)?

@Jdlrobson As you can see, it's assigned to a member of the Analytics team. I expect that no work from the web team will be required, so feel free to remove it from the project.

In the following update to the white-list
https://gerrit.wikimedia.org/r/#/c/298721/9..10/files/mariadb/eventlogging_purging_whitelist.tsv
I applied the following changes:

### Fields that were already on the list (no action)
event_action
wiki
webHost
isTruncated
clientValidated
event_duration
event_pageTitleHover
event_pageTitleSource
event_popupDelay
event_popupEnabled

### Added the following fields
event_sessionToken
event_version
event_pageIdSource
event_namespaceIdSource
event_namespaceIdHover
event_isAnon
event_totalInteractionTime
event_previewType
event_hovercardsSuppressedByGadget
event_perceivedWait
event_editCountBucket
event_previewCountBucket
event_linkInteractionToken
event_pageToken

### Removed the following fields
event_sessionId

Please review! We'll merge this change shortly and hopefully run the purging script before the end of the quarter.
BTW, thank you guys for trying to make non-sensitive schemas by using bucketized fields!
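
For reviewers, the three lists above can be cross-checked mechanically. The following is a rough sketch using plain Python sets. The field names are copied from this comment, while the variable names and the idea of treating the whitelist as a set of field names are assumptions for illustration; the sketch does not parse the actual TSV file.

```python
# Fields that were already whitelisted and stay untouched (from this comment).
already_listed = {
    "event_action", "wiki", "webHost", "isTruncated", "clientValidated",
    "event_duration", "event_pageTitleHover", "event_pageTitleSource",
    "event_popupDelay", "event_popupEnabled",
}

# Whitelist before the change: the untouched fields plus the mistyped entry.
old_whitelist = already_listed | {"event_sessionId"}

# Whitelist after the change: the untouched fields plus the newly added ones.
new_whitelist = already_listed | {
    "event_sessionToken", "event_version", "event_pageIdSource",
    "event_namespaceIdSource", "event_namespaceIdHover", "event_isAnon",
    "event_totalInteractionTime", "event_previewType",
    "event_hovercardsSuppressedByGadget", "event_perceivedWait",
    "event_editCountBucket", "event_previewCountBucket",
    "event_linkInteractionToken", "event_pageToken",
}

added = new_whitelist - old_whitelist      # fields kept indefinitely from now on
removed = old_whitelist - new_whitelist    # entries dropped from the whitelist
unchanged = old_whitelist & new_whitelist  # entries needing no action

print(f"added: {len(added)}, removed: {sorted(removed)}, unchanged: {len(unchanged)}")
```

Comparing `added` against the field list in the task description is a quick way to confirm that nothing requested was missed.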

mforns set the point value for this task to 3. (Jun 16 2017, 3:12 PM)
Milimetric triaged this task as Medium priority. (Jun 22 2017, 3:07 PM)

Will move this task to done, because the editing of the white-list is finished and will be merged in a Gerrit patch belonging to another task: T156933.