Whitelist CentralNotice banner history events for sanitization and long-term storage
Closed, Resolved · Public · 1 Estimated Story Points

Description

See T161656#5874648.

Event Timeline

Change 572767 had a related patch set uploaded (by AndyRussG; owner: AndyRussG):
[analytics/refinery@master] Add CentralNoticeBannerHistory schema to EventLogging whitelist

https://gerrit.wikimedia.org/r/572767

This would take effect once we deploy it, but Analytics needs to run a refine job from the beginning of the data stream being available. cc @fdans (who has the ops week) and @Ottomata

Thanks so much!!!!!

needs to run a refine job from the beginning of the data stream being available

There should be 90 days available now, but the sanitize_delayed job will start sanitizing from 45 days ago once refinery is deployed. @AndyRussG, do you need the data that is 45 to 90 days old to be manually sanitized into the event_sanitized database too, or is starting from 45 days ago OK?

We do need the 90 days... or nearly. The year-end English-language campaign started on December 2nd, so, from today, at least 80 days, really... Thanks again!!
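For anyone checking the dates, here is a rough sketch of the arithmetic discussed above: which days fall outside the automatic 45-day sanitization window but are still within the 90-day retention window, and so would need a manual backfill. The function name `backfill_window` and the example "today" date are hypothetical, chosen only for illustration; only the 90-day and 45-day figures and the December 2nd campaign start come from this thread.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90       # raw events are purged after 90 days (per this thread)
SANITIZE_DELAY_DAYS = 45  # sanitize_delayed starts 45 days back (per this thread)


def backfill_window(today, needed_since):
    """Return (first_day, last_day) needing manual sanitization, or None.

    Hypothetical helper: days from `needed_since` up to the day before the
    automatic job's starting point, clipped to what is still retained.
    """
    oldest_available = today - timedelta(days=RETENTION_DAYS)
    auto_start = today - timedelta(days=SANITIZE_DELAY_DAYS)
    first_day = max(needed_since, oldest_available)
    if first_day >= auto_start:
        return None  # the automatic job already covers everything needed
    return first_day, auto_start - timedelta(days=1)


# Assumed deployment date, for illustration only.
window = backfill_window(date(2020, 2, 18), date(2019, 12, 2))
print(window)
```

With a different deployment date the window shifts accordingly, so the actual backfill range should be taken from the deployed job, not from this sketch.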

Nuria renamed this task from Whitelist CentralNotice banner history events for sanitaization and long-term storage to Whitelist CentralNotice banner history events for sanitization and long-term storage. Feb 18 2020, 5:35 PM

Change 572767 merged by Fdans:
[analytics/refinery@master] Add CentralNoticeBannerHistory schema to EventLogging whitelist

https://gerrit.wikimedia.org/r/572767

Change 573324 had a related patch set uploaded (by Mforns; owner: Mforns):
[analytics/refinery@master] Correct indentation of EventLogging sanitization white-list

https://gerrit.wikimedia.org/r/573324

Change 573324 merged by Fdans:
[analytics/refinery@master] Correct indentation of EventLogging sanitization white-list

https://gerrit.wikimedia.org/r/573324

I believe this is done!

We deployed the white-listing of CentralNoticeBannerHistory yesterday.
From now on data from event.CentralNoticeBannerHistory is copied over to event_sanitized.CentralNoticeBannerHistory following the white-list spec, which IIUC keeps all its data.
We also backfilled the new table starting from 2019-11-21, so that the Big English campaign data is kept indefinitely.

-> Please check that the data in event_sanitized.CentralNoticeBannerHistory looks good (we already did, but a second vetting on your side would be cool).

-> And note that the original table event.CentralNoticeBannerHistory will continue to be purged normally after 90 days.

Cheers!

Thanks everyone!

I just looked at event_sanitized.CentralNoticeBannerHistory and it looks good.