The table mediawiki_api_request (db: presto_analytics_hive, schema: event) contains data on API requests to our projects that would be useful for Partnerships and Okapi analysis, but it is set to expire soon. Requesting a dump of the non-sensitive fields into a temporary table for further analysis!
Description
Event Timeline
We can keep data that has no identifying fields for longer than 90 days. You just need to submit a changeset that lists those fields. Please take a look at the docs: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Data_retention
It will help to work with a data analyst who is familiar with these systems.
Just following up, the data is being collected on an ongoing basis, and we always have the last 90 days of data.
(I initially made a mistake by looking only at eqiad but right now the data's coming from codfw)
Right now, we're waiting on @Maryana & others to let us know which fields they would like to keep on an ongoing basis, and I can help them implement that in the sanitization whitelist YAML. The file is linked below and is fairly self-explanatory; direct patches are welcome!
https://github.com/wikimedia/analytics-refinery/blob/master/static_data/eventlogging/whitelist.yaml
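For anyone unsure what "listing the fields to keep" amounts to in practice, here is a rough Python sketch of the idea: only fields explicitly marked as kept survive past the retention window, and everything else is dropped. This is not the refinery's actual sanitization code. The `ALLOWLIST` structure, the `sanitize` helper, and the field names (`dt`, `wiki`, `client_ip`) are all hypothetical, so please check the linked whitelist.yaml for the real format and the table schema for the real field names.

```python
from typing import Any, Dict

# Hypothetical allowlist for illustration only. The table name comes from
# this task; the field names and the "keep" marker are assumptions and may
# not match the real whitelist.yaml or the real table schema.
ALLOWLIST: Dict[str, Dict[str, str]] = {
    "mediawiki_api_request": {
        "dt": "keep",
        "wiki": "keep",
    }
}


def sanitize(table: str, record: Dict[str, Any],
             allowlist: Dict[str, Dict[str, str]] = ALLOWLIST) -> Dict[str, Any]:
    """Return a copy of record containing only the fields marked "keep".

    Tables that are not in the allowlist retain nothing.
    """
    kept_fields = allowlist.get(table, {})
    return {
        name: value
        for name, value in record.items()
        if kept_fields.get(name) == "keep"
    }


if __name__ == "__main__":
    example = {"dt": "2021-01-01T00:00:00Z", "wiki": "enwiki", "client_ip": "192.0.2.1"}
    # The hypothetical client_ip field is not allowlisted, so it is dropped.
    print(sanitize("mediawiki_api_request", example))
```

The point of the sketch is that the list is opt-in: a field nobody adds to the list does not get retained, which is why we need the list of wanted fields before writing the patch.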