
Add content_translation_event data stream to the sanitization allowlist
Closed, ResolvedPublic

Description

We want to retain the content_translation_event data indefinitely, so once the schema is merged, we should add the stream to the sanitization allowlist.

Fields that should not be kept:

  • web_session_id
  • web_pageview_id
  • user_name
  • user_global_edit_count
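
The effect of dropping the four fields listed above can be sketched in Python. This is only an illustration of the intended outcome, not the allowlist format used in analytics/refinery. The function name `sanitize_event`, the sample record, and the `translation_id` field are hypothetical; only the four excluded field names come from this task.

```python
# Fields named in the task description as ones that should not be kept.
EXCLUDED_FIELDS = frozenset({
    "web_session_id",
    "web_pageview_id",
    "user_name",
    "user_global_edit_count",
})


def sanitize_event(event: dict) -> dict:
    """Return a copy of a flat event record without the excluded fields."""
    return {key: value for key, value in event.items() if key not in EXCLUDED_FIELDS}


# Hypothetical record: only the excluded field names come from the task.
example_event = {
    "web_session_id": "abc123",
    "user_name": "ExampleUser",
    "translation_id": 42,  # assumed field, for illustration only
}
print(sanitize_event(example_event))
```

Note that an allowlist normally works the other way around, naming the fields to keep rather than the ones to drop, so the sketch only shows which values should be absent from the retained data.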

Event Timeline

ldelench_wmf moved this task from Triage to Current Quarter on the Product-Analytics board.

I will wait until the implementation is close to being merged before doing this, as there's a small chance that the engineers could encounter issues that require changing the schema.

The implementation is still in progress, and not particularly close to completion.

nshahquinn-wmf added a subscriber: mpopov.

@mpopov is planning to batch this with T287255.

MNeisler subscribed.

Update: @mpopov and I decided it would be best to make separate patches for each of these additions. I'm assigning this task to myself to add the stream to the allowlist.

Change 716339 had a related patch set uploaded (by MNeisler; author: MNeisler):

[analytics/refinery@master] Add the content_translation_event stream to the allowlist

https://gerrit.wikimedia.org/r/716339

Change 716339 merged by Mforns:

[analytics/refinery@master] Add the content_translation_event stream to the allowlist

https://gerrit.wikimedia.org/r/716339