
Refine eventlogging pipeline should not refine data for domains that are not wikimedia's
Closed, Duplicate · Public · 5 Story Points

Description

The Refine EventLogging pipeline should not refine data for domains that are not Wikimedia's. Quite often, other wikis such as www.wikipedia-with-spam.org run a clone of our code and, as a result, end up running our instrumentation code and sending us their EventLogging events.

Those events should probably be dropped, ideally before they get refined. This is somewhat related to https://phabricator.wikimedia.org/T219162 and https://github.com/wikimedia/analytics-refinery/commit/58a03f623cd6124fd4de70cb8d7e739a90b58214
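A minimal sketch of the kind of filter this implies, not the actual Refine implementation. It assumes each event carries its originating hostname in a field (called `webHost` here), and the suffix allowlist is purely illustrative; both names are assumptions, and the real pipeline may keep this information elsewhere.

```python
# Hypothetical allowlist of Wikimedia domain suffixes (illustrative, not exhaustive).
ALLOWED_SUFFIXES = (
    "wikipedia.org",
    "wikimedia.org",
    "wiktionary.org",
    "wikidata.org",
    "mediawiki.org",
)


def is_wikimedia_host(host):
    """Return True if host is an allowed domain or a subdomain of one."""
    if not host:
        return False
    host = host.strip().lower().rstrip(".")
    # Match on a dot boundary so "www.wikipedia-with-spam.org" or
    # "evilwikipedia.org" do not pass as "wikipedia.org".
    return any(host == s or host.endswith("." + s) for s in ALLOWED_SUFFIXES)


def drop_foreign_events(events, host_field="webHost"):
    """Keep only events whose hostname field (assumed name) is a Wikimedia domain."""
    return [e for e in events if is_wikimedia_host(e.get(host_field))]
```

Matching on a dot boundary rather than a plain substring is the important design choice: a substring check would let through clone domains that merely contain "wikipedia".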

Event Timeline

Nuria created this task. · Apr 1 2019, 6:57 PM
Restricted Application added a subscriber: Aklapper. · Apr 1 2019, 6:57 PM

Pinging @phuedx and @Jdlrobson so they are aware this ticket exists.

fdans triaged this task as High priority. · Apr 4 2019, 5:18 PM
fdans moved this task from Incoming to Data Quality on the Analytics board.
phuedx awarded a token. · Apr 4 2019, 5:18 PM
Nuria assigned this task to mforns. · May 14 2019, 7:50 PM
Nuria added a project: Analytics-Kanban.
Nuria set the point value for this task to 5.