Look out for duplicates in EL refine, similar to how we remove possible Kafka duplicates in webrequest
Status | Assigned | Task
---|---|---
Resolved | Ottomata | T159170 Sunset MySQL data store for eventlogging
Resolved | Ottomata | T162610 Implement EventLogging Hive refinement
Resolved | Ottomata | T153328 Research Spike: Better support for Eventlogging data on hive
Resolved | JAllemandou | T161924 Write Spark schema differ / Hive DDL generator
Resolved | mforns | T166414 Explore NavigationTiming by faceted properties - EventLogging refine
Duplicate | mforns | T176426 Implement purging scheme for eventlogging data on top of eventlogging refine
Resolved | Ottomata | T178440 Refine should parse user agent field as it is done on refinery pipeline
Resolved | Ottomata | T179540 Timestamp format in Hive-refined EventLogging tables is incompatible with MySQL version
Resolved | Ottomata | T179625 Resolve EventCapsule / MySQL / Hive schema discrepancies
Resolved | mforns | T181064 Sanitize Hive EventLogging
Resolved | Ottomata | T185237 Lookout for duplicates in EL refine, implement pluggable transform method config in JSONRefine
Event Timeline
Change 405800 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] [WIP] Add configurable transform function to JSONRefine
Change 407508 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] [WIP] Add TransformFunctions for JsonRefine job
Change 405800 merged by Ottomata:
[analytics/refinery/source@master] Add configurable transform function to JSONRefine
Change 410240 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@jsonrefine] Add TransformFunctions for JsonRefine job
Change 407508 abandoned by Ottomata:
Add TransformFunctions for JsonRefine job
Reason:
in favor of https://gerrit.wikimedia.org/r/#/c/410240/
Change 410240 merged by Ottomata:
[analytics/refinery/source@jsonrefine] Add TransformFunctions for JsonRefine job
Change 417287 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] [WIP] Apply geocode and deduplicate transform function for refine jobs
Change 417287 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] [WIP] Apply geocode, deduplicate and monitoring for refine jobs
Change 417287 merged by Ottomata:
[operations/puppet@production] Apply geocode, deduplicate and monitoring for refine jobs