Start refining ChangesListHighlights events
Closed, Resolved, Public

Event Timeline

Neil_P._Quinn_WMF triaged this task as Medium priority. Dec 19 2018, 11:07 PM
Neil_P._Quinn_WMF created this task.
Milimetric moved this task from Incoming to Data Quality on the Analytics board. Jan 7 2019, 4:49 PM
Milimetric moved this task from Data Quality to Incoming on the Analytics board.
Milimetric moved this task from Incoming to Radar on the Analytics board. Jan 7 2019, 4:59 PM
Restricted Application edited projects, added Product-Analytics; removed Product-Analytics (Kanban). Oct 16 2019, 5:47 PM

Ah ha! I was able to fix this schema. Its filters field had its array items defined as an array itself, rather than as an object, so the refine job wasn't able to figure out the actual schema of the filters field. After the fix, it works!

https://meta.wikimedia.org/w/index.php?title=Schema%3AChangesListHighlights&type=revision&diff=19499852&oldid=16484288
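
To illustrate the kind of fix described above, here is a minimal sketch of the two shapes a JSON Schema `items` keyword can take. The fragments are hypothetical: the property names `name` and `color` come from the sample event in this thread, but their types and the helper `find_list_valued_items` are assumptions, not copied from the real schema page or from the refine job. A list-valued `items` describes elements position by position, which is presumably why a single element type could not be derived from it.

```python
# Hypothetical schema fragments; property types are assumptions inferred
# from the sample event in this thread, not copied from the real schema.
FILTER_ITEM = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "color": {"type": "string"},
    },
}

# Before: "items" is a list of schemas (the positional, "tuple" form).
BROKEN_FILTERS = {"type": "array", "items": [FILTER_ITEM]}

# After: "items" is a single schema object applied to every element.
FIXED_FILTERS = {"type": "array", "items": FILTER_ITEM}


def find_list_valued_items(schema, path="$"):
    """Return the paths of all array fields whose "items" is a list."""
    found = []
    if not isinstance(schema, dict):
        return found
    items = schema.get("items")
    if schema.get("type") == "array" and isinstance(items, list):
        found.append(path)
    # Recurse into object properties and into the item schema(s).
    for name, sub in schema.get("properties", {}).items():
        found.extend(find_list_valued_items(sub, f"{path}.{name}"))
    if isinstance(items, dict):
        found.extend(find_list_valued_items(items, f"{path}[]"))
    elif isinstance(items, list):
        for i, sub in enumerate(items):
            found.extend(find_list_valued_items(sub, f"{path}[{i}]"))
    return found


if __name__ == "__main__":
    broken = {"type": "object", "properties": {"filters": BROKEN_FILTERS}}
    fixed = {"type": "object", "properties": {"filters": FIXED_FILTERS}}
    print(find_list_valued_items(broken))  # ['$.filters']
    print(find_list_valued_items(fixed))   # []
```

A check like this could be run over other schemas to spot the same pattern before they reach a refine step; whether the actual tooling does anything similar is not stated in this task.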

19/10/28 19:51:37 INFO Refine: Successfully refined 21 of 21 dataset partitions into table `event`.`ChangesListHighlights` (total # refined records: 122)

I'll see if I can backfill from whatever we have in raw data.

Change 546703 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] EventLogging refine = Unblacklist ChangesListHighlights schema, it has been fixed

https://gerrit.wikimedia.org/r/546703

Nuria added a subscriber: Nuria. Edited Oct 28 2019, 8:22 PM

I see data, and while I think the filters field looks strange as is, I guess the schema defines it this way.

2019-10-27T18:23:14Z {"action":"set","filters":[{"name":"damaging__maybebad","color":"c3"}],"userId":34623446} cp2006.codfw.wmnet 16484288 ChangesListHighlights 10137746 {"browser_family":"Edge","browser_major":"18","browser_minor":"18362","device_family":"Other","is_bot":false,"is_mediawiki":false,"os_family":"Windows","os_major":"10","os_minor":null,"wmf_app_version":"-"} 9dc42bd4d0a05e5381a052f562975939 en.wikipedia.org enwiki {"city":"Rowlett","","timezone":"America/Chicago","country":"United States",","continent":"North America","country_code":"US","subdivision":"Texas"}

Nuria added a subscriber: Catrope. Oct 28 2019, 8:31 PM

Per @Catrope, it looks like this schema is not used and can be retired. If so, devs need to make changes to stop sending events.

Change 546703 merged by Ottomata:
[operations/puppet@production] EventLogging refine - Unblacklist ChangesListHighlights, it has been fixed

https://gerrit.wikimedia.org/r/546703

Backfilled since July 31st:

19/10/29 01:34:26 INFO Refine: Successfully refined 1982 of 1982 dataset partitions into table `event`.`ChangesListHighlights` (total # refined records: 16304)
Neil_P._Quinn_WMF closed this task as Resolved. Oct 29 2019, 10:20 AM
Neil_P._Quinn_WMF claimed this task.

Looks good! Since Growth doesn't want the data anymore, I filed T236770: Retire the ChangesListHighlights data stream, which we can deal with separately.

Looks good! Since Growth doesn't want the data anymore

Ha, oh ok.

At least the data got to feel special one more time before it gets thrown away 😜