Start refining ChangesListHighlights events
Closed, Resolved (Public)

Event Timeline

nshahquinn-wmf created this task.

Ah ha! I was able to fix this schema. Its filters field had its array items defined as an array itself, rather than an object, so the refine job wasn't able to figure out the actual schema of the filters field. After fixing, it works!

https://meta.wikimedia.org/w/index.php?title=Schema%3AChangesListHighlights&type=revision&diff=19499852&oldid=16484288
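
For readers who have not opened the diff, the kind of mistake described above can be sketched roughly as follows. This is an illustration only, not the actual schema text: the `name` and `color` properties are taken from the sample event quoted later in this task, while the variable names, the string types, and the helper `has_single_item_schema` are assumptions made for the example.

```python
# Hypothetical sketch of the "filters" field before and after the fix.
# Only the property names "name" and "color" come from the sample event;
# the types and overall layout are assumed for illustration.

filter_item = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "color": {"type": "string"},
    },
}

# Before: "items" is itself an array (a list holding the item schema),
# so a consumer expecting one schema per element cannot infer the type.
broken_filters = {
    "type": "array",
    "items": [filter_item],
}

# After: "items" is a single object schema describing every element.
fixed_filters = {
    "type": "array",
    "items": filter_item,
}


def has_single_item_schema(field_schema: dict) -> bool:
    """Return True if an array field describes its elements with one object schema."""
    return (
        field_schema.get("type") == "array"
        and isinstance(field_schema.get("items"), dict)
    )


if __name__ == "__main__":
    print("broken:", has_single_item_schema(broken_filters))
    print("fixed:", has_single_item_schema(fixed_filters))
```

A check like this could in principle be run over other schemas to spot the same pattern, though whether any other schema is affected is not something this task establishes.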

19/10/28 19:51:37 INFO Refine: Successfully refined 21 of 21 dataset partitions into table `event`.`ChangesListHighlights` (total # refined records: 122)

I'll see if I can backfill from whatever we have in raw data.

Change 546703 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] EventLogging refine = Unblacklist ChangesListHighlights schema, it has been fixed

https://gerrit.wikimedia.org/r/546703

I see data, and while I think the filters field looks strange as is, I guess the schema defines it this way.

2019-10-27T18:23:14Z {"action":"set","filters":[{"name":"damaging__maybebad","color":"c3"}],"userId":34623446} cp2006.codfw.wmnet 16484288 ChangesListHighlights 10137746 {"browser_family":"Edge","browser_major":"18","browser_minor":"18362","device_family":"Other","is_bot":false,"is_mediawiki":false,"os_family":"Windows","os_major":"10","os_minor":null,"wmf_app_version":"-"} 9dc42bd4d0a05e5381a052f562975939 en.wikipedia.org enwiki {"city":"Rowlett","","timezone":"America/Chicago","country":"United States",","continent":"North America","country_code":"US","subdivision":"Texas"}

Per @Catrope, it looks like this schema is not used and can be retired. If so, devs need to make changes to stop sending events.

Change 546703 merged by Ottomata:
[operations/puppet@production] EventLogging refine - Unblacklist ChangesListHighlights, it has been fixed

https://gerrit.wikimedia.org/r/546703

Backfilled since July 31st:

19/10/29 01:34:26 INFO Refine: Successfully refined 1982 of 1982 dataset partitions into table `event`.`ChangesListHighlights` (total # refined records: 16304)
nshahquinn-wmf claimed this task.

Looks good! Since Growth doesn't want the data anymore, I filed T236770: Retire the ChangesListHighlights data stream, which we can deal with separately.

Looks good! Since Growth doesn't want the data anymore

Ha, oh ok.

At least the data got to feel special one more time before it gets thrown away 😜