|Open||Ottomata||T185233 Modern Event Platform (TEC2)|
|Open||Ottomata||T201068 Modern Event Platform: Stream Intake Service|
|Resolved||Ottomata||T206785 Modern Event Platform: Stream Intake Service (EventGate): Implementation|
|Resolved||Ottomata||T214080 Rewrite Avro schemas (ApiAction, CirrusSearchRequestSet) as JSONSchema and produce to EventGate|
|Resolved||EBernhardson||T222268 Port usage of mediawiki_CirrusSearchRequestSet to mediawiki_cirrussearch_request|
The primary user of this data is the Oozie job in the wikimedia/discovery/analytics repository, which processes these logs into the discovery.query_clicks_hourly table. This is a single HQL script, so it should hopefully be easy to port. There are also a variety of notebooks used for ad-hoc analysis that might have to change as well, but I don't think any of those are under source control.
There are two scripts, one hourly and one daily, but they are mostly the same.
The query is a bit complicated, so it's probably safest if someone from search/discovery picks up this task. BTW, if all goes well we should have all cirrussearch-request events in Hive starting today.