We should move special casing and transformation of EventLogging analytics data for insertion into MySQL into the MySQL consumer process itself, not upstream in the processor.
Currently, we do several things to make EventLogging analytics data work for MySQL.
- Convert (varnish) timestamps to ints and then to MediaWiki format. T179540
- Parse `userAgent` and convert to JSON string. T153207, T178440
- Filter out unwanted bots. T67508
We should do these things only to the data as it is inserted into MySQL, not before it goes to Kafka.
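For reference, the second half of the timestamp conversion could look roughly like the sketch below. It assumes the intermediate int is a Unix epoch in seconds (UTC) and that "MediaWiki format" means the 14-digit `YYYYMMDDHHMMSS` string; `unix_to_mediawiki` is a hypothetical name, not an existing eventlogging function.

```python
from datetime import datetime, timezone


def unix_to_mediawiki(ts):
    """Format a Unix epoch (seconds, UTC) as a MediaWiki timestamp string.

    Hypothetical helper for illustration only.
    """
    return datetime.fromtimestamp(int(ts), tz=timezone.utc).strftime('%Y%m%d%H%M%S')
```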
- Modify EventCapsule schema
-- Make `timestamp` optional `number`
-- Add optional `dt` field in ISO-8601 date-time format.
-- Make `userAgent` `"type": ["object", "string"]` rather than just `"type": "string"`
- Modify eventlogging code to
-- Parse `dt` from raw client-side log format.
-- Parse `userAgent`, but leave it as a nested object, not a JSON string.
-- Add map:// reader/writer handlers to
-- Use map:// in eventlogging-consumer mysql, rather than in eventlogging-processor, to
--- add `timestamp` and remove `dt` for compatibility with existing tables
--- Filter out bots
--- Convert `userAgent` to JSON string for compatibility with existing tables
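The MySQL-side steps above could be sketched as a single map function, roughly as follows. This is only an illustration under stated assumptions: `mysql_compat` and `is_bot` are hypothetical names, the parsed `userAgent` object is assumed to carry an `is_bot` flag, `dt` is assumed to look like `2017-11-01T12:34:56Z`, and how such a function would be registered with a map:// handler is not shown.

```python
import json
from datetime import datetime


def is_bot(user_agent):
    """Assumption: a parsed userAgent object carries an `is_bot` flag.

    Raw string user agents are not inspected in this sketch.
    """
    if isinstance(user_agent, dict):
        return bool(user_agent.get('is_bot', False))
    return False


def mysql_compat(event):
    """Return a copy of `event` shaped for the existing MySQL tables.

    Returns None when the event should be dropped (bot traffic).
    """
    user_agent = event.get('userAgent')
    if is_bot(user_agent):
        return None

    out = dict(event)  # shallow copy; the input event is not mutated

    # Remove `dt`, and derive `timestamp` from it if not already present.
    dt = out.pop('dt', None)
    if dt is not None and 'timestamp' not in out:
        parsed = datetime.strptime(dt, '%Y-%m-%dT%H:%M:%SZ')
        out['timestamp'] = parsed.strftime('%Y%m%d%H%M%S')

    # Existing tables store userAgent as a JSON string, not a nested object.
    if isinstance(user_agent, dict):
        out['userAgent'] = json.dumps(user_agent, sort_keys=True)

    return out
```

Returning `None` for filtered events keeps the decision to drop an event in one place; the caller would skip `None` results before inserting into MySQL.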