This is a parent task for the work to be done for the FY2018-2019 [[ https://www.mediawiki.org/wiki/Wikimedia_Technology/Annual_Plans/FY2019/TEC2:_Modern_Event_Platform | Modern Event Platform Program ]].
EventLogging is home-grown and was not designed for purposes other than low-volume analytics in MySQL databases. However, the ideas it was based on are solid, and have independently become an industry standard, often called a Stream Data Platform. Over the last two years we have been developing the EventBus sub-system, with the aim of standardizing events so that they can be used both internally, for propagating changes to update dependent artifacts, and externally, by exposing them to clients. While this has been a success, integrating these events with other systems still requires a lot of custom, cumbersome glue code. Open source technologies exist for integrating and processing streams of events.
Engineering teams should be able to quickly develop features that are easy to instrument and measure, and that can react to events from other systems.
# Components
##### Scalable event intake
Intake of events from internal and external clients (browsers & apps). EventLogging + EventBus do some of this already, but are limited in scope and scale.
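As a rough sketch of what scalable intake means, the snippet below parses and minimally validates an incoming event before handing it to a transport (a real service would validate against a registered schema and produce to Kafka; the field names and `intake` function are illustrative, not a decided design):

```python
import json
import queue

# Hypothetical required top-level fields for an incoming event envelope.
REQUIRED_FIELDS = ("schema", "meta")

class ValidationError(ValueError):
    pass

def intake(raw_body: bytes, out: "queue.Queue") -> dict:
    """Parse, minimally validate, and enqueue one event for downstream transport."""
    try:
        event = json.loads(raw_body)
    except json.JSONDecodeError as e:
        raise ValidationError(f"not valid JSON: {e}") from e
    missing = [f for f in REQUIRED_FIELDS if f not in event]
    if missing:
        raise ValidationError(f"missing required fields: {missing}")
    out.put(event)  # stand-in for producing to a Kafka topic
    return event

# Usage: accept a well-formed event, reject a malformed one.
q = queue.Queue()
intake(b'{"schema": "test/event", "meta": {"topic": "test.event"}}', q)
```

The point of centralizing this step is that clients (browsers, apps, MediaWiki) all hit the same validated intake path rather than each inventing their own.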
##### Comprehensive event schema repository service
This should combine the existing EventLogging schemas with the mediawiki/event-schemas repository. This might be something new, or it might be improvements to the existing MediaWiki-based schema repository.
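Whatever the implementation, the core of such a service is lookup of a schema by name and version. A minimal sketch, assuming schemas are identified by a (name, version) pair as in mediawiki/event-schemas (the API and schema contents here are illustrative):

```python
# Toy in-memory schema registry: (name, version) -> JSON-schema-like dict.
_registry: dict = {}

def register(name: str, version: int, schema: dict) -> None:
    _registry[(name, version)] = schema

def lookup(name: str, version: int = None) -> dict:
    """Return the schema at an exact version, or the latest version if none given."""
    if version is not None:
        return _registry[(name, version)]
    latest = max(v for (n, v) in _registry if n == name)
    return _registry[(name, latest)]

# Illustrative registrations: two versions of one schema.
register("mediawiki/revision/create", 1,
         {"title": "revision/create", "type": "object"})
register("mediawiki/revision/create", 2,
         {"title": "revision/create", "type": "object",
          "properties": {"rev_id": {"type": "integer"}}})
```

A real service would add HTTP access, immutable published versions, and compatibility checks between versions.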
##### Guidelines for writing reusable event schemas
Some guidelines already exist for analytics purposes, and some for mediawiki/event-schemas; we should unify them.
##### Connectors for ingestion to and from various state stores
(MySQL, Redis, Druid, Cassandra, HDFS, etc.) This will likely be Kafka Connect. We may need to move away from JSON to use it, or adapt Kafka Connect (or a similar tool) to make this work with JSON.
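For a sense of what this looks like in practice, a Kafka Connect sink is configured declaratively. The fragment below is a hypothetical configuration for Confluent's open source JDBC sink connector, shown here as a Python dict (the connector name, topic, and database URL are made up; the converter settings point at the JSON friction mentioned above, since many connectors expect schema-bearing records):

```python
# Hypothetical Kafka Connect sink configuration (keys follow the Kafka Connect
# worker API; the connector class is Confluent's JDBC sink connector).
jdbc_sink_config = {
    "name": "mysql-event-sink",                  # illustrative connector name
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "2",
    "topics": "mediawiki.revision-create",       # illustrative topic
    "connection.url": "jdbc:mysql://db1:3306/events",
    # Plain JSON events carry no embedded schema, which many connectors want;
    # disabling schemas.enable is one workaround, losing typed table mapping.
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
}
```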
##### Stream processing system with dependency tracking system conceptual design
Engineers should have a standardized way to build, deploy, and maintain stream processing jobs, for both analytics and production purposes. A very common use of stream processing at WMF is change propagation, which, to do well, requires a dependency tracking mechanism: a very long-term goal. We want to choose stream processing technologies that work toward this goal.
This component is the lowest priority of the Modern Event Platform, and as such will receive more thought and planning toward the end of the program.
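A change-propagation job of the sort described above can be sketched as a stateless transform over a stream: consume change events, look up dependents, and emit derived update events. The dependency table, topic shapes, and field names below are all illustrative (a real job would run on a stream processing framework and derive dependencies from tracked data, not a static dict):

```python
# Minimal change-propagation sketch: fan out an update event for every
# resource that depends on a changed resource.

# Hypothetical dependency table: resource -> resources to re-render on change.
DEPENDENCIES = {
    "Template:Cite": ["PageA", "PageB"],
}

def propagate(change_events):
    """Yield derived update events for each dependent of a changed resource."""
    for event in change_events:
        for dependent in DEPENDENCIES.get(event["page"], []):
            yield {"page": dependent, "caused_by": event["page"]}

# Usage: one template edit fans out to two dependent page updates.
updates = list(propagate([{"page": "Template:Cite"}]))
```

The hard part this program defers is maintaining `DEPENDENCIES` itself at scale; that is the dependency tracking mechanism.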
# Timeline
//NOTE: This timeline is very much a work in progress, and is only a guess at when progress will be made on different components.//
##### FY2017-2018
- Q4: Interview product and technology stakeholders to collect desires, use cases, and requirements.
##### FY2018-2019
- Q1: Survey and choose technologies and solutions with input from #services and #operations.
- Q2: Begin implementation and deployment of some chosen technologies.
- Q3: Deploy chosen technologies to production. Begin research into stream processing and dependency tracking.
- Q4: Implementation and deployment of stream processing system.
## Use case collection
- JADE for ORES
- Fundraising banner impressions pipeline
- WDQS state updates
- Job Queue (implementation ongoing)
- Frontend Cache (varnish) invalidation
- Scalable EventLogging (with automatic visualization in tools (Pivot, etc.))
- Realtime SQL queries and state store updates. Could be used to verify in real time that events are valid and contain what they should.
- Trending pageviews & edits
- Mobile App Events
- ElasticSearch index updates incorporating new revisions & ORES scores
- Automatic Prometheus metric transformation and collection
- Dependency tracking transport and stream processing
- Stream of reference/citation events: https://etherpad.wikimedia.org/p/RefEvents
(...add more as collected!)