=== **User Story**
> ==== As an ML Platform Engineer, I want to use the platform capabilities developed by the Event Platform team so that I can test and evaluate them, and use them to implement an event-driven data pipeline that produces an event stream containing revert risk model scores.
==== Why?
By implementing a proof-of-concept job on top of the Event Platform team's capabilities, we can learn the advantages and disadvantages of Flink compared with other available solutions such as ChangeProp and Benthos.
The output stream from this implementation is also of interest to various product teams. Currently, these teams make individual calls to the LiftWing API to get the data they need. With an event stream, consumers can subscribe to the stream instead, which reduces the number of API calls made to LiftWing, improves efficiency, and reduces dependencies. In the long run we could also look at connecting the stream to a datastore such as Cassandra to better serve the data.
==== Expected Sub-tasks (not exhaustive - please add as needed)
[] Deploy Flink operator to ML k8s
[] Build a Python Flink job that listens to mw.page_change (I don’t think we need the content stream), calls the LiftWing API for the revert risk model score, and outputs the results to a new stream
[] Design schema for output topic
[] Deploy new output stream
[] Deploy Flink job to ML k8s
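
As a starting point for the job sub-task above, here is a minimal sketch of the per-event enrichment step in plain Python, with the Flink and eventutilities-python wiring intentionally left out. Everything specific in it is an assumption to be checked: the LiftWing endpoint URL and request body, the input field names (`page_change_kind`, `wiki_id`, `revision.rev_id`), the rule for deriving a language code from `wiki_id`, and the shape of the output event, which is only a placeholder until the output schema is designed.

```python
import json
import urllib.request
from typing import Callable, Optional

# Assumed endpoint and model name; confirm against the LiftWing docs.
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/"
    "models/revertrisk-language-agnostic:predict"
)


def fetch_score(rev_id: int, lang: str) -> dict:
    """POST one revision to the (assumed) LiftWing endpoint and return the JSON reply."""
    body = json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")
    req = urllib.request.Request(
        LIFTWING_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def enrich(
    event: dict,
    scorer: Callable[[int, str], dict] = fetch_score,
) -> Optional[dict]:
    """Return an enriched output event, or None if the event is not relevant."""
    # Assumption: only page creations and edits produce a revision worth scoring.
    if event.get("page_change_kind") not in ("create", "edit"):
        return None

    rev_id = event.get("revision", {}).get("rev_id")
    wiki_id = event.get("wiki_id", "")
    if rev_id is None or not wiki_id.endswith("wiki"):
        return None

    # Assumption: "enwiki" -> "en". This does not hold for every wiki.
    lang = wiki_id[: -len("wiki")]

    # Placeholder output shape; the real one depends on the output schema.
    return {
        "wiki_id": wiki_id,
        "rev_id": rev_id,
        "revert_risk": scorer(rev_id, lang),
    }
```

Passing the scorer in as a parameter keeps the function testable without network access, for example `enrich(event, scorer=lambda rev_id, lang: {"stub": True})`. In the actual job this function would sit inside whatever map or process step the Flink pipeline provides, with `None` results filtered out before writing to the output stream.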
==== Success Criteria
[] Flink job is running as a PoC (no SLO) on ML k8s and is able to enrich relevant page change events with a revert risk model score. Following the proof of concept we will review the process and can work together to understand event platform improvements, steps to move from PoC to more formal implementation, etc.
==== Useful Links (please add as needed)
- eventutilities Python docs -> https://doc.wikimedia.org/data-engineering/eventutilities-python/index.html