User Story
As an ML Platform Engineer, I want to use the capabilities developed by the Event Platform team and thoroughly test and evaluate them, so that I can implement an event-driven data pipeline that produces an event stream of revert risk model scores.
Why?
Implementing a proof-of-concept job on top of the Event Platform team's capabilities will show us the advantages and disadvantages of Flink compared to other available solutions such as ChangeProp and Benthos.
The output stream is also of interest to various product teams. Currently, these teams make individual calls to the LiftWing API to get the data they need. With an event stream, consumers can subscribe to the stream instead, which reduces the number of API calls made to LiftWing, improves efficiency, and reduces dependencies. In the long run, we could also look at connecting the stream to a datastore such as Cassandra to better serve the data.
Expected Sub-tasks (not exhaustive - please add as needed)
- Deploy Flink operator to dse-k8s
- Build Python Flink job that consumes mw.page_change, calls the LiftWing API for the revert risk model score, and writes the results to a new stream
- Design schema for output topic
- Deploy new output stream
- Deploy Flink job to dse-k8s
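
The per-event enrichment step in the sub-tasks above could look roughly like the sketch below. It is a starting point for discussion, not the agreed design. The input field names (`page_change_kind`, `wiki_id`, `revision.rev_id`), the set of scorable change kinds, the `revert_risk` output field, and the `fetch_score` callable that stands in for the LiftWing API call are all assumptions made for illustration and would need to be checked against the real mw.page_change schema and the schema we design for the output topic.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical signature standing in for the LiftWing call:
# (wiki_id, rev_id) -> revert risk score.
ScoreFn = Callable[[str, int], float]

# Assumption: only changes that produce a new revision are worth scoring.
SCORABLE_KINDS = {"create", "edit"}


def enrich_page_change(
    event: Dict[str, Any], fetch_score: ScoreFn
) -> Optional[Dict[str, Any]]:
    """Return a copy of the event with a score attached, or None if not relevant."""
    if event.get("page_change_kind") not in SCORABLE_KINDS:
        return None

    revision = event.get("revision") or {}
    rev_id = revision.get("rev_id")
    wiki_id = event.get("wiki_id")
    if rev_id is None or wiki_id is None:
        # Not enough information to request a score; skip the event.
        return None

    score = fetch_score(wiki_id, rev_id)

    # Shallow copy so the input event is left untouched.
    enriched = dict(event)
    # Assumed output field; the real shape depends on the output schema.
    enriched["revert_risk"] = {"score": score}
    return enriched
```

Keeping the LiftWing call behind an injected callable means the enrichment logic can be unit tested without network access. In the Flink job, a function like this could be applied per event and the None results filtered out before writing to the output stream.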
Success Criteria
- The Flink job runs as a PoC on dse-k8s and enriches relevant page change events with a revert risk model score. After the proof of concept, we will review the process together to identify Event Platform improvements, the steps needed to move from PoC to a more formal implementation, etc.