Description: This is a Flink streaming enrichment application. It consumes events from the mediawiki.page_change stream, queries the MediaWiki API for each revision's content, adds that content to the event, and then produces the enriched events to a new mediawiki.page_content_change stream.
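To make the enrichment step concrete, here is a minimal plain-Python sketch of the consume, look up, attach, produce flow. It is not the application's actual code and does not use the Flink API. The event field names (`revision`, `rev_id`, `content`) and the `fetch_content` callable, which stands in for the MediaWiki API request, are assumptions made for illustration only.

```python
import copy
from typing import Any, Callable, Dict, Iterable, Iterator

Event = Dict[str, Any]


def enrich_page_change(event: Event, fetch_content: Callable[[int], str]) -> Event:
    """Return a copy of a page_change-like event with revision content attached.

    `fetch_content` is a hypothetical stand-in for the MediaWiki API lookup;
    the `revision`/`rev_id`/`content` field names are assumed, not taken from
    the real event schema.
    """
    enriched = copy.deepcopy(event)  # leave the input event untouched
    rev_id = enriched["revision"]["rev_id"]
    enriched["revision"]["content"] = fetch_content(rev_id)
    return enriched


def enrich_stream(
    events: Iterable[Event], fetch_content: Callable[[int], str]
) -> Iterator[Event]:
    """Lazily enrich each incoming event, mimicking a streaming map step."""
    for event in events:
        yield enrich_page_change(event, fetch_content)
```

Passing the lookup in as a callable keeps the sketch free of network access, so it can be exercised with a stub in place of the real API.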
We have been running this app in dse-k8s-eqiad for the past month or two, using Kafka jumbo only. We'd like to move this app into wikikube eqiad and codfw k8s, using Kafka main.
Since this is the first Flink native k8s app being deployed, we will first need T333464: New Service Request: flink-kubernetes-operator to be completed. This operator will be used by other flink-app deployments in the future, including rdf-streaming-updater and a new search pipeline.
Timeline: 2023-04
Technologies: Flink, Python, Java
Point person: @Ottomata & @gmodena
Estimation of resources: All of the review and testing has already been done in dse-k8s-eqiad, so we'd only need final approval of the deployments to wikikube, especially of the flink-kubernetes-operator and associated namespace creation.
We will initially deploy 2 pods in each of wikikube eqiad and codfw:
- 1 JobManager pod, 1000m
- 1 TaskManager pod, 3000m, 2 cpu
These resources may increase slightly as we gain more operating experience, especially with HA JobManagers. (Deploying HA will require 2 JobManager pods.)
- deployed in staging wikikube, using Kafka test cluster
- deployed in eqiad and codfw wikikube, consuming from Kafka main clusters and producing to Kafka jumbo-eqiad.