Decision: The proposal to produce mediawiki.page_content_change.v1 to Kafka main (Option A) was not approved. The ML team proceeded with mediawiki.page_change.v1 instead (Option D).
As @Ottomata noted in T401021#11345086, mediawiki.page_content_change.v1 currently exists only in Kafka jumbo-eqiad, while ChangeProp only consumes from Kafka main. As a result, ChangeProp cannot consume from mediawiki.page_content_change.v1 to trigger LiftWing updates for Revise Tone Task Generation.
We have a few options: enable ChangeProp to consume mediawiki.page_content_change.v1, or consume mediawiki.page_change.v1 instead and query the page content from the MW API. We need to decide which option is the best way forward.
Options from @Ottomata in T401021#11345086:
Option A. Produce mediawiki.page_content_change.v1 to Kafka main
This is my preferred option. I think having access to mediawiki.page_content_change.v1 and other streams like this will be useful for realtime updates for derived data products like this one.
The original reason this was not produced to Kafka main was that SRE was worried about polluting Kafka main with a stream that has large event bodies. Previously, the only consumer of this stream was mediawiki_content_change_v1 in the Data Lake, so there was no reason to produce to Kafka main.
We should consider this and talk to SRE ServiceOps to see what they think.
Option B. New change-prop service consuming from Kafka jumbo
Ideally this wouldn't be too hard to do (although I'm not sure its helm chart is in good shape to make this easy). We'd have to figure out where to run it (dse-k8s-eqiad?).
This is my least preferred option. I don't want to deploy more change-props.
Option C. New change-prop rule consuming from Kafka jumbo
This would probably require:
- A new change-prop route rule (/{api:sys}/queue-jumbo?) declared in the helm chart, like this.
- Helm chart and helmfile modifications to support consuming from multiple kafka clusters.
If this isn't too hard, this option would be an okay compromise, assuming SRE ServiceOps won't like Option A.
I'm not sure, but I don't think this will require any actual change-prop code changes. Just helm config changes to declare the new routes and kafka configs.
Option D. Consume mediawiki.page_change.v1 instead
This is probably the fastest path to production. LiftWing already responds to mediawiki.page_change.v1 events via change-prop and Kafka main. Doing this for tone check score would mean that the page content would have to be looked up from the MediaWiki API at score time, rather than just getting it out of the page_content_change event body.
This is already done for other models in LiftWing, so perhaps this is easy to do quickly?
I'd prefer to avoid the extra MW API lookups for page content for all of the other LiftWing usages too. All of the other options would allow us to do that.
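For reference, the score-time lookup that Option D implies could look roughly like the sketch below. It is not LiftWing's actual code. It assumes the event carries `meta.domain` and `revision.rev_id`, and the function names `content_lookup_url` and `extract_content` are hypothetical. The query parameters are standard MediaWiki Action API ones (`prop=revisions` with `rvslots=main` and `formatversion=2`).

```python
from urllib.parse import urlencode


def content_lookup_url(event: dict) -> str:
    """Build the MW Action API URL for the revision named in a page_change event.

    Assumption: the event has meta.domain and revision.rev_id fields.
    """
    domain = event["meta"]["domain"]
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": event["revision"]["rev_id"],
        "rvprop": "content",
        "rvslots": "main",
        "format": "json",
        "formatversion": 2,
    }
    return f"https://{domain}/w/api.php?{urlencode(params)}"


def extract_content(api_response: dict) -> str:
    """Pull the main-slot wikitext out of a formatversion=2 query response."""
    page = api_response["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]


if __name__ == "__main__":
    # Hypothetical, heavily trimmed event for illustration only.
    sample_event = {
        "meta": {"domain": "en.wikipedia.org"},
        "revision": {"rev_id": 123},
    }
    print(content_lookup_url(sample_event))
```

Every scored event costs one such HTTP round trip to MediaWiki. That is the overhead the other options avoid, since they read the content from the page_content_change event body.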