
Use the mediawiki.revision_score_drafttopic stream instead of mediawiki.revision-score
Closed, Resolved · Public · 3 Estimated Story Points

Description

The ML team intends to deprecate the mediawiki.revision-score stream in favor of one stream per model (T317768). CirrusSearch data pipelines should use mediawiki.revision_score_drafttopic instead of mediawiki.revision-score (see T328576).

AC:

  • mediawiki.revision-score (hive tables and/or kafka topics) is no longer used by CirrusSearch data-pipelines
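One way to check the acceptance criterion is to scan pipeline configuration or query text for remaining mentions of the legacy stream. The following is a minimal sketch, not the team's actual tooling: the function name `find_legacy_references`, the pattern, and the sample lines are all hypothetical and for illustration only. It assumes the legacy name appears either as `mediawiki.revision-score` or in its underscore table form `mediawiki_revision_score`, and that per-model names such as `mediawiki.revision_score_drafttopic` must not be flagged.

```python
import re

# Hypothetical pattern: matches the legacy stream name and its underscore
# (Hive-style) form, but not per-model names that continue with "_<model>",
# such as mediawiki.revision_score_drafttopic.
LEGACY_PATTERN = re.compile(r"mediawiki[._]revision[-_]score(?![\w-])")


def find_legacy_references(text):
    """Return (line_number, stripped_line) pairs that mention the legacy stream."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if LEGACY_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Illustrative sample only; not taken from any real pipeline config.
    sample = (
        'topic = "mediawiki.revision-score"\n'
        'table = "event.mediawiki_revision_score"\n'
        'stream = "mediawiki.revision_score_drafttopic"\n'
    )
    for number, line in find_legacy_references(sample):
        print(f"{number}: {line}")
```

The negative lookahead is what keeps the per-model stream names out of the results; without it, every `revision_score_drafttopic` reference would be reported as a leftover.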

Details

Reference: repos/data-engineering/airflow-dags!381
Source Branch: work/ebernhardson/drafttopic-t333468
Dest Branch: main
Author: ebernhardson
Title: Use the drafttopic specific stream

Event Timeline

Restricted Application added a subscriber: Aklapper.
dcausse renamed this task from "Use the mediawiki.revision_score_drafttopic instead of mediawiki.revision_score_drafttopic" to "Use the mediawiki.revision_score_drafttopic stream instead of mediawiki.revision-score". Mar 29 2023, 4:24 PM
dcausse updated the task description.
dcausse added a subscriber: elukey.

Since we're creating a new update pipeline for Search, it might be smarter to not fix our current pipeline and wait until the new one is in place.

@calbon / @elukey: how urgent is this from your side?

@Gehel not super urgent, we have the deadline of August/September to deprecate ORES, any time before that would be good. If it happens this quarter it would be awesome for us :)

Gehel set the point value for this task to 5. May 8 2023, 3:45 PM
Gehel changed the point value for this task from 5 to 3.

@Ottomata Hi! The Search team is ready to use the new drafttopic stream, but it is based on rc1 page_change and I am wondering if we should wait for a more stable state (to avoid Kafka client offset changes, holes in the data in Hive, etc.). Is there a target date to move page_change out of RC status?

I want to do it imminently, but I keep hoping for a decision soon on MW user types. I think this is not going to happen, so I'm considering anticipating it in the schema. I need to talk to other event platform folks about this.

@elukey we are ready to go. T336817: Release mediawiki.page_change.v1 stream

Okay if we just do this and you can change your configs to use mediawiki.page_change.v1 once we're done?

Change 920696 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/deployment-charts@master] services: change lift wing's kafka topic in changeprop's config

https://gerrit.wikimedia.org/r/920696

Change 920696 merged by Elukey:

[operations/deployment-charts@master] services: change lift wing's kafka topic in changeprop's config

https://gerrit.wikimedia.org/r/920696

Commented in https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/381 that the drafttopic stream is now using the new page_change source, so we should be ready to proceed. Thanks!

The drafttopic stream has had its inputs changed. Note that articletopic is still being read from the event.mediawiki_revision_score table. This is intended to be replaced with a different model, but it hasn't changed over yet. The articletopic work is tracked in T328276.