
Rename weighted_tags referencing ores in their names
Closed, Resolved · Public · 5 Estimated Story Points

Description

We have two confusing weighted_tags currently named:

  • classification.ores.articletopic
  • classification.ores.drafttopic

We should rename them:

  • classification.prediction.articletopic
  • classification.prediction.drafttopic

We currently don't have a clear migration plan for such renames, but it could be:

  • Investigate all direct usages of the tag names and write a backward-compatible (BC) version of them
    • ArticleTopic search keyword
    • Investigate possible other users via codesearch
  • Adapt producers to produce the new name
    • AFAIK the SUP (Search Update Pipeline) itself controls these names
    • Possibly issue an additional clear command on the old names whenever the tag is updated (to try to avoid having competing data under both names)
  • Attempt to write a generic Painless script to run during a reindex to copy the data from the old tag to the new tag.
  • Update the documentation that still references these tag names
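
The backward-compatible read side of the plan above could look roughly like the following sketch. It only illustrates the idea of matching both names while data may still live under either one; the helper name `prefixes_to_query` and the `RENAMED_PREFIXES` table are hypothetical and are not taken from CirrusSearch. Only the four tag prefixes come from this task.

```python
# Hypothetical mapping: only the four prefix strings come from the task description.
RENAMED_PREFIXES = {
    "classification.ores.articletopic": "classification.prediction.articletopic",
    "classification.ores.drafttopic": "classification.prediction.drafttopic",
}


def prefixes_to_query(prefix: str) -> list[str]:
    """Return every prefix a reader should match while both names may hold data.

    The new name is always listed first so callers can prefer it.
    """
    new_to_old = {new: old for old, new in RENAMED_PREFIXES.items()}
    if prefix in RENAMED_PREFIXES:   # caller still uses the old name
        return [RENAMED_PREFIXES[prefix], prefix]
    if prefix in new_to_old:         # caller already uses the new name
        return [prefix, new_to_old[prefix]]
    return [prefix]                  # unrelated tag: nothing to expand
```

Once all indices have been reindexed and no longer carry the old names, such an expansion can be dropped and only the new prefix queried.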

AC:

  • CirrusSearch indices should not have any tags referencing ores.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Gehel set the point value for this task to 5.

Change #1129871 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[mediawiki/extensions/CirrusSearch@master] Reindexer: Allow to rename weighted tag prefixes

https://gerrit.wikimedia.org/r/1129871

I extended the Reindexer so that it replaces weighted tag prefixes when configured via --weightedTagsPrefixMap <old>:<new> [,<old>:<new>, …]. This works at least in a local MediaWiki setup.
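
To illustrate what such a prefix map does, here is a minimal Python sketch. It is not the Reindexer code from the patch. It assumes that each weighted tag is stored as a string of the form `prefix/name|weight`, and the function names `parse_prefix_map` and `rename_weighted_tags` are hypothetical.

```python
def parse_prefix_map(spec: str) -> dict[str, str]:
    """Parse '<old>:<new>,<old>:<new>' into a dict of old prefix -> new prefix."""
    mapping: dict[str, str] = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate stray commas and whitespace
        old, sep, new = pair.partition(":")
        old, new = old.strip(), new.strip()
        if not sep or not old or not new:
            raise ValueError(f"expected <old>:<new>, got {pair!r}")
        mapping[old] = new
    return mapping


def rename_weighted_tags(tags: list[str], mapping: dict[str, str]) -> list[str]:
    """Replace the prefix of each tag whose prefix appears in the mapping.

    Assumption: a tag looks like 'prefix/name|weight'; everything after the
    first '/' is kept untouched.
    """
    renamed = []
    for tag in tags:
        prefix, sep, rest = tag.partition("/")
        if sep and prefix in mapping:
            tag = mapping[prefix] + sep + rest
        renamed.append(tag)
    return renamed
```

A usage example under the same assumptions:

```python
mapping = parse_prefix_map(
    "classification.ores.articletopic:classification.prediction.articletopic, "
    "classification.ores.drafttopic:classification.prediction.drafttopic"
)
tags = ["classification.ores.articletopic/Culture.Biography|500", "some.other.tag/x|1"]
print(rename_weighted_tags(tags, mapping))
```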

Change #1130090 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[mediawiki/extensions/GrowthExperiments@master] CirrusSearch: rename ORES weighted tags

https://gerrit.wikimedia.org/r/1130090

Updated https://www.mediawiki.org/wiki/Help:CirrusSearch#Articletopic by replacing the links that pointed specifically to ORES with links pointing to ML.

Change #1130523 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[mediawiki/extensions/CirrusSearch@master] Rename ORES weighted tags

https://gerrit.wikimedia.org/r/1130523

According to @Michael, importOresTopics is not executed periodically and is only intended for importing data into a test instance. If it runs at all, it runs in a beta cluster, never in production. So we don't have to consider its patch when ordering the patches related to this task.

Change #1129871 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Reindexer: Allow to rename weighted tag prefixes

https://gerrit.wikimedia.org/r/1129871

Change #1130523 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Rename ORES weighted tags

https://gerrit.wikimedia.org/r/1130523

Change #1135010 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[operations/mediawiki-config@master] CirrusSearch: weighted tags mapping (during maintenance inflicted reindexing)

https://gerrit.wikimedia.org/r/1135010

Change #1135019 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[operations/deployment-charts@master] Search update pipeline: 504 handling, weighted tags rename

https://gerrit.wikimedia.org/r/1135019

Change #1130090 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] CirrusSearch: rename ORES weighted tags

https://gerrit.wikimedia.org/r/1130090

This should be deployed now, but we'll only see whether it works during or after the next full reindex, which might not happen until the migration to OpenSearch is completed.

Change #1135019 merged by jenkins-bot:

[operations/deployment-charts@master] Search update pipeline: 504 handling, weighted tags rename

https://gerrit.wikimedia.org/r/1135019

Change #1135010 merged by jenkins-bot:

[operations/mediawiki-config@master] CirrusSearch: weighted tags mapping (during maintenance inflicted reindexing)

https://gerrit.wikimedia.org/r/1135010

Change #1144600 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[operations/mediawiki-config@master] CirrusSearch: weighted tags mapping (during maintenance inflicted reindexing)

https://gerrit.wikimedia.org/r/1144600

Change #1144600 merged by jenkins-bot:

[operations/mediawiki-config@master] CirrusSearch: weighted tags mapping (during maintenance inflicted reindexing)

https://gerrit.wikimedia.org/r/1144600

Mentioned in SAL (#wikimedia-operations) [2025-05-13T13:07:34Z] <lucaswerkmeister-wmde@deploy1003> Started scap sync-world: Backport for [[gerrit:1144600|CirrusSearch: weighted tags mapping (during maintenance inflicted reindexing) (T389053)]]

Mentioned in SAL (#wikimedia-operations) [2025-05-13T13:14:06Z] <lucaswerkmeister-wmde@deploy1003> lucaswerkmeister-wmde, pfischer: Backport for [[gerrit:1144600|CirrusSearch: weighted tags mapping (during maintenance inflicted reindexing) (T389053)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-05-13T13:22:54Z] <lucaswerkmeister-wmde@deploy1003> Finished scap sync-world: Backport for [[gerrit:1144600|CirrusSearch: weighted tags mapping (during maintenance inflicted reindexing) (T389053)]] (duration: 15m 19s)

Change #1191306 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/CirrusSearch@master] Stop querying deprecated ores weighted_tags

https://gerrit.wikimedia.org/r/1191306

Change #1191306 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Stop querying deprecated ores weighted_tags

https://gerrit.wikimedia.org/r/1191306

Change #1193052 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] cirrus: stop copying ores weighted_tags

https://gerrit.wikimedia.org/r/1193052

Change #1193052 merged by jenkins-bot:

[operations/mediawiki-config@master] cirrus: stop copying ores weighted_tags

https://gerrit.wikimedia.org/r/1193052

Mentioned in SAL (#wikimedia-operations) [2025-10-07T07:05:33Z] <dcausse@deploy2002> Started scap sync-world: Backport for [[gerrit:1193052|cirrus: stop copying ores weighted_tags (T389053)]], [[gerrit:1193092|cirrus: test completion with default sort on simplewiki [2/3] (T404858)]]

Mentioned in SAL (#wikimedia-operations) [2025-10-07T07:11:53Z] <dcausse@deploy2002> dcausse: Backport for [[gerrit:1193052|cirrus: stop copying ores weighted_tags (T389053)]], [[gerrit:1193092|cirrus: test completion with default sort on simplewiki [2/3] (T404858)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-10-07T07:21:05Z] <dcausse@deploy2002> Finished scap sync-world: Backport for [[gerrit:1193052|cirrus: stop copying ores weighted_tags (T389053)]], [[gerrit:1193092|cirrus: test completion with default sort on simplewiki [2/3] (T404858)]] (duration: 15m 32s)