Migrate Ttmserver (Translatewiki application) indices from production OpenSearch to OpenSearch on k8s
Open, In Progress, Needs Triage, Public

Description

Per parent ticket, we need to migrate the ttmserver indices from the production CirrusSearch cluster to OpenSearch on k8s.

Here is the tentative maintenance plan, subject to approval by stakeholders:

  • Stand up opensearch-ttmserver OpenSearch clusters in k8s (DPE SRE)
  • Verify access to new clusters (@Nikerabbit)
  • TBD: Load the ttmserver indices into K8s. This will be a joint effort between DPE SRE and @Nikerabbit, as it requires familiarity with OpenSearch on one hand and the application on the other. I'm guessing it will be similar to how we generate the search indices in production OpenSearch.
  • Verify the newly created indices are correct and usable from K8s (joint effort)
  • Cut over the application to use the new OpenSearch clusters (@Nikerabbit)

@Nikerabbit, let us know if that sounds like a reasonable plan. We will let you know when the new cluster is ready for testing, and feel free to ping us here or in Slack if you have any questions about how to access it.

Details

Related Changes in Gerrit:
Repo                             Branch       Lines +/-
operations/deployment-charts     master       +54 -8
operations/mediawiki-config      master       +35 -6
operations/mediawiki-config      master       +33 -13
operations/mediawiki-config      master       +33 -13
operations/puppet                production   +2 -2
operations/deployment-charts     master       +3 -0
operations/mediawiki-config      master       +12 -39
operations/mediawiki-config      master       +39 -12
operations/mediawiki-config      master       +1 -1
operations/puppet                production   +50 -4
operations/puppet                production   +2 -0
operations/puppet                production   +31 -2
operations/deployment-charts     master       +2 -2
mediawiki/extensions/Translate   master       +10 -3
operations/mediawiki-config      master       +36 -10
operations/mediawiki-config      master       +36 -10
operations/puppet                production   +13 -0

Event Timeline

Change #1287388 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/puppet@production] services_proxy: isetting up toolhub and ttmserver

https://gerrit.wikimedia.org/r/1287388

Change #1287388 merged by Atsuko:

[operations/puppet@production] services_proxy: isetting up toolhub and ttmserver

https://gerrit.wikimedia.org/r/1287388

Change #1287428 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/puppet@production] services_proxy: enabling toolhub and ttmserver

https://gerrit.wikimedia.org/r/1287428

Change #1287428 merged by Atsuko:

[operations/puppet@production] services_proxy: enabling toolhub and ttmserver

https://gerrit.wikimedia.org/r/1287428

Something seems to have broken translation memory. https://www.mediawiki.org/w/api.php?action=translationaids&format=jsonfm&title=Translations%3AManual%3APywikibot%2Freplace.py%2F80%2Fde&uselang=en gives:

"ttmserver": {
    "error": "Elastica exception: Elastica\\Exception\\ResponseException: no such index [ttmserver] [index: ttmserver] in /srv/mediawiki/php-1.47.0-wmf.2/vendor/ruflin/elastica/src/Transport/Http.php:178\nStack trace:\n#0 /srv/mediawiki/php-1.47.0-wmf.2/vendor/ruflin/elastica/src/Request.php(183): Elastica\\Transport\\Http->exec(Object(Elastica\\Request), Array)\n#1 /srv/mediawiki/php-1.47.0-wmf.2/vendor/ruflin/elastica/src/Client.php(545): Elastica\\Request->send()\n#2 /srv/mediawiki/php-1.47.0-wmf.2/vendor/ruflin/elastica/src/Search.php(352): Elastica\\Client->request('ttmserver/_sear...', 'POST', Array, Array)\n#3 /srv/mediawiki/php-1.47.0-wmf.2/vendor/ruflin/elastica/src/Index.php(546): Elastica\\Search->search(Object(Elastica\\Query), NULL, 'POST')\n#4 /srv/mediawiki/php-1.47.0-wmf.2/extensions/Translate/src/TtmServer/ElasticSearchTtmServer.php(145): Elastica\\Index->search(Object(Elastica\\Query))\n#5 /srv/mediawiki/php-1.47.0-wmf.2/extensions/Translate/src/TtmServer/ElasticSearchTtmServer.php(72): MediaWiki\\Extension\\Translate\\TtmServer\\ElasticSearchTtmServer->doQuery('en', 'de', 'between $1 and ...')\n#6 /srv/mediawiki/php-1.47.0-wmf.2/extensions/Translate/src/TranslatorInterface/Aid/TTMServerAid.php(69): MediaWiki\\Extension\\Translate\\TtmServer\\ElasticSearchTtmServer->query('en', 'de', 'between $1 and ...')\n#7 /srv/mediawiki/php-1.47.0-wmf.2/extensions/Translate/src/TranslatorInterface/Aid/TranslationAidsActionApi.php(112): MediaWiki\\Extension\\Translate\\TranslatorInterface\\Aid\\TTMServerAid->getData()\n#8 /srv/mediawiki/php-1.47.0-wmf.2/includes/Api/ApiMain.php(2045): MediaWiki\\Extension\\Translate\\TranslatorInterface\\Aid\\TranslationAidsActionApi->execute()\n#9 /srv/mediawiki/php-1.47.0-wmf.2/includes/Api/ApiMain.php(949): MediaWiki\\Api\\ApiMain->executeAction()\n#10 /srv/mediawiki/php-1.47.0-wmf.2/includes/Api/ApiMain.php(920): MediaWiki\\Api\\ApiMain->executeActionWithErrorHandling()\n#11 
/srv/mediawiki/php-1.47.0-wmf.2/includes/Api/ApiEntryPoint.php(138): MediaWiki\\Api\\ApiMain->execute()\n#12 /srv/mediawiki/php-1.47.0-wmf.2/includes/MediaWikiEntryPoint.php(180): MediaWiki\\Api\\ApiEntryPoint->execute()\n#13 /srv/mediawiki/php-1.47.0-wmf.2/api.php(30): MediaWiki\\MediaWikiEntryPoint->run()\n#14 /srv/mediawiki/w/api.php(3): require('/srv/mediawiki/...')\n#15 {main}"
},

Reported via T426467: Translation memory is not working on multiple Wikimedia wikis

Apologies about this; we only tested Special:SearchTranslations on meta and wikidata. From the config we thought that adding a new ttmserver service would not make it queryable by default. Looking at \MediaWiki\Extension\Translate\TranslatorInterface\Aid\TTMServerAid::getInternalServices, I don't see where we actually choose one of the ttmserver mirrors. I suspect this is an oversight of the work done in T132076. In its current shape I understand that TTMServerAid is querying all mirrors and thus is making 3 requests to default, eqiad and codfw ttmserver services (now 4 with the new test). TTMServerAid should probably be fixed to query only one of these services.
In the meantime, to fix the problem quickly, we could create an empty ttmserver index in opensearch-ttmserver-test.

In its current shape I understand that TTMServerAid is querying all mirrors and thus is making 3 requests to default, eqiad and codfw ttmserver services (now 4 with the new test).

This is not true: as pointed out by @Nikerabbit, TTMServerAid does exclude writable services. However, since we added the new test service as not writable (to avoid update failures), TTMServerAid took it into account.
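The selection rule described here can be modelled in a few lines. This is a minimal Python sketch, not the extension's PHP code: the `readable_services` helper, the service names and the `writable` flags are illustrative assumptions chosen to mirror the situation in this thread.

```python
def readable_services(services):
    """Return the names of services a read-only consumer would query.

    Models the rule described in the thread: services flagged as
    writable are skipped, every other service is queried.
    """
    return [name for name, cfg in services.items() if not cfg.get("writable", False)]


# Hypothetical configuration; the flags are assumptions, not production values.
before = {
    "default": {"writable": True},
    "eqiad": {"writable": False},
}
# Adding the test service as non-writable turns it into a read target too.
after = {**before, "eqiad-test": {"writable": False}}

print(readable_services(before))  # ['eqiad']
print(readable_services(after))   # ['eqiad', 'eqiad-test']
```

Under this rule, a service whose index does not exist yet is still queried as soon as it is registered as non-writable, which matches the "no such index" error reported in T426467.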

  1. The version check in ttmserver-export.php is released in r/1286978.
  2. We need to fix MediaWiki\Extension\Translate\TtmServer\ElasticSearchTtmServer::getReplicaCount to work with the current config in wmf-config/CommonSettings.php, where we use an integer but the method expects a string; see P92612.
  3. We need a release plan for the switchover on the production cluster, since adding the server without indices breaks read operations; see T426467.
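Item 2 can be illustrated with a small normalisation step. This is a minimal Python sketch (the real method is PHP), assuming the configured value is either a fixed replica count or a range string such as `0-2` for OpenSearch's `auto_expand_replicas` index setting. The helper name `replica_settings` is hypothetical and the actual fix may look different.

```python
def replica_settings(replicas):
    """Map a configured replica value to index settings.

    Accepts an int (e.g. 1) or a string (e.g. "1" or "0-2"), so a
    config that uses integers no longer breaks string handling.
    """
    value = str(replicas)  # normalise int and str to one type first
    if "-" in value:
        # Range form: let the cluster expand replicas automatically.
        return {"auto_expand_replicas": value}
    return {"number_of_replicas": int(value)}


print(replica_settings(1))      # {'number_of_replicas': 1}
print(replica_settings("0-2"))  # {'auto_expand_replicas': '0-2'}
```

Converting to a string before inspecting the value is the key step: string operations then behave the same whether the config holds `1` or `"1"`.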

Change #1294949 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/mediawiki-config@master] translate: adding separate read/write endpoints

https://gerrit.wikimedia.org/r/1294949

atsuko changed the task status from Stalled to In Progress. Edited Fri, May 29, 12:21 PM

The mediawiki-config backport is scheduled for Tuesday. If everything goes well, we will start a new cluster and switch over production as well.

Change #1295901 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/puppet@production] service: services_proxy: prod opensearch-on-k8s services

https://gerrit.wikimedia.org/r/1295901

Change #1295901 merged by Atsuko:

[operations/puppet@production] service: services_proxy: prod opensearch-on-k8s services

https://gerrit.wikimedia.org/r/1295901

Change #1294949 merged by jenkins-bot:

[operations/mediawiki-config@master] translate: adding separate read/write endpoints

https://gerrit.wikimedia.org/r/1294949

Mentioned in SAL (#wikimedia-operations) [2026-06-02T07:19:21Z] <atsuko@deploy1003> Started scap sync-world: Backport for [[gerrit:1294949|translate: adding separate read/write endpoints (T425377)]]

Mentioned in SAL (#wikimedia-operations) [2026-06-02T07:21:17Z] <atsuko@deploy1003> atsuko: Backport for [[gerrit:1294949|translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-06-02T07:40:23Z] <atsuko@deploy1003> Finished scap sync-world: Backport for [[gerrit:1294949|translate: adding separate read/write endpoints (T425377)]] (duration: 21m 01s)

Change #1296262 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/mediawiki-config@master] translate: fixing missed variable in credentials formatting closure

https://gerrit.wikimedia.org/r/1296262

Change #1296262 merged by jenkins-bot:

[operations/mediawiki-config@master] translate: fixing missed variable in credentials formatting closure

https://gerrit.wikimedia.org/r/1296262

Mentioned in SAL (#wikimedia-operations) [2026-06-02T07:49:02Z] <atsuko@deploy1003> Started scap sync-world: Backport for [[gerrit:1296262|translate: fixing missed variable in credentials formatting closure (T425377)]]

Mentioned in SAL (#wikimedia-operations) [2026-06-02T07:50:46Z] <atsuko@deploy1003> atsuko: Backport for [[gerrit:1296262|translate: fixing missed variable in credentials formatting closure (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-06-02T08:03:49Z] <atsuko@deploy1003> Finished scap sync-world: Backport for [[gerrit:1296262|translate: fixing missed variable in credentials formatting closure (T425377)]] (duration: 14m 47s)

Change #1296488 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/mediawiki-config@master] Revert "translate: adding separate read/write endpoints"

https://gerrit.wikimedia.org/r/1296488

Change #1296488 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "translate: adding separate read/write endpoints"

https://gerrit.wikimedia.org/r/1296488

Mentioned in SAL (#wikimedia-operations) [2026-06-02T08:13:30Z] <atsuko@deploy1003> Started scap sync-world: Backport for [[gerrit:1296488|Revert "translate: adding separate read/write endpoints" (T425377)]]

Mentioned in SAL (#wikimedia-operations) [2026-06-02T08:15:15Z] <atsuko@deploy1003> atsuko: Backport for [[gerrit:1296488|Revert "translate: adding separate read/write endpoints" (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-06-02T08:17:03Z] <atsuko@deploy1003> Finished scap sync-world: Backport for [[gerrit:1296488|Revert "translate: adding separate read/write endpoints" (T425377)]] (duration: 03m 33s)

@dcausse and I didn't proceed with the roll-out after running checks on testwiki: even though the credentials were in the config, we were still getting an unauthenticated access error from the client.

[eqiad-test] => Array
    (
        [type] => ttmserver
        [class] => ElasticSearchTTMServer
        [shards] => 1
        [index] => ttmserver-test
        [cutoff] => 0.65
        [writable] => 1
        [use_wikimedia_extra] => 1
        [config] => Array
            (
                [servers] => Array
                    (
                        [0] => Array
                            (
                                [host] => opensearch-ttmserver-test.svc.eqiad.wmnet
                                [transport] => CirrusSearch\Elastica\DeprecationLoggedHttps
                                [port] => 30443
                                [username] => opensearch
                                [password] => <REDACTED>
                            )

                    )

            )

    )
#0 /srv/mediawiki/php-1.47.0-wmf.4/vendor/ruflin/elastica/src/Request.php(183): Elastica\Transport\Http->exec(Object(Elastica\Request), Array)
#1 /srv/mediawiki/php-1.47.0-wmf.4/vendor/ruflin/elastica/src/Client.php(545): Elastica\Request->send()
#2 /srv/mediawiki/php-1.47.0-wmf.4/vendor/ruflin/elastica/src/Client.php(580): Elastica\Client->request('ttmserver-test', 'PUT', Array, Array)
#3 /srv/mediawiki/php-1.47.0-wmf.4/vendor/ruflin/elastica/src/Index.php(774): Elastica\Client->requestEndpoint(Object(Elasticsearch\Endpoints\Indices\Create))
#4 /srv/mediawiki/php-1.47.0-wmf.4/vendor/ruflin/elastica/src/Index.php(510): Elastica\Index->requestEndpoint(Object(Elasticsearch\Endpoints\Indices\Create))
#5 /srv/mediawiki/php-1.47.0-wmf.4/extensions/Translate/src/TtmServer/ElasticSearchTtmServer.php(338): Elastica\Index->create(Array, Array)
#6 /srv/mediawiki/php-1.47.0-wmf.4/extensions/Translate/src/TtmServer/ElasticSearchTtmServer.php(352): MediaWiki\Extension\Translate\TtmServer\ElasticSearchTtmServer->createIndex(false)
#7 /srv/mediawiki/php-1.47.0-wmf.4/extensions/Translate/scripts/ttmserver-export.php(225): MediaWiki\Extension\Translate\TtmServer\ElasticSearchTtmServer->beginBootstrap()
#8 /srv/mediawiki/php-1.47.0-wmf.4/extensions/Translate/scripts/ttmserver-export.php(102): TTMServerBootstrap->beginBootstrap(Object(MediaWiki\Extension\Translate\TtmServer\ElasticSearchTtmServer), 'eqiad-test')
#9 /srv/mediawiki/php-1.47.0-wmf.4/maintenance/includes/MaintenanceRunner.php(692): TTMServerBootstrap->execute()
#10 /srv/mediawiki/php-1.47.0-wmf.4/maintenance/run.php(53): MediaWiki\Maintenance\MaintenanceRunner->run()
#11 /srv/mediawiki/multiversion/MWScript.php(219): require_once('/srv/mediawiki/...')

Change #1296539 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/deployment-charts@master] opensearch-cluster: anonymous access for ttmsearch and toolhub

https://gerrit.wikimedia.org/r/1296539

Change #1296539 merged by jenkins-bot:

[operations/deployment-charts@master] opensearch-cluster: anonymous access for ttmsearch and toolhub

https://gerrit.wikimedia.org/r/1296539

Indices are now accessible without a password. Before:

atsuko@deploy1003:~$ curl -X PUT https://opensearch-ttmserver-test.svc.eqiad.wmnet:30443/ttmserver/_settings -H "Content-Type: application/json" -d '{"index": {"number_of_replicas": 1}}'; echo
{"error":{"root_cause":[{"type":"security_exception","reason":"no permissions for [indices:admin/settings/update] and User [name=opendistro_security_anonymous, backend_roles=[opendistro_security_anonymous_backendrole], requestedTenant=null]"}],"type":"security_exception","reason":"no permissions for [indices:admin/settings/update] and User [name=opendistro_security_anonymous, backend_roles=[opendistro_security_anonymous_backendrole], requestedTenant=null]"},"status":403}

After:

atsuko@deploy1003:~$ curl -X PUT https://opensearch-ttmserver-test.svc.codfw.wmnet:30443/ttmserver/_settings -H "Content-Type: application/json" -d '{"index": {"number_of_replicas": 1}}'; echo
{"acknowledged":true}
atsuko@deploy1003:~$ curl -X PUT https://opensearch-ttmserver-test.svc.eqiad.wmnet:30443/ttmserver/_settings -H "Content-Type: application/json" -d '{"index": {"number_of_replicas": 1}}'; echo
{"acknowledged":true}

Change #1296608 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/puppet@production] services_proxy: switch to prod opensearch-on-k8s services

https://gerrit.wikimedia.org/r/1296608

Change #1296608 merged by Atsuko:

[operations/puppet@production] services_proxy: switch to prod opensearch-on-k8s services

https://gerrit.wikimedia.org/r/1296608

Change #1296631 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/mediawiki-config@master] translate: adding separate read/write endpoints

https://gerrit.wikimedia.org/r/1296631

Change #1296631 merged by jenkins-bot:

[operations/mediawiki-config@master] translate: adding separate read/write endpoints

https://gerrit.wikimedia.org/r/1296631

Mentioned in SAL (#wikimedia-operations) [2026-06-03T14:05:40Z] <dcausse@deploy1003> Finished scap sync-world: Backport for [[gerrit:1296631|translate: adding separate read/write endpoints (T425377)]] (duration: 13m 06s)

Attempted a deploy and got:

Elastica\Exception\ResponseException from line 178 of /srv/mediawiki/php-1.47.0-wmf.5/vendor/ruflin/elastica/src/Transport/Http.php: request [/ttmserver-test/_mapping] contains unrecognized parameter: [include_type_name]
#0 /srv/mediawiki/php-1.47.0-wmf.5/vendor/ruflin/elastica/src/Request.php(183): Elastica\Transport\Http->exec(Object(Elastica\Request), Array)
#1 /srv/mediawiki/php-1.47.0-wmf.5/vendor/ruflin/elastica/src/Client.php(545): Elastica\Request->send()
#2 /srv/mediawiki/php-1.47.0-wmf.5/vendor/ruflin/elastica/src/Client.php(580): Elastica\Client->request('ttmserver-test/...', 'PUT', Array, Array)
#3 /srv/mediawiki/php-1.47.0-wmf.5/vendor/ruflin/elastica/src/Index.php(774): Elastica\Client->requestEndpoint(Object(Elasticsearch\Endpoints\Indices\PutMapping))
#4 /srv/mediawiki/php-1.47.0-wmf.5/vendor/ruflin/elastica/src/Mapping.php(172): Elastica\Index->requestEndpoint(Object(Elasticsearch\Endpoints\Indices\PutMapping))
#5 /srv/mediawiki/php-1.47.0-wmf.5/extensions/Translate/src/TtmServer/ElasticSearchTtmServer.php(390): Elastica\Mapping->send(Object(Elastica\Index), Array)
#6 /srv/mediawiki/php-1.47.0-wmf.5/extensions/Translate/scripts/ttmserver-export.php(225): MediaWiki\Extension\Translate\TtmServer\ElasticSearchTtmServer->beginBootstrap()
#7 /srv/mediawiki/php-1.47.0-wmf.5/extensions/Translate/scripts/ttmserver-export.php(102): TTMServerBootstrap->beginBootstrap(Object(MediaWiki\Extension\Translate\TtmServer\ElasticSearchTtmServer), 'codfw-test')
#8 /srv/mediawiki/php-1.47.0-wmf.5/maintenance/includes/MaintenanceRunner.php(692): TTMServerBootstrap->execute()
#9 /srv/mediawiki/php-1.47.0-wmf.5/maintenance/run.php(53): MediaWiki\Maintenance\MaintenanceRunner->run()
#10 /srv/mediawiki/multiversion/MWScript.php(219): require_once('/srv/mediawiki/...')

MW was able to initiate the creation of the index, so the permissions/auth issues are solved.
The error indicates that Translate is not compatible with OpenSearch 2 and needs to be adapted.
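As the error shows, the cluster rejects the `include_type_name` request parameter on the mapping call. A minimal Python sketch of the kind of adaptation needed, assuming the fix is simply to stop sending that parameter; the helper name `without_type_name` and the sample parameters are hypothetical, and the real change lives in the PHP client code.

```python
def without_type_name(params):
    """Return a copy of the query parameters without include_type_name.

    The input dict is left untouched so callers can keep reusing it.
    """
    return {key: value for key, value in params.items() if key != "include_type_name"}


# Hypothetical parameters for a PUT /ttmserver-test/_mapping request.
legacy_params = {"include_type_name": "true", "timeout": "30s"}
print(without_type_name(legacy_params))  # {'timeout': '30s'}
```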

atsuko updated Other Assignee, added: atsuko.
atsuko subscribed.

Assigning to @Nikerabbit for further guidance on how we can proceed

atsuko changed the task status from In Progress to Stalled. Thu, Jun 4, 11:26 AM
atsuko updated Other Assignee, removed: atsuko.

Tested the proposed patch; it seems to unblock us. We can continue with the release on Tuesday.

atsuko changed the task status from Stalled to In Progress. Tue, Jun 9, 10:28 AM

The fix in T428168 seems to be working. We need to either backport it to wmf.5 or wait for wmf.6 to be fully released. Preparing the switchover diff in the meantime.

Change #1299529 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/mediawiki-config@master] translate: adding separate read/write endpoints

https://gerrit.wikimedia.org/r/1299529

Change #1299529 merged by jenkins-bot:

[operations/mediawiki-config@master] translate: adding separate read/write endpoints

https://gerrit.wikimedia.org/r/1299529

Mentioned in SAL (#wikimedia-operations) [2026-06-10T07:15:30Z] <atsuko@deploy1003> Started scap sync-world: Backport for [[gerrit:1299556|ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561|ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529|translate: adding separate read/write endpoints (T425377)]]

Mentioned in SAL (#wikimedia-operations) [2026-06-10T07:17:44Z] <atsuko@deploy1003> atsuko: Backport for [[gerrit:1299556|ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561|ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529|translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be veri

Mentioned in SAL (#wikimedia-operations) [2026-06-10T07:29:34Z] <atsuko@deploy1003> Finished scap sync-world: Backport for [[gerrit:1299556|ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561|ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529|translate: adding separate read/write endpoints (T425377)]] (duration: 14m 03s)

Deployed to the test server. Test indices have been regenerated; prod indices are getting data from usage.

atsuko@deploy1003:~$ for s in opensearch-ttmserver{,-test}.svc.{eqiad,codfw}.wmnet; do echo $s; curl --silent https://${s}:30443/_cat/indices | grep ttmserver ; done
opensearch-ttmserver.svc.eqiad.wmnet
...
opensearch-ttmserver-test.svc.eqiad.wmnet
green open ttmserver                    DiAisX6OT4W1LCHHiBg_fw 1 1    3    0  56.9kb  28.4kb
green open ttmserver-test               ZmbcgSQcSNOCOF9kh7QycQ 1 1 4445    0   3.1mb   1.5mb
opensearch-ttmserver-test.svc.codfw.wmnet
green open ttmserver                    lAGrlZyoR7WbOrKHFGhS4Q 1 1    3   0  55.5kb  27.7kb
green open ttmserver-test               RYtD_4ATSoeD-eyNj3Tm5g 1 1 4445   0   3.1mb   1.5mb

Plan

  1. Figure out a config patch to split the test and prod servers, so that testwiki goes to ttmserver-test and prod goes to ttmserver.
  2. Test how prod index regeneration works on testwiki using the script.
  3. Plan the switchover.
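Step 1 of the plan could be sketched as a small routing function. This is one possible reading, written in Python for illustration rather than as a mediawiki-config patch: the function `ttm_target` is hypothetical, and it assumes that testwiki uses the opensearch-ttmserver-test cluster while every other wiki uses opensearch-ttmserver. The host pattern and port 30443 are taken from the curl commands earlier in this thread.

```python
def ttm_target(wiki, datacenter):
    """Pick the TTM cluster host and index for a wiki (illustrative only)."""
    # Assumption: only testwiki is routed to the test cluster and index.
    suffix = "-test" if wiki == "testwiki" else ""
    return {
        "host": f"opensearch-ttmserver{suffix}.svc.{datacenter}.wmnet",
        "port": 30443,
        "index": f"ttmserver{suffix}",
    }


print(ttm_target("testwiki", "eqiad")["index"])  # ttmserver-test
print(ttm_target("metawiki", "codfw")["host"])   # opensearch-ttmserver.svc.codfw.wmnet
```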

Notes:

dcausse: the prod index will take forever to populate via ttmserver-refresh and won't be doable during the scap test phase.

Mentioned in SAL (#wikimedia-operations) [2026-06-10T13:58:38Z] <atsuko@deploy1003> mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # T425377 populating production index on test cluster to estimate time required for the release

Mentioned in SAL (#wikimedia-operations) [2026-06-10T14:24:54Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate

Mentioned in SAL (#wikimedia-operations) [2026-06-10T14:54:14Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release, now with correct schema

Mentioned in SAL (#wikimedia-operations) [2026-06-10T15:20:39Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release

Mentioned in SAL (#wikimedia-operations) [2026-06-10T15:33:29Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)

Reindexing took ~15h. Here's a comparison of the prod index and the new index:

atsuko@deploy1003:~$ curl http://localhost:6102/_cat/indices/ttmserver?v
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   ttmserver WnIZZzjmS26Uup3tMDSrfA   1   2    5600092      1986618     16.5gb          5.5gb
health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   ttmserver f62XM4FjQL2PlVDZlEoEJw   1   2    5141294         2571     10.4gb          3.4gb
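To put numbers on the comparison, the two `_cat/indices` rows can be parsed and diffed. A minimal Python sketch, assuming the first row is the existing production index and the second the rebuilt one, in the order the comment names them; the helper `parse_cat_indices` is hypothetical and only handles the header-plus-rows layout shown above.

```python
def parse_cat_indices(text):
    """Parse `_cat/indices?v` output (header row plus data rows) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


HEADER = "health status index uuid pri rep docs.count docs.deleted store.size pri.store.size"
# Rows copied from the two outputs above (whitespace collapsed).
old = parse_cat_indices(
    HEADER + "\ngreen open ttmserver WnIZZzjmS26Uup3tMDSrfA 1 2 5600092 1986618 16.5gb 5.5gb"
)[0]
new = parse_cat_indices(
    HEADER + "\ngreen open ttmserver f62XM4FjQL2PlVDZlEoEJw 1 2 5141294 2571 10.4gb 3.4gb"
)[0]

doc_gap = int(old["docs.count"]) - int(new["docs.count"])
gap_pct = 100 * doc_gap / int(old["docs.count"])
print(doc_gap)            # 458798
print(round(gap_pct, 1))  # 8.2
```

Under that assumption the rebuilt index holds about 8.2% fewer live documents. The older index also reports 1,986,618 deleted documents against 2,571 in the new one, which likely contributes to the difference in store size.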

Mentioned in SAL (#wikimedia-operations) [2026-06-10T17:33:22Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)

Mentioned in SAL (#wikimedia-operations) [2026-06-11T08:22:18Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)

Mentioned in SAL (#wikimedia-operations) [2026-06-11T08:25:01Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)

Mentioned in SAL (#wikimedia-operations) [2026-06-11T08:33:57Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)

Mentioned in SAL (#wikimedia-operations) [2026-06-11T08:34:43Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # T425377 populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)

Change #1300813 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/deployment-charts@master] opensearch-ttmserver: increase memory to 1x index

https://gerrit.wikimedia.org/r/1300813

Trying to increase the memory available to OpenSearch to handle the large index, because P90+ doesn't look good. However, we don't have metrics from cloudelastic-chi to compare with.

Change #1301373 had a related patch set uploaded (by Atsuko; author: Atsuko):

[operations/mediawiki-config@master] translate: production opensearch on k8s endpoints

https://gerrit.wikimedia.org/r/1301373

Change #1301373 merged by jenkins-bot:

[operations/mediawiki-config@master] translate: production opensearch on k8s endpoints

https://gerrit.wikimedia.org/r/1301373

Mentioned in SAL (#wikimedia-operations) [2026-06-15T07:49:54Z] <atsuko@deploy1003> Started scap sync-world: Backport for [[gerrit:1301373|translate: production opensearch on k8s endpoints (T425377)]]

Mentioned in SAL (#wikimedia-operations) [2026-06-15T07:53:48Z] <atsuko@deploy1003> atsuko: Backport for [[gerrit:1301373|translate: production opensearch on k8s endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-06-15T08:10:49Z] <atsuko@deploy1003> Finished scap sync-world: Backport for [[gerrit:1301373|translate: production opensearch on k8s endpoints (T425377)]] (duration: 20m 54s)

Translate is now running on both prod CirrusSearch and prod K8s; we will switch off CirrusSearch when reindexing finishes. We are temporarily increasing available memory for the duration of the reindex.

Change #1300813 merged by jenkins-bot:

[operations/deployment-charts@master] opensearch-ttmserver: increase memory to 1x index

https://gerrit.wikimedia.org/r/1300813

Mentioned in SAL (#wikimedia-operations) [2026-06-15T09:40:35Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # T425377: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)

Mentioned in SAL (#wikimedia-operations) [2026-06-15T11:43:00Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # T425377: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)

Mentioned in SAL (#wikimedia-operations) [2026-06-15T11:44:45Z] <atsuko@deploy1003> mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # T425377: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)