Prepare and check storage layer for btmwiki
Closed, Resolved (Public)

Description

The new wiki's visibility will be: Public.

Event Timeline

ABran-WMF claimed this task.
ABran-WMF moved this task from Triage to Done on the DBA board.
ABran-WMF subscribed.

private data has been sanitized
view database has been created with the proper accounting

I think sre.wikireplicas.add-wiki needs to be executed by WMCS, see https://wikitech.wikimedia.org/wiki/Add_a_wiki#Maintain_views.

Ah indeed, my bad.

All done, ready for the views creation.

fnegri subscribed.

I will run the sre.wikireplicas.add-wiki cookbook

fnegri changed the task status from Open to In Progress. (Jun 26 2024, 12:54 PM)
fnegri triaged this task as High priority.

Mentioned in SAL (#wikimedia-operations) [2024-06-26T13:18:58Z] <fnegri@cumin1002> START - Cookbook sre.wikireplicas.add-wiki for database btmwiki (T368066)

Mentioned in SAL (#wikimedia-operations) [2024-06-26T13:31:55Z] <fnegri@cumin1002> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database btmwiki (T368066)

The cookbook completed with PASS, but there were some errors in the DNS creation:

2024-06-26T13:19:11Z root         ERROR   : Zone analytics.db.svc.wikimedia.cloud. does not exist.  Please create it and re-run.

I fixed the issue in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1049933 and I'm running the cookbook again.
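
Since the first run reported PASS with exit_code=0 while still logging an error, it can help to scan the cookbook log for ERROR-level lines rather than relying on the exit status alone. Below is a minimal sketch of such a check. It assumes log lines look like the single line quoted above (timestamp, logger name, level, colon, message); the function name `find_errors` and the regular expression are illustrative and not part of the cookbook tooling.

```python
import re

# Assumed line shape, inferred only from the one log line quoted in this task:
# "<ISO timestamp> <logger> <LEVEL> : <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<logger>\S+)\s+(?P<level>[A-Z]+)\s*:\s*(?P<msg>.*)$"
)


def find_errors(log_text):
    """Return (timestamp, message) pairs for every ERROR-level line."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            errors.append((match.group("ts"), match.group("msg")))
    return errors


# The error line from the first cookbook run, copied from this task.
sample = (
    "2024-06-26T13:19:11Z root         ERROR   : "
    "Zone analytics.db.svc.wikimedia.cloud. does not exist.  "
    "Please create it and re-run."
)
errors = find_errors(sample)
for ts, msg in errors:
    print(ts, msg)
```

A non-empty result would be a reason to inspect the run even when the cookbook itself reports PASS.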

Mentioned in SAL (#wikimedia-operations) [2024-06-26T13:35:45Z] <fnegri@cumin1002> START - Cookbook sre.wikireplicas.add-wiki for database btmwiki (T368066)

Mentioned in SAL (#wikimedia-operations) [2024-06-26T14:01:30Z] <fnegri@cumin1002> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database btmwiki (T368066)

The second run of the cookbook completed successfully.
Anything left to do in this task?
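
For reference, the two runs recorded in SAL above can be paired up to see their outcome and duration. This is a rough sketch that assumes the SAL entry format exactly as it appears in this task; `summarize_runs` and the pattern are hypothetical helpers, not an existing tool.

```python
import re
from datetime import datetime

# Pattern assumed from the SAL entries quoted in this task.
SAL_ENTRY = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+<(?P<who>[^>]+)>\s+"
    r"(?P<event>START|END \((?P<result>\w+)\))\s+-\s+"
    r"Cookbook (?P<cookbook>\S+)"
)


def summarize_runs(lines):
    """Pair each START with the next END; return (cookbook, result, seconds)."""
    runs = []
    started = None
    for line in lines:
        match = SAL_ENTRY.search(line)
        if not match:
            continue
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%SZ")
        if match.group("event") == "START":
            started = ts
        elif started is not None:
            seconds = int((ts - started).total_seconds())
            runs.append((match.group("cookbook"), match.group("result"), seconds))
            started = None
    return runs


sal_lines = [
    "[2024-06-26T13:18:58Z] <fnegri@cumin1002> START - Cookbook sre.wikireplicas.add-wiki for database btmwiki (T368066)",
    "[2024-06-26T13:31:55Z] <fnegri@cumin1002> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database btmwiki (T368066)",
    "[2024-06-26T13:35:45Z] <fnegri@cumin1002> START - Cookbook sre.wikireplicas.add-wiki for database btmwiki (T368066)",
    "[2024-06-26T14:01:30Z] <fnegri@cumin1002> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database btmwiki (T368066)",
]
runs = summarize_runs(sal_lines)
for cookbook, result, seconds in runs:
    print(f"{cookbook}: {result} in {seconds}s")
```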

Cookbook cookbooks.sre.wikireplicas.update-views run by btullis: Started updating wiki replica views

Cookbook cookbooks.sre.wikireplicas.update-views started by btullis executed with errors:

  • an-redacteddb1001.eqiad.wmnet (FAIL)
    • Ran Puppet agent
    • The maintain-views run failed, see OUTPUT of 'maintain-views ...' above for details

We experienced a failure relating to the sqooping of this new wiki into HDFS, so I'm investigating why this might be. Apologies for the noise; it was caused by my running the maintain-views cookbook with incompatible arguments.

We are still experiencing a failure relating to btmwiki at the beginning of each month.
It is something to do with the grants on an-redacteddb1001, I believe.
I have created T371991: Investigate MariaDB grant issues with the btmwiki database on an-redacteddb1001 to track the work to fix it.