Run maintain-views to create new ORES tables
Closed, ResolvedPublic

Description

From the above initial list the extension is already enabled in the following wikis:

simplewiki
trwiki

And we have also enabled it in idwiki.
I have created the tables (ores_classification and ores_model) for the remaining wikis.
These are the remaining wikis; creating the tables precedes the extension installation:

cywiki
bewiki
kkwiki
nnwiki
mkwiki
lawiki
afwiki
tewiki
mrwiki
swwiki
mlwiki
iswiki
pawiki
hawiki
tlwiki
bnwiki
azwiki

Could someone run the maintain-views script for the above? Thank you 🙏
cc: @Ladsgroup @taavi

Event Timeline

Restricted Application added subscribers: Nemoralis, Aklapper.

From the parent task:

I ran it with multiple db options and only lawiki was run. So the cookbook must be either run one by one (or xargs) or the old fashioned way. I leave that to WMCS to handle.

I think we can try running with --all-databases, though sometimes it hangs due to table locks. See for example T375751#10278291.

We only need to create missing views here (instead of re-creating existing views) so getting locks should not be a problem.

True that. I'm also double-checking why specifying multiple dbs did not work. The option's help text says:

group.add_argument(
    "--databases",
    help=(
        "Specify database(s) to work on, instead of all. Multiple"
        " values can be given space-separated."
    ),
    nargs="+",
)

fnegri changed the task status from Open to In Progress. May 23 2025, 12:53 PM
fnegri claimed this task.

Looks like the "multiple database" list is supported by maintain-views.py, but not by the update-views cookbook.
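
For reference, a minimal sketch of how argparse treats an option declared with nargs="+", as in the snippet quoted above. The parser below is a stand-in and not the real maintain-views code: the mutually exclusive group, the definition of --all-databases, and the helper names are assumptions made for illustration. Only the --databases definition mirrors the quoted snippet.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser; only --databases mirrors the snippet quoted above."""
    parser = argparse.ArgumentParser(prog="maintain-views-sketch")
    # Assumption: the two options live in a mutually exclusive group.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument(
        "--all-databases",
        action="store_true",
        help="Work on every database (assumed definition).",
    )
    group.add_argument(
        "--databases",
        help=(
            "Specify database(s) to work on, instead of all. Multiple"
            " values can be given space-separated."
        ),
        nargs="+",
    )
    return parser


def parse_databases(argv: list[str]) -> list[str]:
    """Return the database names argparse collects for --databases."""
    args = build_parser().parse_args(argv)
    # args.databases is None when --all-databases was given instead.
    return args.databases or []


if __name__ == "__main__":
    print(parse_databases(["--databases", "cywiki", "bewiki", "kkwiki"]))
```

With a definition like this, every space-separated name after --databases lands in one list, so the script side accepts several wikis in a single invocation.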

I'm gonna go with:

fnegri@cumin1002:~$ sudo cookbook sre.wikireplicas.update-views --clean -t T395122

Cookbook cookbooks.sre.wikireplicas.update-views run by fnegri: Started updating wiki replica views

The cookbook crashed unexpectedly, and I somehow also lost my tmux session. There is nothing in the spicerack logs; these are the last lines:

2025-05-23 13:38:36,098 fnegri 1213905 [INFO wmflib.actions:126 in _action] Ran Puppet agent
2025-05-23 13:38:36,099 fnegri 1213905 [DEBUG spicerack.remote:750 in _execute] Executing commands ['maintain-views --replace-all --auto-depool --clean --all-databases'] on 1 hosts: clouddb1017.eqiad.wmnet
2025-05-23 13:38:36,102 fnegri 1213905 [INFO cumin.transports.clustershell.ClusterShellWorker:78 in execute] Executing commands [cumin.transports.Command('maintain-views --replace-all --auto-depool --clean --all-databases')] on '1' hosts: clouddb1017.eqiad.wmnet
2025-05-23 13:38:36,108 fnegri 1213905 [DEBUG cumin.transports.clustershell.SyncEventHandler:590 in ev_pickup] node=clouddb1017.eqiad.wmnet, command='maintain-views --replace-all --auto-depool --clean --all-databases'

I will retry.

Cookbook cookbooks.sre.wikireplicas.update-views run by fnegri: Started updating wiki replica views

Cookbook cookbooks.sre.wikireplicas.update-views started by fnegri executed with errors:

  • an-redacteddb1001.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --clean --all-databases'
  • clouddb1017.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --clean --all-databases'
  • clouddb1018.eqiad.wmnet (FAIL)
    • Ran Puppet agent
    • The maintain-views run failed, see OUTPUT of 'maintain-views ...' above for details

Failed again, but with an error message:

pymysql.err.OperationalError: (1205, 'Lock wait timeout exceeded; try restarting transaction')
================
PASS |                                                                                                                               |   0% (0/1) [02:16<?, ?hosts/s]
FAIL |███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100% (1/1) [02:16<00:00, 136.78s/hosts]
100.0% (1/1) of nodes failed to execute command 'maintain-views -... --all-databases': clouddb1018.eqiad.wmnet
0.0% (0/1) success ratio (< 100.0% threshold) for command: 'maintain-views -... --all-databases'. Aborting.

The views were created on an-redacteddb1001 and clouddb1017, but the run failed on clouddb1018. Looking further at the logs, it failed while creating cswiki_p.logging:

2025-05-23 14:17:20,978 INFO [cswiki_p.logging]
2025-05-23 14:18:20,979 WARNING Depooling s2 and retrying
2025-05-23 14:19:36,114 INFO Re-pooling section s2
Traceback (most recent call last):
  File "/usr/local/sbin/maintain-views", line 142, in write_execute
    self.cursor.execute(query)

The automatic depooling was not enough. Perhaps we could try sleeping more than 15 seconds after depooling, but that's unrelated to this task, and can be discussed in T300427: Automate maintain-views replica depooling.
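
To make the "sleep longer after depooling" idea concrete, here is a rough sketch of a depool-and-retry wrapper with a configurable wait. It is not the maintain-views implementation: the function name, the callables, the stand-in exception class, and the 60-second default are all hypothetical. Only the error code 1205 comes from the traceback above.

```python
import time
from typing import Callable

LOCK_WAIT_TIMEOUT = 1205  # error code from the pymysql traceback above


class LockWaitTimeout(Exception):
    """Hypothetical stand-in for a driver error carrying code 1205."""

    def __init__(self, code: int = LOCK_WAIT_TIMEOUT) -> None:
        super().__init__(code)
        self.code = code


def execute_with_depool(
    execute: Callable[[], None],
    depool: Callable[[], None],
    repool: Callable[[], None],
    settle_seconds: float = 60.0,  # arbitrary value, longer than 15 seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run execute(); on a lock wait timeout, depool, wait, and retry once.

    Returns True if a depool was needed, False otherwise. A second
    failure propagates to the caller after the section is repooled.
    """
    try:
        execute()
        return False
    except LockWaitTimeout:
        pass  # fall through to the depool-and-retry path

    depool()
    try:
        # Give in-flight queries time to drain before retrying.
        sleep(settle_seconds)
        execute()
    finally:
        # Always repool, even if the retry fails.
        repool()
    return True
```

Passing sleep as a parameter keeps the wait easy to tune and easy to stub out when testing. Whether a longer wait is actually enough for a busy table like logging is exactly what T300427 would need to establish.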

I will continue this work on Monday, running maintain-views manually on clouddb1018 and on the remaining hosts, to have more control over what is happening.

You can specify the table instead, ores_model and ores_classification. It means you have to run it twice but that should do the trick as those tables are clearly not being read/written right now.

Yes good point. I would still like to do a full run to make sure everything is in sync with the yaml file. I'll post an update later today.
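
The per-table approach amounts to one maintain-views invocation per table. A small sketch of building those command lines, assuming the same flags that appear in the cookbook output logged in this task. The helper name and constants are hypothetical, and how the cookbook itself assembles the command is not shown here.

```python
import shlex
from typing import Iterable, List

# Flags as they appear in the cookbook output in this task.
BASE_FLAGS = ["--replace-all", "--auto-depool", "--all-databases"]


def per_table_commands(tables: Iterable[str]) -> List[str]:
    """Build one maintain-views command line per table name."""
    return [
        shlex.join(["maintain-views", *BASE_FLAGS, "--table", table])
        for table in tables
    ]


if __name__ == "__main__":
    for command in per_table_commands(["ores_model", "ores_classification"]):
        print(command)
```

Each command touches a single table across all databases, which avoids waiting on locks held on unrelated, busier tables.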

fnegri triaged this task as Medium priority. May 26 2025, 11:06 AM

Cookbook cookbooks.sre.wikireplicas.update-views run by fnegri: Started updating wiki replica views

Cookbook cookbooks.sre.wikireplicas.update-views started by fnegri completed:

  • an-redacteddb1001.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1017.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1018.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1019.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1020.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1013.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1014.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1015.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'
  • clouddb1016.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_model'

Cookbook cookbooks.sre.wikireplicas.update-views run by fnegri: Started updating wiki replica views

Cookbook cookbooks.sre.wikireplicas.update-views started by fnegri completed:

  • an-redacteddb1001.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1017.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1018.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1019.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1020.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1013.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1014.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1015.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
  • clouddb1016.eqiad.wmnet (PASS)
    • Ran Puppet agent
    • Ran 'maintain-views --replace-all --auto-depool --all-databases --table ores_classification'
fnegri moved this task from In progress to Done on the cloud-services-team (FY2024/2025-Q3-Q4) board.

Using --table worked quite fast; this task is Resolved.