from an alert:
https://alerts.wikimedia.org/?q=team%3Dwmcs
Check unit status of backup_vms summary: CRITICAL: Status of the systemd unit backup_vms 2 hours ago instance: cloudvirt1021 source: icinga team: wmcs @receiver: irc-spam
from an alert:
https://alerts.wikimedia.org/?q=team%3Dwmcs
Check unit status of backup_vms summary: CRITICAL: Status of the systemd unit backup_vms 2 hours ago instance: cloudvirt1021 source: icinga team: wmcs @receiver: irc-spam
It seems there's some mess with the backy parameter order or the database:
Mar 20 20:00:05 cloudvirt1021 systemd[1]: Started Backup vms assigned to this host. Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3528010]: ERROR: [backy2.logging] Couldn't parse datetime string: 'b0094916-a7bf-11ec-95b0-dd68c0a50dd7' Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3528010]: ERROR: [backy2.logging] Backy failed. Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: Traceback (most recent call last): Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: File "/usr/local/sbin/wmcs-backup", line 2037, in <module> Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: args.func() Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: File "/usr/local/sbin/wmcs-backup", line 1870, in <lambda> Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: func=lambda: get_current_instances_state( Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: File "/usr/local/sbin/wmcs-backup", line 1600, in get_current_instances_state Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: backup_entries = get_backups() Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: File "/usr/lib/python3/dist-packages/rbd2backy2.py", line 600, in get_backups Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: backup = subprocess.check_output([BACKY, "-ms", "ls"]) Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: File "/usr/lib/python3.9/subprocess.py", line 424, in check_output Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: File "/usr/lib/python3.9/subprocess.py", line 528, in run Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: raise CalledProcessError(retcode, process.args, Mar 20 20:02:02 cloudvirt1021 wmcs-backup[3527381]: subprocess.CalledProcessError: Command '['/usr/bin/backy2', '-ms', 'ls']' returned non-zero exit status 100.
It's related to the database for the backups, backy2 by itself already fails when trying to do an ls:
root@cloudvirt1021:~# backy2 -d -v -ms ls
ERROR: [backy2.logging] Unexpected exception
ERROR: [backy2.logging] Couldn't parse datetime string: 'b0094916-a7bf-11ec-95b0-dd68c0a50dd7'
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/backy2/scripts/backy.py", line 769, in main
func(**func_args)
File "/usr/lib/python3/dist-packages/backy2/scripts/backy.py", line 139, in ls
versions = backy.ls()
File "/usr/lib/python3/dist-packages/backy2/backy.py", line 74, in ls
versions = self.meta_backend.get_versions()
File "/usr/lib/python3/dist-packages/backy2/meta_backends/sql.py", line 397, in get_versions
return self.session.query(Version).order_by(Version.name, Version.date).all()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 3373, in all
return list(self)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/loading.py", line 100, in instances
cursor.close()
File "/usr/lib/python3/dist-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/loading.py", line 80, in instances
rows = [proc(row) for row in fetch]
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/loading.py", line 80, in <listcomp>
rows = [proc(row) for row in fetch]
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/loading.py", line 579, in _instance
_populate_full(
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/loading.py", line 725, in _populate_full
dict_[key] = getter(row)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/result.py", line 107, in __getitem__
return processor(self._row[index])
File "/usr/lib/python3/dist-packages/sqlalchemy/processors.py", line 44, in process
raise ValueError(
ValueError: Couldn't parse datetime string: 'b0094916-a7bf-11ec-95b0-dd68c0a50dd7'
ERROR: [backy2.logging] Backy failed.I think that with the upgrade, we might have to do some sqlalchemy migrations that we have not done...
looking
#
So in order to migrate from version to version of backy if you use sqlite, you have to before upgrading export the backups, remove the database, and then after upgrading import them:
Tried opening the same backy db with an older version (buster):
dcaro@cloudbackup2002:~$ backy2 --version 2.13.2
But still got error:
dcaro@cloudbackup2002:~$ backy2 -c backy.cfg ls
INFO: [backy2.logging] $ /usr/bin/backy2 -c backy.cfg ls
ERROR: [backy2.logging] Couldn't parse datetime string: 'b0094916-a7bf-11ec-95b0-dd68c0a50dd7'
ERROR: [backy2.logging] Backy failed.So not sure anymore what version we had.
This is interesting though:
sqlite> select * from versions where date='b0094916-a7bf-11ec-95b0-dd68c0a50dd7'; |b0094916-a7bf-11ec-95b0-dd68c0a50dd7|4|2022-03-19 20:03:47.708378||4194304|1|0| ...
So it seems that the versions table has some corrupted data somehow.
Yep, just the DB was corrupted, to check:
root@cloudvirt1021:~# sqlite3 /srv/backy/backy.sqlite SQLite version 3.34.1 2021-01-20 14:10:07 Enter ".help" for usage hints. sqlite> pragma integrity_check; *** in database main *** On tree page 71625 cell 219: Rowid 861598 out of order On tree page 4 cell 0: 2nd reference to page 71847 row 304958 missing from index ix_blocks_checksum ...
To fix did the following:
root@cloudvirt1021:~# sqlite3 /srv/backy/backy.sqlite sqlite> .output backup.db sqlite> .dump sqlite> .quit # the backy.fixed.sqlite file did not exist yet, sqlite3 creates it root@cloudvirt1021:~# sqlite3 backy.fixed.sqlite SQLite version 3.34.1 2021-01-20 14:10:07 Enter ".help" for usage hints. sqlite> .read backup.db sqlite> .quit # restore the DB root@cloudvirt1021:~# cp backy.fixed.sqlite /srv/backy2/backy.sqlite
That did it, will add to the runbook
Change 772839 had a related patch set uploaded (by David Caro; author: David Caro):
[operations/puppet@production] wmcs.backy2: add link to the runbook for backup_vms
Change 772839 merged by David Caro:
[operations/puppet@production] wmcs.backy2: add link to the runbook for backup_vms
Change 775961 had a related patch set uploaded (by Nskaggs; author: Nskaggs):
[operations/puppet@production] wmcs.backy2: fix typo in link to runbook for backup_vms
Change 775961 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] wmcs.backy2: fix typo in link to runbook for backup_vms