It seems like running any PostgreSQL command on cloudbackup2001 just gets stuck. This is also blocking Puppet runs.
Description
Event Timeline
Not sure if related, but the host is also almost out of disk space:
/dev/mapper/backup-cinder--backups 80T 75T 1.5T 99% /srv/cinder-backups
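The 99% figure may look odd next to 75T used out of 80T. As far as I understand GNU df, the Use% column is computed as used / (used + available) and rounded up, so space not available to ordinary users (for example reserved blocks) is left out of the denominator. A minimal sketch of that arithmetic, assuming the rounded human-readable values from the line above; `df_use_percent` is a hypothetical helper name, not part of any tool:

```python
import math


def df_use_percent(used: float, available: float) -> int:
    """Approximate df's Use% column: used / (used + available), rounded up.

    Assumption: this mirrors GNU df's behaviour. The inputs here are the
    rounded values from the output above, so the result is only indicative.
    """
    usable_total = used + available
    if usable_total <= 0:
        raise ValueError("used + available must be positive")
    return math.ceil(100 * used / usable_total)


# Values in terabytes, taken from the df line above (75T used, 1.5T available).
print(df_use_percent(75, 1.5))
```

With these inputs the usable total is 76.5T rather than 80T, which is why the reported percentage is higher than 75/80 would suggest.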
the postgres log is full of:
2023-10-09 01:44:03 GMT LOG: using stale statistics instead of current ones because stats collector is not responding
2023-10-09 01:44:13 GMT LOG: using stale statistics instead of current ones because stats collector is not responding
2023-10-09 01:44:23 GMT LOG: using stale statistics instead of current ones because stats collector is not responding
2023-10-09 01:44:34 GMT LOG: using stale statistics instead of current ones because stats collector is not responding
2023-10-09 01:44:44 GMT LOG: using stale statistics instead of current ones because stats collector is not responding
2023-10-09 01:44:54 GMT LOG: using stale statistics instead of current ones because stats collector is not responding
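To get a feel for how often this message repeats, something like the following could be run over the log. This is only a sketch: it assumes one entry per line in the `timestamp GMT LOG: message` shape quoted above, and `stale_stats_timestamps` is a hypothetical helper, not an existing tool.

```python
import re
from datetime import datetime

# Message text copied from the log excerpt above.
STALE_MSG = (
    "using stale statistics instead of current ones "
    "because stats collector is not responding"
)
# Assumed line shape: "YYYY-MM-DD HH:MM:SS GMT LOG: <message>".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) GMT LOG:\s+(.*)$")


def stale_stats_timestamps(lines):
    """Return the timestamps of every stale-statistics log entry."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2).strip() == STALE_MSG:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return stamps


# The first three entries from the excerpt above.
sample = [
    "2023-10-09 01:44:03 GMT LOG: " + STALE_MSG,
    "2023-10-09 01:44:13 GMT LOG: " + STALE_MSG,
    "2023-10-09 01:44:23 GMT LOG: " + STALE_MSG,
]
stamps = stale_stats_timestamps(sample)
# Seconds between consecutive entries.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]
print(len(stamps), gaps)
```

In the excerpt the entries arrive roughly every 10 seconds, which suggests the message is being logged continuously rather than in occasional bursts.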
Mentioned in SAL (#wikimedia-cloud) [2023-10-09T07:35:14Z] <taavi> restart postgresql on cloudbackup2001 T348431
I restarted Postgres. According to strace, it's clearly working through an 80G pgsql_tmp directory, but that's taking a while. I'll come back to it later.
It seems like Postgres is back up. Andrew also did something to the data directory and now it's about halfway full instead of 99% full.