maintain-dbusers doesn't close connections right on harvest-replicas
Closed, Resolved (Public)

Description

When running the "harvest-replicas" action, maintain-dbusers leaves its database connections open instead of closing them. Fix that.

Event Timeline

Andrew triaged this task as Medium priority. Dec 8 2020, 5:09 PM
Andrew moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.

Change 647285 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas and toolsdb: close connections when done with them

https://gerrit.wikimedia.org/r/647285

Change 647285 merged by Bstorm:
[operations/puppet@production] wikireplicas and toolsdb: close connections when done with them

https://gerrit.wikimedia.org/r/647285

Change 647419 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: close all connections

https://gerrit.wikimedia.org/r/647419

Change 647419 merged by Bstorm:
[operations/puppet@production] wikireplicas: close all connections

https://gerrit.wikimedia.org/r/647419

Change 649475 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: close the connection object for maintain-meta_p

https://gerrit.wikimedia.org/r/649475

Change 649475 merged by Bstorm:
[operations/puppet@production] wikireplicas: close the connection object for maintain-meta_p

https://gerrit.wikimedia.org/r/649475

That's all of the major scripts that hit the replicas.

Marostegui subscribed.

This still seems to be happening for labsdbadmin roughly every minute:

Jan 12 06:48:17 clouddb1015 mysqld[4551]: 2021-01-12  6:48:17 2106410 [Warning] Aborted connection 2106410 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
Jan 12 06:49:22 clouddb1015 mysqld[4551]: 2021-01-12  6:49:22 2106495 [Warning] Aborted connection 2106495 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
Jan 12 06:50:26 clouddb1015 mysqld[4551]: 2021-01-12  6:50:26 2106578 [Warning] Aborted connection 2106578 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
Jan 12 06:52:41 clouddb1015 mysqld[4551]: 2021-01-12  6:52:41 2106763 [Warning] Aborted connection 2106763 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
Jan 12 06:53:45 clouddb1015 mysqld[4551]: 2021-01-12  6:53:45 2106853 [Warning] Aborted connection 2106853 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
Jan 12 06:55:59 clouddb1015 mysqld[4551]: 2021-01-12  6:55:59 2107033 [Warning] Aborted connection 2107033 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
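
For anyone wanting to confirm the cadence from a log excerpt like the one above, here is a minimal sketch that pulls the server-side timestamps out of the "Aborted connection" warnings and prints the gap between them. The regex is written against the lines quoted in this task only, and `aborted_gaps` and `SAMPLE` are names made up for illustration; other MariaDB versions or log formats may need a different pattern.

```python
import re
from datetime import datetime

# Matches the warnings as quoted in this task. The server timestamp pads a
# single-digit hour with a space ("2021-01-12  6:48:17"), hence \s+ and \d{1,2}.
ABORTED = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}\s+\d{1,2}:\d{2}:\d{2}) \d+ \[Warning\] "
    r"Aborted connection \d+ to db: '(?P<db>[^']*)' "
    r"user: '(?P<user>[^']*)' host: '(?P<host>[^']*)'"
)


def aborted_gaps(lines, user):
    """Return the seconds between consecutive aborted-connection warnings for `user`."""
    stamps = []
    for line in lines:
        match = ABORTED.search(line)
        if match and match.group("user") == user:
            date, time = match.group("ts").split()
            stamps.append(datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


# First three lines of the excerpt above, copied as-is.
SAMPLE = [
    "Jan 12 06:48:17 clouddb1015 mysqld[4551]: 2021-01-12  6:48:17 2106410 [Warning] Aborted connection 2106410 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)",
    "Jan 12 06:49:22 clouddb1015 mysqld[4551]: 2021-01-12  6:49:22 2106495 [Warning] Aborted connection 2106495 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)",
    "Jan 12 06:50:26 clouddb1015 mysqld[4551]: 2021-01-12  6:50:26 2106578 [Warning] Aborted connection 2106578 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)",
]

if __name__ == "__main__":
    print(aborted_gaps(SAMPLE, "labsdbadmin"))  # [65, 64]
```

Gaps of about a minute line up with a job that starts every minute and drops one connection per run.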

That's interesting...
I'll check that out. The script does start up every minute, but it clearly shouldn't leave an aborted connection behind on each run.

Change 656215 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] maintain-dbusers: close the connections where they open

https://gerrit.wikimedia.org/r/656215

Change 656215 merged by Bstorm:
[operations/puppet@production] maintain-dbusers: close the connections where they open

https://gerrit.wikimedia.org/r/656215
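
As a rough illustration of "close the connections where they open" (not the actual patch, which is in the Gerrit change above), the general pattern is to open, use, and close each connection inside the same block so nothing is left for the server to time out. This sketch uses the standard library `sqlite3` module purely as a stand-in so it runs without a database server; `check_hosts`, `connect`, and the host names are hypothetical.

```python
import sqlite3
from contextlib import closing


def check_hosts(hosts, connect):
    """Run a trivial query on each host, closing every connection where it was opened.

    `connect` is any callable that takes a host name and returns a DB-API
    connection. It is a placeholder for whatever the real script uses.
    """
    results = {}
    for host in hosts:
        # closing() calls conn.close() when the block exits, even if the
        # query raises, so the connection never outlives this iteration.
        with closing(connect(host)) as conn:
            with closing(conn.cursor()) as cur:
                cur.execute("SELECT 1")
                results[host] = cur.fetchone()[0]
    return results


if __name__ == "__main__":
    # Stand-in factory: ignores the host and opens an in-memory SQLite database.
    print(check_hosts(["host-a", "host-b"], lambda host: sqlite3.connect(":memory:")))
```

`contextlib.closing` is used rather than `with conn:` directly because, for `sqlite3` at least, the connection's own context manager only commits or rolls back the transaction and does not close the connection.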

Thanks Brooke!
It looks like the warnings stopped after that merge. These are the last two entries, and their timestamps match the time of the merge:

Jan 14 20:11:47 clouddb1013 mysqld[7286]: 2021-01-14 20:11:47 2139535 [Warning] Aborted connection 2139535 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
Jan 14 20:14:02 clouddb1013 mysqld[7286]: 2021-01-14 20:14:02 2139711 [Warning] Aborted connection 2139711 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)

On a different host:

Jan 14 20:11:47 clouddb1015 mysqld[4551]: 2021-01-14 20:11:47 2391537 [Warning] Aborted connection 2391537 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)
Jan 14 20:14:02 clouddb1015 mysqld[4551]: 2021-01-14 20:14:02 2391713 [Warning] Aborted connection 2391713 to db: 'unconnected' user: 'labsdbadmin' host: '10.64.37.19' (Got an error reading communication packets)

Resolving this - thanks much.