To better support the diversity of use cases, we decided on a new layout for the Puppet code that supports Ceph RBD clients.
|Status|Assignee|Task|
|Resolved|Bstorm|T216208 ToolsDB overload and cleanup|
|Resolved|Bstorm|T216441 Evaluate transferring the non-replicated tables to the new toolsdb server|
|Resolved|fnegri|T236101 Find a way to remove non-replicated tables from ToolsDB|
|Resolved|dcaro|T301951 toolsdb: full disk on clouddb1001 broke clouddb1002 (secondary) replication|
|Open|None|T301967 toolsdb: evaluate storage usage by some tools|
|Open|fnegri|T291782 Trove for some ToolsDB users|
|Open|None|T272395 Cloud: reduce NAT exceptions from cloud to production|
|Resolved|Andrew|T291405 [NFS] Reduce or eliminate bare-metal NFS servers|
|Resolved|Andrew|T292546 cloud NFS: figure out backups for cinder volumes|
|Resolved|aborrero|T293752 cloud ceph: refactor rbd client puppet profiles|
Mentioned In:
- rLPRId7c65f621172: hiera: ceph: add dummy caps for mgr auth entries
- rLPRI2863d679d944: hiera: ceph: add mgr keyrings placeholders
- rLPRIf42b60479b78: ceph: auth: introduce keydata for mon.xxxx entries
- rLPRI5ffc7c4acd0b: hieradata: ceph: refresh bootstrap auth
- rLPRI575b28cc5f26: hieradata: codfw: ceph: add dummy keydata for radosgw
- rLPRId5d21b4a1de6: hieradata: ceph: auth: add dummy keydata for the admin client
- rLPRIf33f1fff217c: hieradata: ceph: auth: add dummy keydata for the admin client
- T292546: cloud NFS: figure out backups for cinder volumes
After two months, a deep rabbit hole to climb out of, some friction within the WMCS team, more than 50 patches, and a severe sidetrack in the WMCS roadmap, I think we can consider this refactor completed.
On the other hand, we are (or at least I am) wiser now: we gained a lot of knowledge about how Ceph works, picked up a couple of interesting code patterns in Puppet, and, in general, ended up with a more robust Ceph abstraction. Worth it :-)
I'm pretty sure there are things that can still be improved, but let's do them in a future iteration. Also, the day we bootstrap a Ceph cluster from scratch again, we may or may not discover a few bugs or race conditions related to how keyrings are deployed and how early Ceph needs them in the bootstrap process.
For the future: figure out a way to clean up old, unused keyrings from Ceph's internal auth database.
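As a starting point for that cleanup, here is a minimal sketch (not an agreed procedure): it assumes the set of keyrings Puppet is expected to manage can be exported to a file with one entity name per line (the expected_keyrings.txt path and the grep pattern below are hypothetical), compares that against what the cluster reports via `ceph auth ls`, and prints the leftovers that `ceph auth del` could then remove after review.

```
#!/bin/bash
# Dump the entity names currently present in the cluster auth database.
# "ceph auth ls" prints each entity (client.*, osd.*, mgr.*, mds.*) on its own
# unindented line, followed by indented key/caps lines that the grep skips.
ceph auth ls | grep -E '^(client|osd|mgr|mds)\.' | sort > /tmp/current_keyrings.txt

# expected_keyrings.txt is a hypothetical file with one entity name per line,
# generated from the hiera/Puppet data that declares which keyrings we manage.
sort /tmp/expected_keyrings.txt -o /tmp/expected_keyrings.txt

# Entities known to the cluster but absent from the expected set are candidates
# for removal. Review the list before uncommenting the destructive command.
comm -23 /tmp/current_keyrings.txt /tmp/expected_keyrings.txt | while read -r entity; do
    echo "unused auth entry: ${entity}"
    # ceph auth del "${entity}"
done
```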