|operations/puppet : production||gerrit: Switch db from mysql to H2|
On one hand I would love this, because it would make the gerrit codfw slave work, which is currently blocked due to the lack of a misc mysql cluster there (because of that, the DBA tag wasn't wrong to me, fwiw). If we didn't need mysql, that would be a nice solution for T176532.
But on the other hand I am afraid that this is not suitable for large installations with many users like ours. I checked the upstream docs and unfortunately they confirm those concerns.
Using the embedded H2 database is the easiest way to get a Gerrit site up and running, making it ideal for proof of concepts or small team servers. On the flip side, H2 is not the recommended option for large corporate installations. This is because there is no easy way to interact with the database while Gerrit is offline, it’s not easy to backup the data, and it’s not possible to set up H2 in a load balanced/hotswap configuration.
Those docs assume H2 is used for everything, by which I mean not using NoteDB. In 2.16 the database would hold a single table called "schema_version", which gerrit init uses when doing schema upgrades. This table is not used by the UI (as far as I know).
Reading notes from https://gitenterprise.me/2019/01/16/migrating-from-gerrit-2-15-to-2-16/
To convert, you set up a vanilla gerrit site (whether in a separate directory or on your own laptop).
Then just copy the h2 db over and switch to h2 in the db config.
I've just successfully switched https://gerrit.git.wmflabs.org/r/q/status:open from using mysql to h2 for that single table.
It's very easy to migrate: just set up a fake 2.16 site that uses h2 as the db, then answer y for NoteDB. Once you have gone through all of the setup, copy the ReviewDB* files from <gerrit_site>/db/ to /var/lib/gerrit2/review_site/db/, then merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/488093/
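For anyone who wants to script the copy step, here is a minimal sketch, not a tested migration tool. The function name `copy_reviewdb_files` and the example scratch path are made up for illustration; only the `ReviewDB*` pattern and the `db/` directories come from the steps above. It also assumes gerrit is stopped on the target host while the files are copied.

```python
import shutil
from pathlib import Path
from typing import List


def copy_reviewdb_files(scratch_site: Path, live_site: Path) -> List[Path]:
    """Copy every ReviewDB* file from <scratch_site>/db/ into <live_site>/db/.

    Hypothetical helper: assumes the scratch site was initialised with h2
    and that gerrit is not running against live_site during the copy.
    """
    src_dir = scratch_site / "db"
    dst_dir = live_site / "db"
    dst_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in sorted(src_dir.glob("ReviewDB*")):
        if src.is_file():
            dst = dst_dir / src.name
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
            copied.append(dst)

    # Fail loudly rather than leave the live site without a db.
    if not copied:
        raise FileNotFoundError(f"no ReviewDB* files found in {src_dir}")
    return copied


# Example call (the scratch path is an assumption, adjust to your setup):
# copy_reviewdb_files(Path("/tmp/fake_gerrit_site"),
#                     Path("/var/lib/gerrit2/review_site"))
```

File ownership is not handled here, so the copied files may still need to be chowned to whatever user gerrit runs as.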
The docs no longer say that. Also, that would be true if you switched to h2 under 2.14 or 2.15; under 2.16 it won’t be true, as there’s only one table. GerritHub switched to this and they have way more users + repos than us :)
Has the issue of data backups of the H2 db been resolved yet? A quick google seems to suggest you still need to take the DBs offline to back up successfully without data loss, which doesn't seem ideal…
Is the H2 clustering design suitable for our multi DC approach? http://www.h2database.com/html/advanced.html#clustering