
Convert Gerrit to use H2 as the database
Open, Stalled, Medium, Public

Description

As of Gerrit 2.16, all data has been migrated to NoteDB. Gerrit still requires a database connection, however, for a single table, "schema_version", which contains a single row.

This requires us to do T200739 first.

This task will fix T176532.

Details

Related Gerrit Patches:
operations/puppet : production : gerrit: Switch db from mysql to H2


Event Timeline

Paladox created this task. Dec 4 2018, 6:39 PM
Restricted Application added a subscriber: Aklapper. Dec 4 2018, 6:39 PM
Paladox renamed this task from Convert Gerrit's to use H2 as the database after 2.16 upgrade to Convert Gerrit to use H2 as the database after 2.16 upgrade. Dec 4 2018, 6:41 PM
Paladox updated the task description.
Dzahn added a subscriber: Dzahn. Dec 4 2018, 6:51 PM
Dzahn added a comment. Edited Dec 4 2018, 7:05 PM

On one hand, I would love this because it would make the Gerrit codfw slave work, which is blocked by the lack of a misc MySQL cluster there (because of that, the DBA tag wasn't wrong to me, fwiw). If we didn't need MySQL, that would be a nice solution for T176532.

But on the other hand, I am afraid that this is not suitable for large installations with many users like ours. I checked the upstream docs and unfortunately they confirm those concerns.

Using the embedded H2 database is the easiest way to get a Gerrit site up and running, making it ideal for proof of concepts or small team servers. On the flip side, H2 is not the recommended option for large corporate installations. This is because there is no easy way to interact with the database while Gerrit is offline, it’s not easy to backup the data, and it’s not possible to set up H2 in a load balanced/hotswap configuration.

https://gerrit-review.googlesource.com/Documentation/database-setup.html

Those docs assume H2 is used fully, that is, without NoteDB. In 2.16 it would hold a single table called "schema_version", which gerrit init uses when doing schema upgrades. This table is not used by the UI (as far as I know).

Dzahn triaged this task as Medium priority. Dec 5 2018, 5:15 PM

Reading notes from https://gitenterprise.me/2019/01/16/migrating-from-gerrit-2-15-to-2-16/

To convert, you set up a vanilla Gerrit site (whether in a separate directory or on your own laptop) and copy its db.

Then just copy the h2 db and switch to h2 in the db config.
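A minimal sketch of the "switch to h2 in the db config" step, assuming the site's gerrit.config has a [database] section with MySQL settings. The key names and the value db/ReviewDB are assumptions based on the usual Gerrit 2.x layout, and the helper name switch_database_to_h2 is hypothetical. Check it against the real config before relying on it.

```python
# Assumed H2 settings for a Gerrit 2.x site; adjust to the real layout.
H2_SECTION_LINES = ["[database]", "\ttype = h2", "\tdatabase = db/ReviewDB"]


def switch_database_to_h2(config_text):
    """Return config_text with its [database] section replaced by H2 settings."""
    out = []
    in_database = False
    replaced = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new section header starts; check whether it is [database].
            in_database = stripped.lower().startswith("[database]")
            if in_database:
                out.extend(H2_SECTION_LINES)
                replaced = True
                continue
        if in_database:
            # Drop the old settings (type, hostname, username, ...).
            continue
        out.append(line)
    if not replaced:
        # No [database] section existed, so append one.
        out.extend(H2_SECTION_LINES)
    return "\n".join(out) + "\n"
```

The function only transforms text. Reading and writing the actual file (and restarting Gerrit) is left to whatever manages the config, which in this task is puppet.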

Mentioned in SAL (#wikimedia-cloud) [2019-02-05T17:35:00Z] <paladox> temporarily disabling puppet on gerrit-test3 whilst trying T211139

I've just successfully switched https://gerrit.git.wmflabs.org/r/q/status:open from using mysql to h2 for that single table.

It's very easy to migrate: set up a throwaway 2.16 site that uses H2 as the db, then answer y for NoteDB. Once you have gone through all of the setup, copy the ReviewDB* files from <gerrit_site>/db/ to /var/lib/gerrit2/review_site/db/, then merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/488093/
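The copy step above can be sketched as follows. This is only an illustration: the helper name copy_reviewdb_files is hypothetical, and it assumes Gerrit is stopped on the destination while the files are copied.

```python
import shutil
from pathlib import Path


def copy_reviewdb_files(scratch_site, live_site):
    """Copy ReviewDB* files from scratch_site/db to live_site/db.

    Returns the sorted list of copied file names.
    """
    src = Path(scratch_site) / "db"
    dest = Path(live_site) / "db"
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob("ReviewDB*")):
        if path.is_file():
            # copy2 also preserves timestamps and permission bits.
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

For the paths mentioned above, the call would be copy_reviewdb_files("<gerrit_site>", "/var/lib/gerrit2/review_site"), with <gerrit_site> replaced by the throwaway site's directory.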

Change 488093 had a related patch set uploaded (by Paladox; owner: Paladox):
[operations/puppet@production] gerrit: Switch db from mysql to H2

https://gerrit.wikimedia.org/r/488093

Paladox raised the priority of this task from Medium to High. Mar 19 2019, 6:21 PM
Paladox renamed this task from Convert Gerrit to use H2 as the database after 2.16 upgrade to Convert Gerrit to use H2 as the database. Mar 19 2019, 6:23 PM
Paladox added a project: serviceops.

say something random to make wikibugs rejoin a channel

fsero moved this task from Backlog to Incoming on the serviceops board. Jun 20 2019, 2:23 PM
Joe moved this task from Incoming to Externally Blocked on the serviceops board. Jun 21 2019, 8:48 AM

Change 488093 had a related patch set uploaded (by Paladox; owner: Paladox):
[operations/puppet@production] gerrit: Switch db from mysql to H2

https://gerrit.wikimedia.org/r/488093

jcrespo changed the task status from Open to Stalled. Edited Dec 24 2019, 9:26 AM
jcrespo lowered the priority of this task from High to Medium.
jcrespo added a subscriber: jcrespo.

This seems to be stalled due to concerns raised at T211139#4798560, to be revisited later.

On one hand, I would love this because it would make the Gerrit codfw slave work, which is blocked by the lack of a misc MySQL cluster there (because of that, the DBA tag wasn't wrong to me, fwiw). If we didn't need MySQL, that would be a nice solution for T176532.
But on the other hand, I am afraid that this is not suitable for large installations with many users like ours. I checked the upstream docs and unfortunately they confirm those concerns.

Using the embedded H2 database is the easiest way to get a Gerrit site up and running, making it ideal for proof of concepts or small team servers. On the flip side, H2 is not the recommended option for large corporate installations. This is because there is no easy way to interact with the database while Gerrit is offline, it’s not easy to backup the data, and it’s not possible to set up H2 in a load balanced/hotswap configuration.

https://gerrit-review.googlesource.com/Documentation/database-setup.html

The docs no longer say that. Also, that would be true if you had switched to H2 under 2.14 or 2.15; under 2.16 it won’t be, as there’s only one table. GerritHub switched to this and they have way more users + repos than us :)

Peachey88 added a subscriber: Peachey88. Edited Dec 25 2019, 7:29 AM

Has the issue of data backups of the H2 db been resolved yet? A quick Google search seems to suggest you still need to take the DBs offline to back up successfully without data loss, which doesn't seem ideal…

Is the H2 clustering design suitable for our multi DC approach? http://www.h2database.com/html/advanced.html#clustering