
Investigate Gerrit h2 cache being way too large
Open, Needs Triage, Public

Description

Gerrit had its / partition fill up when we upgraded from 3.4 to 3.5, which was caused by the reindexing of changes filling up the file diff caches (T323262).

Gerrit stores cache data in H2 database files and two of them are overinflated on disk:

/var/lib/gerrit2/review_site/cache/ | Size (MB)
------------------------------------+----------
git_file_diff.h2.db                 |      8376
gerrit_file_diff.h2.db              |     11597

gerrit show-caches displays smaller usage (around 150M if I get it right):

  Name                          |Entries              |  AvgGet |Hit Ratio|
                                |   Mem   Disk   Space|         |Mem  Disk|
--------------------------------+---------------------+---------+---------+
D gerrit_file_diff              | 24562 150654 157.36m|  14.9ms | 72%  44%|
D git_file_diff                 | 12998 143329 158.06m|  14.8ms |  3%  14%|
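
To put the gap in numbers, here is a quick sketch that divides the on-disk size by the space reported by show-caches. The figures are copied from the two tables above; the helper name ratio is made up for illustration.

```shell
#!/bin/sh
# ratio ON_DISK_MB REPORTED_MB
# Prints how many times larger the file on disk is than the reported space.
ratio() {
    awk -v disk="$1" -v reported="$2" 'BEGIN { printf "%.0fx\n", disk / reported }'
}

ratio 8376 158.06     # git_file_diff:    prints 53x
ratio 11597 157.36    # gerrit_file_diff: prints 74x
```

Both files are roughly 50 to 75 times larger than the data the cache claims to hold, which suggests most of the space is unused pages rather than live entries.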

I would like to inspect the database to figure out what kind of data it holds and whether some garbage collection can shrink the h2.db files.

Gerrit uses an old version of H2, com.h2database:h2:1.3.176. It connects, as I understand it, without any username or password using:

java/com/google/gerrit/server/cache/h2/H2CacheImpl.java
this.conn = org.h2.Driver.load().connect(url, null);

I have made a compressed copy of one of the caches on gerrit1001.wikimedia.org at /home/hashar/git_file_diff.h2.db.gz, which is only 32 MBytes.

I have tried to access the database locally but I am hitting a wall:

$ /usr/lib/jvm/java-8-openjdk-amd64/bin/java -cp h2-1.3.176-ijar.jar org.h2.tools.Recover
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.ClassFormatError: Absent Code attribute in method that is not native or abstract in class file org/h2/tools/Recover

With a more recent version of H2 (/usr/lib/jvm/java-8-openjdk-amd64/bin/java -cp h2-2.1.214.jar org.h2.tools.Recover), the result is:

git_file_diff.h2.db.h2.sql
-- MVStore
CREATE ALIAS IF NOT EXISTS READ_BLOB_MAP FOR 'org.h2.tools.Recover.readBlobMap';
CREATE ALIAS IF NOT EXISTS READ_CLOB_MAP FOR 'org.h2.tools.Recover.readClobMap';
-- LOB
CREATE TABLE IF NOT EXISTS INFORMATION_SCHEMA.LOB_BLOCKS(LOB_ID BIGINT, SEQ INT, DATA VARBINARY, PRIMARY KEY(LOB_ID, SEQ));
-- lobMap.size: 0
-- lobData.size: 0
-- Layout
-- chunk.1 = chunk:1,block:2,len:1,liveMax:380,livePages:2,map:9,max:410,next:3,pages:4,root:400000f746,time:1a,unusedAtVersion:1,version:1,toc:427,occupancy:0a
-- meta.id = 1
-- root.1 = 4000008d50
-- root.2 = 4000002a8e
-- root.5 = 8000002ac6
-- Meta
-- map.2 = name:_
-- map.3 = name:openTransactions
-- map.4 = name:undoLog.1
-- map.5 = name:table.0,key:8fa25204,val:5803b3f1
-- map.6 = name:lobMap,key:8fa25204,val:f4470498
-- map.7 = name:tempLobMap,key:8fa25204,val:59a6a071
-- map.8 = name:lobRef,key:eabe0274,val:436a4e4b
-- map.9 = name:lobData,key:8fa25204,val:59a6a071
-- name._ = 2
-- name.lobData = 9
-- name.lobMap = 6
-- name.lobRef = 8
-- name.openTransactions = 3
-- name.table.0 = 5
-- name.tempLobMap = 7
-- name.undoLog.1 = 4
-- Types
-- 436a4e4b = org.h2.mvstore.db.NullValueDataType@574caa3f
-- 5803b3f1 = org.h2.mvstore.tx.VersionedValueType@5803b3f1
-- 59a6a071 = org.h2.mvstore.type.ByteArrayDataType@59a6a071
-- 8fa25204 = org.h2.mvstore.type.LongDataType@8fa25204
-- eabe0274 = org.h2.mvstore.db.LobStorageMap$BlobReference$Type@eabe0274
-- f4470498 = org.h2.mvstore.db.LobStorageMap$BlobMeta$Type@f4470498
-- Tables
---- Schema SET ----
SET CREATE_BUILD 214;
---- Table Data ----
---- Schema ----
CREATE USER IF NOT EXISTS "" SALT '' HASH '' ADMIN;
DROP ALIAS READ_BLOB_MAP;
DROP ALIAS READ_CLOB_MAP;
DROP TABLE IF EXISTS INFORMATION_SCHEMA.LOB_BLOCKS;

Essentially there is no data found :-\

Event Timeline

I did a few wrong things:

  • Using h2-1.3.176-ijar.jar, which I think only has the interfaces and not the actual code.
  • The JDBC URL jdbc:h2:git_file_diff.h2.db creates a database file git_file_diff.h2.db.h2.db which is, well, empty. I had to strip the .h2.db suffix.
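
The second mistake comes from H2 1.3 appending .h2.db to the database name given in the URL. A minimal sketch of deriving the URL from an existing file name follows; the helper name h2_url_for is hypothetical, and IFEXISTS=TRUE is added so that H2 refuses to create a new empty database when the name is wrong.

```shell
#!/bin/sh
# h2_url_for FILE
# Prints a JDBC URL for an H2 1.3 database file, or fails if FILE
# does not end in .h2.db.
h2_url_for() {
    file=$1
    case $file in
        *.h2.db) ;;
        *) echo "not an .h2.db file: $file" >&2; return 1 ;;
    esac
    # Strip the suffix: H2 appends .h2.db to the database name itself.
    printf 'jdbc:h2:file:%s;IFEXISTS=TRUE\n' "${file%.h2.db}"
}

h2_url_for ./git_file_diff.h2.db
# prints: jdbc:h2:file:./git_file_diff;IFEXISTS=TRUE
```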

Full command which dumped data:

java -cp h2-1.3.176.jar org.h2.tools.Script -url 'jdbc:h2:git_file_diff'

Results in a 165MB backup.sql file.

And to get a SQL prompt:

java -cp h2-1.3.176.jar org.h2.tools.Shell -url 'jdbc:h2:file:./git_file_diff;IFEXISTS=TRUE'

There are 127709 entries.

In the SQL prompt I issued a SHUTDOWN COMPACT; and once completed the file has shrunk to 289MBytes.

From http://www.h2database.com/html/features.html#compacting

Empty space in the database file re-used automatically. When closing the database, the database is automatically compacted for up to 200 milliseconds by default.
To compact more, use the SQL statement SHUTDOWN COMPACT. However re-creating the database may further reduce the database size because this will re-build the indexes.
...
See also the sample application org.h2.samples.Compact

maxCompactTime
Database setting MAX_COMPACT_TIME (default: 200).

The value is in milliseconds. The setting can be passed in the H2 JDBC URL, probably with something such as ;MAX_COMPACT_TIME=15000, which would allow compacting to run for 15 * 1000 milliseconds = 15 seconds.
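
As a small sketch of that conversion, the helper below turns a number of seconds into the URL fragment. The name compact_time_param is made up for illustration, and whether H2 honours the setting when passed this way is the same assumption as in the paragraph above.

```shell
#!/bin/sh
# compact_time_param SECONDS
# Prints the URL fragment that would allow SECONDS of compaction on close.
compact_time_param() {
    printf ';MAX_COMPACT_TIME=%d\n' "$(( $1 * 1000 ))"
}

compact_time_param 15
# prints: ;MAX_COMPACT_TIME=15000
```

The fragment is meant to be appended to an existing URL such as jdbc:h2:file:./git_file_diff.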

I have grabbed the file from production then tried to compact it locally:

$ gunzip -k git_file_diff.h2.db.gz
$ java -cp h2-1.3.176.jar org.h2.tools.Shell -url 'jdbc:h2:file:./git_file_diff;IFEXISTS=TRUE;MAX_COMPACT_TIME=15000'
sql> SHUTDOWN COMPACT;

The file went from 9G to 302MB \o/

I am entirely sure we have been copying those cache files over and over for the last ten years and they only ever had 200ms to be compacted, which resulted in all those files growing out of control.

We need to compact all of them (which requires Gerrit to be shut down), and we could probably send a patch upstream to allow a longer compaction window.
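
A rough sketch of what compacting all of them could look like, using the same H2 Shell tool as above with its -sql option to run a single statement and exit. This assumes Gerrit is already stopped and that the script runs as a user allowed to write the cache files. The H2_JAR location and the DRY_RUN switch are made up for illustration; by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: compact every H2 cache file in the Gerrit cache directory.
# Assumptions: Gerrit is stopped; H2_JAR points at a full h2-1.3.176.jar
# (not the -ijar variant); DRY_RUN is a hypothetical safety switch.
CACHE_DIR=${CACHE_DIR:-/var/lib/gerrit2/review_site/cache}
H2_JAR=${H2_JAR:-./h2-1.3.176.jar}
DRY_RUN=${DRY_RUN:-1}

# compact_one FILE: run SHUTDOWN COMPACT against one .h2.db file.
compact_one() {
    db=${1%.h2.db}    # H2 appends .h2.db itself, so strip it from the URL
    url="jdbc:h2:file:$db;IFEXISTS=TRUE"
    if [ "$DRY_RUN" = 1 ]; then
        echo "java -cp $H2_JAR org.h2.tools.Shell -url '$url' -sql 'SHUTDOWN COMPACT'"
    else
        java -cp "$H2_JAR" org.h2.tools.Shell -url "$url" -sql 'SHUTDOWN COMPACT'
    fi
}

# compact_all: loop over every .h2.db file in CACHE_DIR.
compact_all() {
    for f in "$CACHE_DIR"/*.h2.db; do
        [ -e "$f" ] || continue    # glob matched nothing
        compact_one "$f"
    done
}

compact_all
```

Running it with DRY_RUN=0 would perform the actual compaction; keeping a copy of the cache directory beforehand seems prudent, since the files are rewritten in place.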