Page MenuHomePhabricator

Gerrit GC thrashing during branch cut
Open, NormalPublic

Description

I first noticed this happening during the June 30th train.

I've tried string deduplication to lower overall memory usage, and branch cleanup on the gerrit server itself; however, neither have alleviated the problem.

Throwing more hardware at the issue would fix it most likely, but there should be some way to mitigate it in the interim.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptTue, Sep 3, 1:16 PM
hashar added a comment.Tue, Sep 3, 1:19 PM

That is definitely started with the branch cut. Can't we raise the HEAP and or throw some more RAM at the machine?

Mentioned in SAL (#wikimedia-operations) [2019-09-03T13:26:09Z] <hashar> Gerrit should be fine again, apparently was due to the wmf branch cut taking too much resources (sic) - T231872 filled to investigate

Paladox added a subscriber: Paladox.Tue, Sep 3, 1:30 PM

Happened again today apparently, gerrit's been down/unresponsive and its monitoring data has been unavailable since ~12.43

Mentioned in SAL (#wikimedia-operations) [2019-09-10T13:11:20Z] <hashar> Gerrit experimenting difficulty due to ongoing wmf branch cut - T231872

Restarted Gerrit.

42 second(!!!) GC pause at 12:47:28 means the service likely wouldn't have recovered.

On cobalt from lsof -p $(pidof java) (F30290036), I see a LOT of pack files opened:

COMMAND   PID    USER   FD   TYPE             DEVICE   SIZE/OFF      NODE NAME
...
java    10368 gerrit2  505r   REG              253,0    1174218 216008382 /srv/gerrit/git/mediawiki/core.git/objects/pack/pack-21a4e5cb6a9533b0565cf59cf1282181ac4db897.pack
java    10368 gerrit2  506r   REG              253,0       2677  23789870 /srv/gerrit/git/mediawiki/skins/MinervaNeue.git/objects/pack/pack-3a03431737fe31619ca3e417c1b348c739be2807.pack
java    10368 gerrit2  507r   REG              253,0    3959314  22544407 /srv/gerrit/git/mediawiki/extensions/cldr.git/objects/pack/pack-c83ea98d8e971cb4b941fd42d105c853a6d22308.pack
java    10368 gerrit2  508r   REG              253,0      54243 209518637 /srv/gerrit/git/VisualEditor/VisualEditor.git/objects/pack/pack-1a81f2a690fd1ee9ecd78bf940c22488118f1683.pack
java    10368 gerrit2  509r   REG              253,0       6084  22478850 /srv/gerrit/git/mediawiki/extensions/cldr.git/objects/pack/pack-f060e07631f88aaf5f95278ac44517d92cbd66c8.pack
java    10368 gerrit2  510r   REG              253,0      21598  22478865 /srv/gerrit/git/mediawiki/extensions/cldr.git/objects/pack/pack-fb3d19ca6c000577695974fac3a8057f54f12c70.pack
java    10368 gerrit2  511r   REG              253,0      63734 228720668 /srv/gerrit/git/mediawiki/extensions/Wikibase.git/objects/pack/pack-7010654bf9d6f6aed9a4165bfd954e474738a04a.pack
java    10368 gerrit2  512r   REG              253,0      45714  23789758 /srv/gerrit/git/mediawiki/skins/MinervaNeue.git/objects/pack/pack-c53391d4f0dc0d66d796513ea640d23595566450.pack
java    10368 gerrit2  513r   REG              253,0      44481  23789699 /srv/gerrit/git/mediawiki/skins/MinervaNeue.git/objects/pack/pack-47adf4c23b92cc568f2614e0df467f1544e6ba13.pack
java    10368 gerrit2  514r   REG              253,0      21984 222626182 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-23bdb3e781c25d410daf8443305a2eb376a5c33b.pack
java    10368 gerrit2  515r   REG              253,0   12009872  23789619 /srv/gerrit/git/mediawiki/skins/MinervaNeue.git/objects/pack/pack-2d079071d270527642a3963bd95f32f7fdaeba6a.pack
java    10368 gerrit2  516r   REG              253,0      21978 222626139 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-107e4c3f1faa0f8e8f3c4885f90206bdb0347520.pack
java    10368 gerrit2  517r   REG              253,0      39891 222626136 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-5542ab6534bd8b14a126fd0232e174985a732093.pack
java    10368 gerrit2  518r   REG              253,0      39803 222626131 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-11dd62eec2ae54961aa353ca529512ae9413b2b5.pack
java    10368 gerrit2  519r   REG              253,0      43802 222626099 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-4a258daaaf07b3575502320c63968e5a514c78cc.pack
java    10368 gerrit2  520r   REG              253,0     248337 228196645 /srv/gerrit/git/mediawiki/extensions/CirrusSearch.git/objects/pack/pack-157ee708fee44a6d69b8576d532cbb8c364e22ae.pack
java    10368 gerrit2  521r   DIR              253,0       4096 216072251 /srv/gerrit/git/mediawiki/extensions/WebToolsManager.git/refs
java    10368 gerrit2  522r   REG              253,0      27540 213582495 /srv/gerrit/git/operations/puppet.git/objects/pack/pack-d58545b73758f5cbdfc0dcfca6bbdd207defd24d.pack
java    10368 gerrit2  523r   REG              253,0       7896 222625825 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-e277a2d6ca44afe2fbb4f4a8458f4dc4292f0999.pack
java    10368 gerrit2  524r   REG              253,0      11129 222626059 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-174112c2f74dbcf4214d972d4392ab06cfd61385.pack
java    10368 gerrit2  525r   REG              253,0      25595 222626052 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-992ef531d86650c607d70e89344521a8592a98cf.pack
java    10368 gerrit2  526r   REG              253,0       7896 222625817 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-378bf123ed4d1bab4c35c2d298287a86714253c8.pack
java    10368 gerrit2  527r   REG              253,0      67215 222626089 /srv/gerrit/git/mediawiki/extensions/AbuseFilter.git/objects/pack/pack-d80a460c54f45246b28eabb17341cfb4f81f4016.pack
java    10368 gerrit2  528r   REG              253,0    3286278 224722950 /srv/gerrit/git/mediawiki/extensions/ArticlePlaceholder.git/objects/pack/pack-190be94825c1fb546ef6a27ff12004296ae9ce86.pack
java    10368 gerrit2  529r   REG              253,0    2613984 225116187 /srv/gerrit/git/mediawiki/extensions/Babel.git/objects/pack/pack-9b1671f6b4908d29153bf6a177184f6e368f8734.pack
java    10368 gerrit2  530r   REG              253,0      51057 231079972 /srv/gerrit/git/mediawiki/extensions/ContentTranslation.git/objects/pack/pack-c1dd456c2c984fcdd9ce7c2616fa79cc23017ef6.pack
java    10368 gerrit2  531r   REG              253,0    2758152 225378313 /srv/gerrit/git/mediawiki/extensions/BetaFeatures.git/objects/pack/pack-c6458cc00a9df63efd7dbbb8cdafb490b511d6a1.pack
java    10368 gerrit2  532r   DIR              253,0       4096 216072302 /srv/gerrit/git/mediawiki/extensions/WebToolsManager.git/refs/meta
java    10368 gerrit2  533r   REG              253,0      35967 231080098 /srv/gerrit/git/mediawiki/extensions/ContentTranslation.git/objects/pack/pack-eb8be589724cba1f3c953e6ee6080d48512cddf7.pack
java    10368 gerrit2  534r   REG              253,0      35963 231080088 /srv/gerrit/git/mediawiki/extensions/ContentTranslation.git/objects/pack/pack-bbef63114963a008e075ac1d48ec582429d913b9.pack
java    10368 gerrit2  535r   DIR              253,0       4096 216072302 /srv/gerrit/git/mediawiki/extensions/WebToolsManager.git/refs/meta
java    10368 gerrit2  536r   REG              253,0      22046 231080017 /srv/gerrit/git/mediawiki/extensions/ContentTranslation.git/objects/pack/pack-b69c1b3382654c38e91d553cbaf4acdae24c5ae5.pack
java    10368 gerrit2  537r   REG              253,0       4253 231079947 /srv/gerrit/git/mediawiki/extensions/ContentTranslation.git/objects/pack/pack-6487c0d9fa3fb23f5349d6725f7dc2aa9534f511.pack
...

though they are rather small, they pillup.

Top 20 at the time of the issue, with space added for human consumption.

$ cut -f 7,9 gerritpidof|sort -gr|head -n20
2 068 117 446	/srv/gerrit/git/mediawiki/services/ores/editquality.git/objects/pack/pack-040f862c4b3994aa4ee9f64ffc1a03bbd0b911f1.pack
672 617 588	/srv/gerrit/git/mediawiki/services/ores/editquality.git/objects/pack/pack-e005323c520f90e9d5f7a61f1a93a378b5d2ec3d.pack
489 693 968	/srv/gerrit/git/mediawiki/core.git/objects/pack/pack-6b1e63a3e0e27a0e434fad72f6fea0bd913ad669.pack
313 292 315	/srv/gerrit/git/operations/puppet.git/objects/pack/pack-4006e2dd35757fb7e9d81d7e14bb460911ea4cc6.pack
245 008 370	/srv/gerrit/git/mediawiki/core.git/objects/pack/pack-c147f5cffdd5dd9876ccba295d06e9b2bb15680b.pack
212 897 685	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f.fdt
170 787 746	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f_Lucene50_0.tim
129 765 702	/srv/gerrit/git/mediawiki/extensions/Wikibase.git/objects/pack/pack-0b6c9a176510cd8df8b3f8c7a8d80315e4fd9b6b.pack
128 484 991	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f_Lucene50_0.pos
118 139 986	/srv/gerrit/git/operations/puppet.git/objects/pack/pack-6b1ae1a9d65a732cd12e9edae493255b96eb96e2.pack
112 303 380	/srv/gerrit/git/operations/mediawiki-config.git/objects/pack/pack-479b4177a6c07911124f6a0f6dba1719cf8ac772.pack
111 743 297	/srv/gerrit/git/mediawiki/extensions.git/objects/pack/pack-92594be30e306d406a694103fe275a4a3b9e956f.pack
110 320 267	/srv/gerrit/git/mediawiki/extensions/MobileFrontend.git/objects/pack/pack-97e360360f69ea1ae30543db2cce7523bc37c951.pack
84 871 965	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f_Lucene50_0.doc
81 301 349	/srv/gerrit/git/operations/mediawiki-config.git/objects/pack/pack-fba3fc256e452708dc2f1a79100888c066a5d0eb.pack
76 177 134	/srv/deployment/gerrit/gerrit-cache/revs/40d88dc46992e015c404295643cbfef15e025496/gerrit.war
76 177 134	/srv/deployment/gerrit/gerrit-cache/revs/40d88dc46992e015c404295643cbfef15e025496/gerrit.war
76 177 134	/srv/deployment/gerrit/gerrit-cache/revs/40d88dc46992e015c404295643cbfef15e025496/gerrit.war
69 436 603	/srv/gerrit/git/mediawiki/extensions/UniversalLanguageSelector.git/objects/pack/pack-f8754cb289f7b8747fff5e9f8f9476b9a5dfa1ff.pack
68 417 669	/srv/gerrit/git/mediawiki/extensions/VisualEditor.git/objects/pack/pack-c0da11482aab712c03640bca569be1c33e6f738c.pack

mediawiki/services/ores/editquality.git is rather large which might not help.

lsof: F30290036

On cobalt from lsof -p $(pidof java) (F30290036), I see a LOT of pack files opened:

It's pretty typical to have a lot of pack files open. In fact, the number of packfiles opened before restart was lower than after (which is to be expected since there's nothing in memory at startup time):

https://grafana.wikimedia.org/d/Bw2mQ3iWz/gerrit-javamelody?orgId=1&from=1568079721750&to=1568122921750&var-Application=&var-Window=5m

Top 20 at the time of the issue, with space added for human consumption.

$ cut -f 7,9 gerritpidof|sort -gr|head -n20
2 068 117 446	/srv/gerrit/git/mediawiki/services/ores/editquality.git/objects/pack/pack-040f862c4b3994aa4ee9f64ffc1a03bbd0b911f1.pack
672 617 588	/srv/gerrit/git/mediawiki/services/ores/editquality.git/objects/pack/pack-e005323c520f90e9d5f7a61f1a93a378b5d2ec3d.pack
489 693 968	/srv/gerrit/git/mediawiki/core.git/objects/pack/pack-6b1e63a3e0e27a0e434fad72f6fea0bd913ad669.pack
313 292 315	/srv/gerrit/git/operations/puppet.git/objects/pack/pack-4006e2dd35757fb7e9d81d7e14bb460911ea4cc6.pack
245 008 370	/srv/gerrit/git/mediawiki/core.git/objects/pack/pack-c147f5cffdd5dd9876ccba295d06e9b2bb15680b.pack
212 897 685	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f.fdt
170 787 746	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f_Lucene50_0.tim
129 765 702	/srv/gerrit/git/mediawiki/extensions/Wikibase.git/objects/pack/pack-0b6c9a176510cd8df8b3f8c7a8d80315e4fd9b6b.pack
128 484 991	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f_Lucene50_0.pos
118 139 986	/srv/gerrit/git/operations/puppet.git/objects/pack/pack-6b1ae1a9d65a732cd12e9edae493255b96eb96e2.pack
112 303 380	/srv/gerrit/git/operations/mediawiki-config.git/objects/pack/pack-479b4177a6c07911124f6a0f6dba1719cf8ac772.pack
111 743 297	/srv/gerrit/git/mediawiki/extensions.git/objects/pack/pack-92594be30e306d406a694103fe275a4a3b9e956f.pack
110 320 267	/srv/gerrit/git/mediawiki/extensions/MobileFrontend.git/objects/pack/pack-97e360360f69ea1ae30543db2cce7523bc37c951.pack
84 871 965	/var/lib/gerrit2/review_site/index/changes_0048/closed/_fo2f_Lucene50_0.doc
81 301 349	/srv/gerrit/git/operations/mediawiki-config.git/objects/pack/pack-fba3fc256e452708dc2f1a79100888c066a5d0eb.pack
76 177 134	/srv/deployment/gerrit/gerrit-cache/revs/40d88dc46992e015c404295643cbfef15e025496/gerrit.war
76 177 134	/srv/deployment/gerrit/gerrit-cache/revs/40d88dc46992e015c404295643cbfef15e025496/gerrit.war
76 177 134	/srv/deployment/gerrit/gerrit-cache/revs/40d88dc46992e015c404295643cbfef15e025496/gerrit.war
69 436 603	/srv/gerrit/git/mediawiki/extensions/UniversalLanguageSelector.git/objects/pack/pack-f8754cb289f7b8747fff5e9f8f9476b9a5dfa1ff.pack
68 417 669	/srv/gerrit/git/mediawiki/extensions/VisualEditor.git/objects/pack/pack-c0da11482aab712c03640bca569be1c33e6f738c.pack

mediawiki/services/ores/editquality.git is rather large which might not help.

That is definitely a large repo. git gc for that repo in particular (on Fridays) takes as much time as all other repos combined.

There is a setting in the gerrit.config: packedGitLimit which is the size of pack files to keep in memory at a given time. It might make sense to increase this so that all the train repos fit in memory at the same time (if that is not currently the case): https://gerrit-review.googlesource.com/Documentation/config-gerrit.html#core.packedGitLimit

It is also the case that this machine needs more memory T222391: Gerrit Hardware Upgrade (+ upgrade from jessie to stretch or buster)

thcipriani triaged this task as Normal priority.Tue, Sep 10, 2:52 PM