I tried to upgrade Zuul from upstream commit 66c8e52 to 30a433b, which would fix T129938 but also includes a major refactoring of how Zuul interacts with Gerrit. It worked fine at first, but after a few dozen minutes Zuul stopped receiving/processing events from Gerrit, so I reverted to 2.1.0-95-g66c8e52-wmf1precise1.
I have picked up a few other patches and built 2.1.0-151-g30a433b-wmf2precise1, but it has the same issue.
A potential thread dump is in P3204.
We need to reproduce the problem (I haven't been able to do so on my local machine, though).
Upstream shortlog; the series of patches by Joshua Hesketh is the Zuul/Gerrit refactoring:
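Once the stall is reproduced, a dump of every thread's stack should show where the Gerrit event handling is blocked. Below is a minimal, generic Python sketch of such a dump. It is not Zuul's own code, and the function name `dump_threads` is hypothetical; it only assumes the process is a regular multi-threaded Python program.

```python
import sys
import threading
import traceback


def dump_threads():
    """Return a text dump of the current stack of every live thread."""
    # Map thread identifiers to readable names where the threading module knows them.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread: %s (%s)" % (names.get(ident, "unknown"), ident))
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(dump_threads())
```

A thread that is stuck will show the same stack in successive dumps, so comparing two dumps taken a minute apart is usually more telling than a single one.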
$ git shortlog 66c8e52..30a433b --no-merges
Alexander Evseev (1):
      Split pipeline description by double newlines on status page

Andreas Jaeger (2):
      Use git.openstack.org everywhere
      Remove argparse from requirements

Antoine Musso (1):
      Update merge status after merge:merge is submitted

Doug Wiegley (1):
      Bump pbr minimum version, to avoid testrepository requirement

Evgeny Antyshev (2):
      Connection names for legacy configs
      Don't require 'commit' attribute in merge event

JP Sullivan (1):
      Add vim swap files to .gitignore

James E. Blair (8):
      Pass ZUUL_TEST_ROOT through tox
      Try to make test_idle less racy
      Add job mutex support
      Fix default merge failure reports
      Cloner: use cache if dest exists
      Add job tags
      Detect dependency cycles introduced with new patchsets
      Add report URL to status.json

Jan Hruban (1):
      Tidy up tests/base.py

Joshua Hesketh (14):
      Refactor sources out of triggers
      Add base class for reporters
      Add base class for sources
      Add base class for triggers
      Configure triggers dynamically
      Add support for 'connection' concept
      Document the new connections in zuul
      Add in extra connections tests
      Remove ActionReporters
      Move Item formatting into Reporters
      Fix regression in change tracking
      Fix memory leak reloading triggers
      Cache is held and managed by connections
      Don't reload connections on HUP

Ondřej Nový (1):
      Deprecated tox -downloadcache option removed

Paul Belanger (4):
      Bump APScheduler to >=3.0
      Remove webob requirements cap
      Log 'Received unrecognized event type' as warning
      Pin paramiko < 2.0.0

Richard Hedlind (1):
      Add exception handler to updateBuildDescriptions

Sachi King (1):
      Fix test for new WebOb

Thomas Bechtold (1):
      Fix documentation example

Tobias Henkel (1):
      Cloner: Don't fall back on infrastructure failure