Gerrit has run out of HTTP threads and needed a restart several times in the last several weeks. Initially I blamed the upgrade to 2.15.12; however, our thread woes have continued after downgrading to 2.15.8, a version on which we were previously stable.
Watching JavaMelody monitoring over the period, I think the symptoms are similar to T148478 -- that is, Gerrit seems to run fine for a few days and then we hit some kind of condition that causes GC thrashing.
GC thrashing slows down HTTP response times, which in turn increases the number of threads in use concurrently (each thread stays active for longer, hence more threads run in parallel).
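As a back-of-the-envelope illustration of that relationship, Little's law says average concurrency is roughly request rate times mean response time. The sketch below uses made-up numbers (20 requests/s, 0.25 s vs. 2 s response times), not measurements from our Gerrit, and `threads_in_use` is just a hypothetical helper name:

```python
def threads_in_use(requests_per_second: float, mean_response_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate * mean time in system."""
    return requests_per_second * mean_response_seconds


# Hypothetical numbers for illustration only, not measured values.
healthy = threads_in_use(20, 0.25)    # same request rate, fast responses
thrashing = threads_in_use(20, 2.0)   # same request rate, slow responses

print(healthy, thrashing)  # prints "5.0 40.0"
```

With the request rate unchanged, an 8x slowdown in response time means roughly 8x as many threads busy at once, which is how a fixed-size HTTP thread pool could get exhausted.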
These longer-than-average HTTP response times seem to begin when we hit the Gerrit heap limit and GC (I guess) brings memory use back down. Memory use then climbs back up to the heap limit very rapidly, triggering Java GC thrashing.
I think we'll need to monitor Java GC more closely and fine-tune some of the values that govern Gerrit's thread use. The Gerrit scaling manual has some advice on how to do this.
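One rough way to put a number on "GC thrashing" would be to sample the JVM's cumulative GC time twice (for example the GCT column from `jstat -gcutil`, which is in seconds) and look at what fraction of the interval went to GC. This is only a sketch: the function names and the 50% threshold are my own assumptions, not anything taken from Gerrit or the scaling manual.

```python
def gc_overhead(gc_seconds_start: float, gc_seconds_end: float, wall_seconds: float) -> float:
    """Fraction of wall-clock time spent in GC between two samples.

    gc_seconds_* are cumulative GC times in seconds taken at the start and
    end of the sampling interval; wall_seconds is the interval length.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return (gc_seconds_end - gc_seconds_start) / wall_seconds


def is_thrashing(overhead: float, threshold: float = 0.5) -> bool:
    """Flag an interval as thrashing; the 0.5 threshold is an arbitrary assumption."""
    return overhead >= threshold


# Hypothetical samples: 30 s of GC accumulated over a 60 s interval.
overhead = gc_overhead(120.0, 150.0, 60.0)
print(overhead, is_thrashing(overhead))  # prints "0.5 True"
```

Graphing this ratio next to active HTTP threads should show whether the thread spikes really do follow the GC spikes.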