Since August 21st (after the train was deployed to group1, but not necessarily because of that), we've started seeing episodes in which the two high-traffic, full-PHP7 API servers would:
- Spike in CPU usage
- Have all php-fpm workers busy
- Show increased response times
The weirdest part is that the slowdowns appear to happen in tandem on the two servers, at the same times.
This is probably also the cause of the slowdowns noticed by @Tarrow in T230976.
We've found that most slow requests seem to come from Parsoid, but I will be more precise once the investigation happens.
It appears that this is related to template changes on euwiki (just a theory). Even if it is due to template changes, it shouldn't be slowing down our API as much as it does.