It looks like the transient error resolved itself after the deployment completed. The appropriate action may be to suppress the watchdog during service restarts and upgrades. Thoughts? Should this issue be closed in favor of a task addressing the watchdog? Also worth looking at: somewhere between 60 and 120 seconds seems like an eternity for a soft restart of a critical service.
Jun 26 2019
@aborrero thanks! I will submit a patch to Gerrit. It is my first time doing that too, so I have a bit of reading to do on the submission guidelines.
Note: this is my first writeup for Wikitech, and any feedback on tags or content is welcome. I am ready to submit a patch once a decision on this issue is clear.
@Andrew @aborrero: I added you as subscribers because you are the most recent committers on the referenced Puppet hieradata file.
Q: Has anyone looked at the mean and variance of the timing of the existing call to the UUID factory? I ask because there is a scenario where 99 out of 100 calls are very fast, but the 100th blocks (on Linux) because the entropy pool is depleted, and the kernel blocks until enough entropy has been collected. Subsequent calls may also be slower because less system entropy is available between calls.
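To make the question concrete, here is a rough sketch of the kind of measurement I have in mind. It is an illustration only: it times Python's standard-library `uuid.uuid1()` as a stand-in, not MediaWiki's PHP `UIDGenerator::newUUIDv1()`, and the helper names `time_calls` and `summarize` are made up for this example. The point is to look at tail latency (p99, max) alongside the mean, since an occasional blocking call would barely move the mean.

```python
import statistics
import time
import uuid


def time_calls(func, n=10_000):
    """Return per-call latencies in microseconds for n calls to func."""
    samples = []
    for _ in range(n):
        start = time.perf_counter_ns()
        func()
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return samples


def summarize(samples):
    """Summarize latencies: mean, population variance, and tail values."""
    ordered = sorted(samples)

    def pct(p):
        # Nearest-rank style percentile, clamped to the last element.
        return ordered[min(len(ordered) - 1, int(p * len(ordered)))]

    return {
        "mean_us": statistics.mean(ordered),
        "variance_us2": statistics.pvariance(ordered),
        "p50_us": pct(0.50),
        "p99_us": pct(0.99),
        "max_us": ordered[-1],
    }


if __name__ == "__main__":
    # Stand-in generator (assumption): Python's uuid1, not the PHP factory.
    stats = summarize(time_calls(uuid.uuid1))
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

A large gap between `p99_us` or `max_us` and `p50_us` would suggest occasional slow calls of the kind described above. The same approach would need to be repeated against the real PHP call on a production-like host before drawing any conclusions.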
More than 2% of MediaWiki API wall-clock time is spent in the call to UIDGenerator::newUUIDv1() in ApiMain.php, added in a2b85209e2 ("Emit new style API action logs into Monolog"): May 8th: 2.40%, May 7th: 2.76%.