I have noticed a vast number of curl timeouts coming from the mw-jobrunners:
https://logstash.wikimedia.org/goto/50d17abf3123f4330986c011f14ae177
I reckon it is worth understanding the impact, as well as investigating why they occur.
I think this is logspam. Looking at Logstash, most instances seem to happen in ThumbnailRenderJob, which sets a low timeout (1s) and is supposed to ignore timeout errors: the point is only to hit Swift's 404 handler with a HEAD request so that the thumbnailing gets forwarded to Thumbor. https://gerrit.wikimedia.org/g/mediawiki/core/+/c3f19ea0dbcfa693e08ae573023c6128d16d5f40/includes/JobQueue/Jobs/ThumbnailRenderJob.php#112
Unfortunately, looking at https://gerrit.wikimedia.org/g/mediawiki/core/+/c3f19ea0dbcfa693e08ae573023c6128d16d5f40/includes/Http/GuzzleHttpRequest.php (which handles the libcurl call) I don't see an easy way to suppress that error message.
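For anyone unfamiliar with the pattern, here is a rough Python sketch of the "fire a HEAD request with a short timeout and treat a timeout as success" idea described above. It is not the MediaWiki code (that is PHP); `trigger_thumbnail`, `is_timeout` and the `opener` parameter are hypothetical names made up for illustration.

```python
import socket
import urllib.error
import urllib.request


def is_timeout(exc):
    """Return True if the exception represents a request timeout."""
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return True
    # urllib wraps connection-phase timeouts in URLError.reason.
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, (TimeoutError, socket.timeout))
    return False


def trigger_thumbnail(url, timeout=1.0, opener=urllib.request.urlopen):
    """Send a HEAD request and ignore timeouts (hypothetical helper).

    Returns the HTTP status if a response arrived in time, or None if the
    request timed out. A timeout is expected here: the request only needs
    to reach the backend, not wait for the rendering to finish.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except Exception as exc:
        if is_timeout(exc):
            # Swallow silently: no log entry, no error.
            return None
        raise  # Anything else is a real failure.
```

The difference from what we see in production is the `return None` branch: in this sketch the timeout is swallowed before anything is logged, whereas in our case the message is apparently emitted lower in the HTTP layer before the job gets a chance to ignore it.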
As to why it doesn't fire an alert, that's probably because these are logged as errors and not exceptions.
As part of T414805: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only, there is ongoing work to stop pregenerating thumbnails, which would retire ThumbnailRenderJob. It's currently blocked on T415282: MediaSearch should stop relying on render map config.
Declining this task: given our limited capacity, it's better to wait until the ongoing work retires the culprit job.