
Host lookup failed [-9999]: Unknown error -9999
Open, Needs Triage · Public

Description

Seen when running GenerateFancyCaptchas in a screen session for T159581 and trying to work out what happened in T159607.

Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.14/includes/debug/logger/monolog/LegacyHandler.php on line 220

Event Timeline

Reedy created this task. Mar 4 2017, 5:23 PM
Restricted Application added a subscriber: Aklapper. Mar 4 2017, 5:23 PM
Tgr added a subscriber: Tgr. Mar 6 2017, 5:43 AM

That error message is strangely useless, but apparently HHVM returns -10000 + h_errno from gethostbyname, and 1 is [[https://github.com/freebsd/freebsd/blob/master/include/netdb.h#L153|HOST_NOT_FOUND]]. The host is stored as an instance property, so this is probably the LegacyHandler equivalent of T151428.
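
To make the arithmetic concrete, here is a minimal Python sketch of that decoding. It reads the comment above as `code = -10000 + h_errno`, which is an assumption taken from the thread and not verified against the HHVM or PHP source. The function name `decode_lookup_error` and its `base` and `sign` parameters are hypothetical; `sign=-1` is there only in case the runtime subtracts `h_errno` instead, which the thread does not confirm either way. The `h_errno` names are the standard `<netdb.h>` constants.

```python
# h_errno values as defined in <netdb.h> (glibc and the BSDs agree on these).
H_ERRNO_NAMES = {
    -1: "NETDB_INTERNAL",
    0: "NETDB_SUCCESS",
    1: "HOST_NOT_FOUND",
    2: "TRY_AGAIN",
    3: "NO_RECOVERY",
    4: "NO_DATA",
}


def decode_lookup_error(code, base=-10000, sign=1):
    """Recover h_errno from a "Host lookup failed [code]" warning.

    Assumes code == base + sign * h_errno, with sign being +1 or -1.
    Both defaults are assumptions based on the comment in this task.
    """
    if sign not in (1, -1):
        raise ValueError("sign must be +1 or -1")
    # sign is +/-1, so multiplying by it is the same as dividing by it.
    h_errno = (code - base) * sign
    return h_errno, H_ERRNO_NAMES.get(h_errno, "unknown")


if __name__ == "__main__":
    for code in (-9999, -10002):
        for sign in (1, -1):
            print(code, sign, decode_lookup_error(code, sign=sign))
```

Under the convention assumed in the comment, -9999 maps to `h_errno` 1 (`HOST_NOT_FOUND`). Printing both sign conventions side by side makes it easier to check a given log line against whichever one the runtime actually uses.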

Krinkle added a subscriber: Krinkle.

Not seen in WMF Logstash, but sounds like this wouldn't happen unless manually triggered/confirmed. Leaving in untriaged to be confirmed.

Looked beyond the last 7 days to the last 30 days and did find it happening on regular web requests as well. Triaging as a prod issue, confirmed in wmf.20.

[{exception_id}] {exception_url}   ErrorException from line 47 of /srv/mediawiki/php-1.32.0-wmf.20/vendor/liuggio/statsd-php-client/src/Liuggio/StatsdClient/Sender/SocketSender.php: 

PHP Warning: Host lookup failed [-10002]: Unknown error -10002

#1 /srv/mediawiki/php-1.32.0-wmf.20/vendor/liuggio/statsd-php-client/src/Liuggio/StatsdClient/Sender/SocketSender.php(47): socket_sendto(resource, string, integer, integer, string, integer)
#2 /srv/mediawiki/php-1.32.0-wmf.20/includes/libs/stats/SamplingStatsdClient.php(115): Liuggio\StatsdClient\Sender\SocketSender->write(resource, string)
#3 /srv/mediawiki/php-1.32.0-wmf.20/includes/MediaWiki.php(951): SamplingStatsdClient->send(array)
#4 /srv/mediawiki/php-1.32.0-wmf.20/includes/GlobalFunctions.php(1188): MediaWiki::emitBufferedStatsdData(BufferingStatsdDataFactory, GlobalVarConfig)
#5 /srv/mediawiki/php-1.32.0-wmf.20/includes/MediaWiki.php(923): wfLogProfilingData()
#6 /srv/mediawiki/php-1.32.0-wmf.20/includes/MediaWiki.php(734): MediaWiki->restInPeace(string, boolean)

Happens on multiple wikis; URLs include regular page views, Special:HideBanner, /w/api.php, and more. Not quite sure what they have in common. I spot-checked a few of these errors and looked for other errors with the same reqId, but didn't find anything that would have this as a side effect. Looks like it is its own error that just randomly happens sometimes?

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:10 PM

Looks like this error is HHVM-specific and I couldn't find other occurrences in Logstash. OK to resolve and keep investigating T230245?

fgiunchedi moved this task from Inbox to Radar on the observability board. Dec 9 2019, 11:33 AM
Reedy added a comment. Dec 9 2019, 2:04 PM

> Looks like this error is HHVM-specific and I couldn't find other occurrences in Logstash. OK to resolve and keep investigating T230245?

It isn't HHVM-specific (I'm not even sure it's Monolog-specific, but that's the code where it actually surfaces), though it may have appeared in the logs more frequently under HHVM (i.e. in 2017/2018, when production ran HHVM).

Certainly, as per T230245#5582062, if you run the script on PHP7 and give it a high enough quantity (i.e. the 10K), you'll be able to get the error too.

With my hacky workaround in place, it's probably not happening now and as such isn't in the logs.

I'm not averse to closing this one in favour of T230245, as that does include this error, but no one seems to really be investigating anything towards getting it fixed ;). Are we going to move the extra tags from this task over there?

>> Looks like this error is HHVM-specific and I couldn't find other occurrences in Logstash. OK to resolve and keep investigating T230245?

> It isn't HHVM-specific (I'm not even sure it's Monolog-specific, but that's the code where it actually surfaces), though it may have appeared in the logs more frequently under HHVM (i.e. in 2017/2018, when production ran HHVM).
> Certainly, as per T230245#5582062, if you run the script on PHP7 and give it a high enough quantity (i.e. the 10K), you'll be able to get the error too.
> With my hacky workaround in place, it's probably not happening now and as such isn't in the logs.

Ah! My bad, I skimmed through the task while grooming the observability backlog.

> I'm not averse to closing this one in favour of T230245, as that does include this error, but no one seems to really be investigating anything towards getting it fixed ;). Are we going to move the extra tags from this task over there?

I'm OK to leave it open; as you mentioned, both are waiting to be investigated further and fixed anyway!