Page MenuHomePhabricator

Investigate why --delete seemingly deleted all but 1 captcha file when run in cronjob
Closed, ResolvedPublic

Description

Confirmed that there is only one captcha file in swift:

tgr@terbium:~$ mwscript eval.php --wiki=metawiki
> $backend = FileBackendGroup::singleton()->get( $wgCaptchaFileBackend );

> $baseDir = FileBackendGroup::singleton()->get( $wgCaptchaFileBackend )->getRootStoragePath() . '/captcha-render';

> $fl = $backend->getFileList( [ 'dir' => $baseDir ] )

> var_dump(iterator_to_array($fl))
array(1) {
  [0]=>
  string(41) "9/9/9/image_67350e1b_9996cc25d9b58fc9.png"
}

The captcha generation script does not update/clear the cache used in FancyCaptcha::pickImage*, which it probably should, but that shouldn't be able to cause anything remotely similar.

Event Timeline

So, running a cronjob to create upto 10000... No --delete parameter, I see

Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.14/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.14/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.14/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.14/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.14/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: md5_file(/tmp/mw-fancycaptcha-1488645777-a1b4a7/a/6/1/image_0ec90eab_a614abaa81a86580.png): failed to open stream: Too many open files in /srv/mediawiki/php-1.29.0-wmf.14/includes/libs/filebackend/SwiftFileBackend.php on line 331

Do we get the feeling something happened with the copying in in batches, but the delete apparently worked fine? Not sure why it'd end up with 1 file still, but the above... At least for the last line

I filed T159610 to stop this information from cronjob runs going into the black hole that is /dev/null

Reedy added a subscriber: aaron.

Ping @aaron for the "Too many open files in /srv/mediawiki/php-1.29.0-wmf.14/includes/libs/filebackend/SwiftFileBackend.php on line 331" - do we know how many is too many? And why the layer let the script do that?

Do we get the feeling something happened with the copying in in batches, but the delete apparently worked fine? Not sure why it'd end up with 1 file still, but the above... At least for the last line

Or the 1 was some sort of fluke, a file that couldn't be deleted for some other reason?

do we know how many is too many?

That looks like reaching the max number of open files per process (ulimit -n) or per system. Maybe there's a leak of open descriptors in that script?

do we know how many is too many?

That looks like reaching the max number of open files per process (ulimit -n) or per system. Maybe there's a leak of open descriptors in that script?

Maybe. It worked before... rECOE7de460c289041e6fd729d0209c08077ab95f1b27 changed it from being a tight loop doing each copy/move operation, to passing a big array of paths on disk, with the hope it'd be more performant

Batching itself is fine and well within reasonable ulimits (in fact FileBackend has a hard batch operation limit anyway, on top of the concurrency limit). Any issue would either be leaks or something else.

Change 341962 had a related patch set uploaded (by Aaron Schulz):
[mediawiki/core] Make sure Swift store operations close the source file handle

https://gerrit.wikimedia.org/r/341962

reedy@terbium:~$ /usr/local/bin/mwscript extensions/ConfirmEdit/maintenance/GenerateFancyCaptchas.php mediawikiwiki --wordlist=/etc/fancycaptcha/words --font=/usr/share/fonts/truetype/freefont/FreeMonoBoldOblique.ttf --blacklist=/etc/fancycaptcha/badwords --fill=10000 --oldcaptcha
Current number of captchas is 1000.
Generating 9000 new captchas.. Done.

Generated 9000 captchas in 1459.5 seconds
Copying the new captchas to storage...Warning: md5_file(/tmp/mw-fancycaptcha-1489075661-03e546/a/8/e/image_733e516f_a8edc35843fb98e8.png): failed to open stream: Too many open files in /srv/mediawiki/php-1.29.0-wmf.15/includes/libs/filebackend/SwiftFileBackend.php on line 331
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
<snip>
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: md5_file(/tmp/mw-fancycaptcha-1489075661-03e546/a/8/e/image_733e516f_a8edc35843fb98e8.png): failed to open stream: Too many open files in /srv/mediawiki/php-1.29.0-wmf.15/includes/libs/filebackend/SwiftFileBackend.php on line 331
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: socket_sendto(): Host lookup failed [-9999]: Unknown error -9999 in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Errored.
Warning: socket_sendto(): Host lookup failed [-10002]: Host name lookup failure in /srv/mediawiki/php-1.29.0-wmf.15/includes/debug/logger/monolog/LegacyHandler.php on line 220
Warning: implode(): Invalid arguments passed in /srv/mediawiki/php-1.29.0-wmf.15/extensions/ConfirmEdit/maintenance/GenerateFancyCaptchas.php on line 197
Removing temporary files... Done.

Whole captchas generation process took 2303.2 seconds
reedy@terbium:~$

Change 341962 merged by jenkins-bot:
[mediawiki/core] Make sure Swift store operations close the source file handle

https://gerrit.wikimedia.org/r/341962

Change 342703 had a related patch set uploaded (by Reedy; owner: Aaron Schulz):
[mediawiki/core] Make sure Swift store operations close the source file handle

https://gerrit.wikimedia.org/r/342703

Change 342703 merged by jenkins-bot:
[mediawiki/core] Make sure Swift store operations close the source file handle

https://gerrit.wikimedia.org/r/342703

Mentioned in SAL (#wikimedia-operations) [2017-03-15T18:32:37Z] <legoktm@tin> Synchronized php-1.29.0-wmf.16/includes/libs/filebackend/SwiftFileBackend.php: Make sure Swift store operations close the source file handle - T159607 (duration: 00m 44s)

Looks like the various fixes are meaning this doesn't break all over the place now. Yay