Tracking ticket for collecting comments and issues about the Kubernetes backend for Tool Labs' webservice command.
| Status | Assigned | Task |
| --- | --- | --- |
| Open | None | T139107 Issues with 'webservice' kubernetes backend (tracking) |
| Open | None | T140415 `webservice restart` does not always wait for service to stop before trying to start again |
| Resolved | yuvipanda | T141098 Kubernetes does not mount shared path |
| Open | None | T141100 Tools with names longer than 24 characters cannot start kubernetes webservices |
Tried this with my tool "wikidata-todo". Result:
Warning: include_once(php/common.php): failed to open stream: No such file or directory in /data/project/wikidata-todo/public_html/index.php on line 6
Warning: include_once(): Failed opening 'php/common.php' for inclusion (include_path='.:/usr/share/php:/usr/share/pear') in /data/project/wikidata-todo/public_html/index.php on line 6
Fatal error: Call to undefined function get_request() in /data/project/wikidata-todo/public_html/index.php on line 90
It appears either the symlinks (php/common.php) are not mounted correctly, or the cwd is wrong.
@Magnus I was initially mounting only /data/project/$toolname inside the container, but since it looks like the symlink points to a *different* shared common tool, that won't work. I'm going to merge https://gerrit.wikimedia.org/r/#/c/296868/ that'll mount all of /data/project and fix this. I've already moved wikidata-todo and it seems to work fine now! \o/
OK, this works, but now another of the scripts in the same tool fails.
Warning: mysqli::mysqli(): php_network_getaddresses: getaddrinfo failed: Name or service not known in /data/project/magnustools/public_html/php/common.php on line 93
Warning: mysqli::mysqli(): (HY000/2002): php_network_getaddresses: getaddrinfo failed: Name or service not known in /data/project/magnustools/public_html/php/common.php on line 93
Fatal error: Call to a member function real_escape_string() on boolean in /data/project/wikidata-todo/public_html/duplicity.php on line 19
The line in question (first warning) is:
$db = new mysqli($server, $mysql_user, $mysql_password, $dbname);
$server is "tools-db", $mysql_user is "s51211", the password is correct, and $dbname is "s51211__duplicity_p". All good.
PHP version foobar? tools-db not available?
Restarted the webservice with default settings, works fine.
tools-db is a deprecated alias that is defined only in /etc/hosts, and I think the containers resolve names through DNS only. tools.labsdb is the stable name of the host. (We should probably add the /etc/hosts entries to the containers as well.)
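To check which of the two names a given environment can actually resolve, something like the following might help. It is only a rough sketch: the function name `first_resolvable` is made up for illustration, port 3306 is assumed because it is the usual MySQL port, and the `resolve` parameter exists only so the lookup can be swapped out.

```python
import socket


def first_resolvable(hostnames, resolve=socket.getaddrinfo, port=3306):
    """Return the first hostname that resolves, or None if none of them do.

    `resolve` defaults to the system resolver; it is a parameter only so
    that the lookup can be replaced (for example in a test).
    """
    for name in hostnames:
        try:
            resolve(name, port)
        except socket.gaierror:
            # Same failure PHP reports as "Name or service not known".
            continue
        return name
    return None


# Intended use inside the container (stable name first, old alias second):
#   first_resolvable(["tools.labsdb", "tools-db"])
```

Listing the stable name first means the deprecated alias is only used where the stable name does not resolve.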
@Magnus ok, 'tools-db' works now. I moved wikidata-todo over again and verified the link you provided works fine. I've moved it back to gridengine just now though, just in case there are other things that're broken and I just do not know where to look for them. Can you move it again and verify / report other broken stuff?
Thank you so much for your patience!
Tried a few. Had some issue with "magnustools", so tried to restart with "--backend=gridengine". Now it's running as an "unkillable" job (state "dr"), returning 503s. Most of my other tools depend on this one, please help ASAP!
@Magnus I see it is running under Kubernetes now rather than gridengine. My plain 'webservice start' (rather than 'webservice --backend=gridengine start') started it under Kubernetes, I guess because that was the backend of the last successful start. I've explicitly moved it back to gridengine just now.
T138787 was the cause of the earlier dr, and I'm shuffling instances around to try to handle it.
I switched my tool and I'm having an issue with file uploads. I am not sure if this is due to changes in PHP or if this is due to a storage issue. Honestly any help would be appreciated.
I am providing my logs below in case anyone finds them useful.
DEBUG 20160703 21:20:56 image:397: Uploading /data/project/magog/tmp/RedsRetired14.png#12#0 to File:RedsRetired14.png..
TRACE 20160703 21:20:56 image:398: File owner: tools.magog
TRACE 20160703 21:20:56 image:399: File permissions: 100644
DEBUG 20160703 21:20:56 image:400: Size of file: 15200
DEBUG 20160703 21:20:56 image:428: RedsRetired14.png
TRACE 20160703 21:20:56 Wiki:561: Running API query with params https://commons.wikimedia.org/w/api.php?action=upload&filename=RedsRetired14.png&comment=%28BOT%29%3A%20Uploading%20old%20version%20of%20file%20from%20en.wikipedia%3B%20originally%20uploaded%20on%202015-07-24%2023%3A52%3A32%20by%20%5B%5B%3Aen%3AUser%3AMB27|MB27%5D%5D&text=&token=[redacted]&ignorewarnings=1&file=%40%2Fdata%2Fproject%2Fmagog%2Ftmp%2FRedsRetired14.png%2312%230&format=php&servedby=&requestid=642075177
ERROR 20160703 21:20:56 Wiki:618: API Error...
Text: File upload param file is not a file upload; be sure to use multipart/form-data for your POST and include a filename in the Content-Disposition header.
After reading your log in some more detail, the issue is probably this:
You're not including the file contents, but the path to a file. In addition, it seems you're doing a GET rather than a POST request? Without the backend code, it's hard to say.
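For illustration, this is roughly what the API error is asking for: the file's bytes sent in a multipart/form-data body with a filename in the Content-Disposition header, rather than a path passed as a query parameter. This is a generic Python sketch, not the tool's actual code. The helper name `build_multipart` and the example field values are assumptions, and it does no escaping of quotes in field names or filenames.

```python
import uuid


def build_multipart(fields, file_field, filename, file_bytes,
                    content_type="application/octet-stream"):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body, headers), where body is bytes suitable for a POST request.
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Ordinary parameters: one part per field, value sent as plain text.
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file part carries a filename and the file's contents, not its path.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    body = b"".join(parts)
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return body, headers
```

The body and headers would then be sent as a POST. The point is that the `file` part contains what was read from disk, so the server never sees a local path.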
It may be coincidence, but the reason I wanted to switch back one of my tools to gridengine was that OAuth uploading of files to Commons stopped working. It works fine on gridengine.
I did not investigate the cause in detail. Could just be a different PHP version.
Uploads using the @file syntax now require CURLOPT_SAFE_UPLOAD to be set to FALSE. CURLFile should be used instead.
@yuvipanda It's the oauth uploader in magnustools, source here:
@valhallasw Yes, that could be it.
@yuvipanda Correction, the actual code is in
Probably line 1306
There are still several of my tools that won't start a kubernetes webservice because it thinks the gridengine one is still running. Sometimes it is (old gridengine webservice permanently in "dr" state), sometimes it isn't. @yuvipanda mentioned something about a manifest file being left behind.
Tools with this issue include:
I'm not able to get my ws-search tool to run under Kubernetes:
2016-09-28 08:27:51: (mod_fastcgi.c.2569) unexpected end-of-file (perhaps the fastcgi process died): pid: 10 socket: unix:/var/run/lighttpd/php.socket.ws-search-1
2016-09-28 08:27:51: (mod_fastcgi.c.3353) response not received, request sent: 886 on socket: unix:/var/run/lighttpd/php.socket.ws-search-1 for /ws-search/index.php?, closing connection
There's no old gridengine webservice running (qstat returns empty), webservice status says it's running, and it isn't trying to do any curl calls.
Any ideas on what I should be looking for to fix it?
Thanks @valhallasw — that's strange though, because I had also restarted it under SGE and last I knew it was running! But I guess something else went amiss there. (Which I believe is one of the advantages of Kubernetes? That it'll restart stopped jobs?) That ws-search tool is rather unfunctional at the moment, for other reasons, so I'm not bothered if it's not always up. :-)
ricordisamoa, wikidipendenza and dewkin were converted without much hassle. However, ricordisamoa's and wikidipendenza's jobs seemed to restart when their service.manifest files were left behind. It seems that ricordisamoa's service.manifest had backend: gridengine in it while dewkin's service.manifest did not.
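A quick way to see whether a leftover service.manifest still names a backend could look like the sketch below. Only the `backend: gridengine` line comes from the observation above. The assumption that the manifest is a flat list of `key: value` lines, the helper name `manifest_backend`, and the `web: lighttpd` line in the usage comment are all hypothetical.

```python
def manifest_backend(text):
    """Return the value of a top-level 'backend:' line, or None if absent.

    Assumes simple 'key: value' lines; indented (nested) lines are ignored.
    """
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "backend" and not line[:1].isspace():
            return value.strip() or None
    return None


# Intended use (the path is a guess at where the manifest lives):
#   with open("/data/project/<toolname>/service.manifest") as fh:
#       print(manifest_backend(fh.read()))
# A manifest such as "backend: gridengine\nweb: lighttpd\n" would give
# "gridengine"; one without a backend line would give None.
```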