User Details
- User Since: Oct 17 2014, 11:21 AM (183 w, 1 d)
- Availability: Available
- IRC Nick: annika
- LDAP User: Gifti
- MediaWiki User: Giftpflanze
Jan 22 2018
You're right, it seems that we don't need so much RAM at the moment. It was necessary in the past, though; otherwise the instance would have been really slow. I think we could first try with an xlarge instance.
Jan 20 2018
Aug 8 2017
I went through my table creation statements and added engine statements. I dropped four tables that I want to recreate and changed the engine of the remaining to InnoDB. Thank you for your prompt reaction!
Mar 6 2017
< yuvipanda> !log tools set complex_values slots=300,release=trusty for tools-exec-gift-trusty-01.tools.eqiad.wmflabs
This seems to have done the trick, thank you!
@chasemp The old instance can be cleaned up now.
Feb 26 2017
Killing the Precise instance (and defaulting to Trusty) shouldn't pose a problem. I switched my code to Trusty (I didn't even know anymore that I had to hardcode that) and I will test it with the next regular run starting on the 1st of March. (Also, I'm kinda swamped with school stuff but I'm glad to have found a minute for this.)
Feb 17 2017
Feb 15 2017
< annika> chasemp: […] I guess you mean: You will provide a trusty exec node with an appropriate queue? And I will just use that instead of the current one?
< chasemp> annika: basically, a second trusty node in the same queue for you to migrate to
< annika> that would be fine
Jan 25 2017
So, we deleted the old instance before creating the new one anyway because we wanted to keep the name. You can now readjust the quota.
Jan 19 2017
Jan 18 2017
Considering my earlier comment (bigram instance), the needed numbers would be 1+8=9 cores and 2+36=38GB. But otherwise you're right. And I actually planned to do the transition without additional temporary resources. But it surely would give us more security.
I'm not aware of any script errors. Maybe there are db queries or other things that cause it to hang.
Dec 19 2016
We need more RAM for data processing. We are not aware of other possibilities. Afais, "bigram" has even more RAM, so we'd like to have enough room for 1 small and 1 bigram instance.
Dec 5 2016
Aug 28 2016
Aug 6 2016
Seems to work again.
Jul 26 2016
Jul 5 2016
I do.
Jun 2 2016
@doctaxon apparently they do, but it seems the claim is unsubstantiated
Mar 31 2016
Mar 25 2016
Feb 27 2016
(Ya.)
I had no illusions of resolving the ticket with my summary alone (which should be clear if you read my comment carefully). I just wanted to document what's happening. But thank you for mentioning the root mail (which I couldn't remember).
DrTrigon added me to the service group, I will keep him updated about anything I will do regarding the tool.
I've got an answer from DrTrigon: It's been too long, he doesn't remember how to configure the tool. He would like the admins to add me. Because that's still impossible (yay), I reminded him of the process of adding me as a maintainer. I also suggested that he reply in this ticket (which I hope will suffice).
Feb 25 2016
Feb 24 2016
Because there's no policy in place yet, per @Andrew, I tried and sent an e-mail to DrTrigon, who hopefully will respond.
Resolved for both of us.
Feb 19 2016
Feb 14 2016
Feb 7 2016
Jan 29 2016
There are also URLs that don't redirect to the domain root but show the same content as the domain root default document.
Jan 26 2016
Especially interesting portions are:
dwllib.tcl, proc check_curl*: curl -gLksm200 -A $user_agent -w %{http_code} -o /dev/null $url
List of good status codes: 200 201 202 206 226 229 301 302 304 401 412 503 507 999 (and the next 4 lines)
The file "exceptions" contains Tcl [string match] URL patterns, mostly of sites that don't want to be tested/crawled (they may have a Tool Labs block). This means that you cannot determine whether they're dead or not. (I test with a throttling of 1 second; there may be other methods that prevent blocking.)
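To illustrate how the status code list and the "exceptions" file fit together, here is a minimal Python sketch (not the actual Tcl code from dwllib.tcl). The function name `classify`, the three result labels and the example pattern are made up for illustration. Python's `fnmatchcase` only approximates Tcl's [string match]: both support `*`, `?` and `[...]`, but backslash escapes are not handled the same way.

```python
from fnmatch import fnmatchcase

# Status codes listed above as "good" (the link is treated as alive).
GOOD_STATUS_CODES = {200, 201, 202, 206, 226, 229, 301, 302, 304,
                     401, 412, 503, 507, 999}


def classify(url, status_code, exception_patterns):
    """Return 'skipped', 'alive' or 'dead' for one checked URL.

    Hypothetical helper: the names and labels are assumptions,
    not taken from the original tool.
    """
    # URLs matching an exception pattern cannot be judged either way.
    if any(fnmatchcase(url, pattern) for pattern in exception_patterns):
        return "skipped"
    return "alive" if status_code in GOOD_STATUS_CODES else "dead"
```

For example, `classify("http://example.org/a", 404, [])` yields "dead", while any URL matching a pattern such as `*blocked.example/*` is reported as "skipped" regardless of its status code.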
Jan 24 2016
Jan 20 2016
Well, you could add a check for whether the API response is valid JSON. If not, you should repeat the request after a reasonable delay.
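A rough sketch of such a check in Python, assuming the request itself is wrapped in a callable that returns the response body as a string. The names `fetch_json` and `fetch`, as well as the retry count and delay, are placeholders and not part of any existing tool:

```python
import json
import time


def fetch_json(fetch, retries=3, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch() until its result parses as JSON, waiting between tries.

    `fetch` is a hypothetical callable returning the response body (str).
    """
    for attempt in range(retries):
        body = fetch()
        try:
            return json.loads(body)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            # Invalid JSON (e.g. an HTML error page): wait, then try again.
            if attempt < retries - 1:
                sleep(delay_seconds)
    raise RuntimeError("no valid JSON after %d attempts" % retries)
```

Passing `sleep` as a parameter is only there so that the waiting can be replaced in tests; in normal use the default `time.sleep` applies.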
Jan 19 2016
This error originates from the Wikimedia API.
Jan 18 2016
Jan 17 2016
Jan 15 2016
2016-01-01 00:00 UTC and 2016-01-15 00:00 UTC: 0 0 1,15 * * jlocal ./dwla.sh
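For reference, the crontab fields `0 0 1,15 * *` mean minute 0, hour 0, on days 1 and 15 of every month. A small Python sketch that lists the upcoming start times for exactly this schedule, assuming naive datetimes are interpreted as UTC (the function name `next_runs` is made up, and this is not a general cron parser):

```python
from datetime import datetime, timedelta


def next_runs(after, count):
    """Return the next `count` start times for the schedule `0 0 1,15 * *`.

    `after` is a naive datetime, assumed to be in UTC.
    """
    runs = []
    # Start at midnight of the given day and walk forward day by day.
    day = after.replace(hour=0, minute=0, second=0, microsecond=0)
    while len(runs) < count:
        if day.day in (1, 15) and day > after:
            runs.append(day)
        day += timedelta(days=1)
    return runs
```

Calling `next_runs(datetime(2015, 12, 31, 12, 0), 2)` gives the two start times named above, 2016-01-01 00:00 and 2016-01-15 00:00.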
Jan 10 2016
I have had the same problem (tools.giftbot):
Dec 3 2015
Nov 27 2015
Nov 25 2015
mysqlsel/db server: Lost connection to MySQL server during query while executing "mysqlsel $dewiki_p "select el_to from externallinks where el_from=$pageid" -list"
Nov 24 2015
The last run of my job started at 24/11/2015 21:15 UTC and ended at 22:43 (then I got the error that I lost connection). The code then tries to reconnect in a loop, then I get the error can't connect to mysql. It connects just fine when I (update the data file and) restart the job.
Nov 23 2015
Nov 22 2015
Nov 21 2015
Nov 20 2015
Please make the form respect the settings (e.g. 30 days) when applying form changes.
Nov 8 2015
Sep 18 2015
https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/Cyberbot_II_5: "Future feature requests, such as detecting unmarked dead links, should be made under a subsequent BRFA." From that I conclude that the workload won't overlap with my efforts. What is consuming the resources is probably keeping all data in memory instead of writing it to disk (which should be preferable, I guess?). One point that concerns me is that the bot traverses the complete article namespace via the API (for template transclusions and article text), which would be better done via the DB and/or dumps.
Sep 16 2015
- giftbot
- tcl
- webservice and 5 continuous jobs
- yes
- no
- no
Aug 30 2015
Aug 20 2015
Aug 18 2015
Aug 2 2015
error:
mysqlreceive/db server: Lost connection to MySQL server during query while executing "mysqlreceive [set db [get_db dewiki]] {select distinct el_to from externallinks join page on el_from = page_id where page_namespace = 0} url { if [ca..." (file "/data/project/giftbot/dwl1a.tcl" line 18)
Jul 31 2015
Jul 27 2015
Jul 6 2015
The instance is now broken. You can't log in: when it tries to create the home directory, it fails.
Jul 4 2015
Do it.
Jun 22 2015
Jun 21 2015
Yes, I can.
Jun 20 2015
May 14 2015
tools-exec-gift can be rebooted when there aren't any jobs running (a cascade runs for some days starting every 1st and 15th of the month). The best time to reboot it now would be before 2015-05-15T00:00 (UTC). No files in /tmp. I do not expect any issues with rebuilding the instances (but you never know; I also think there is some special SGE configuration (?) that would have to be retained).
Apr 14 2015
Apr 11 2015
The change still seems to be in effect. That means that if I change my crontab, jlocal is prepended with jsub, which renders it unusable. Please effectively revert it.
Mar 30 2015
Mar 11 2015
Mar 2 2015
Feb 18 2015
Jan 23 2015
Jan 17 2015
Everything is fine, obviously. Closing.
Is this still an issue?