Sun, Mar 15
Thu, Mar 12
I am using the latest sseclient. I assume that it isn't "real", as event streams shouldn't be 504ing constantly? That wouldn't make sense and would mean that recent changes should be broken, etc. This script connects to it through pywikibot's site_rc_listener().
Wed, Mar 11
This is happening only a few seconds after I start the script.
Going to resolve this then, given that it appears unavoidable. At least now that it throws an exception, it can be handled (i.e. skip to the next item in the queue).
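A minimal sketch of that skip-and-continue handling, not the actual worker code. `TitleError` and `process_title` are hypothetical stand-ins (the real exception type is not named here), and an in-memory deque stands in for the redis queue.

```python
import logging
from collections import deque

logger = logging.getLogger("rcworker_sketch")


class TitleError(Exception):
    """Hypothetical stand-in for the exception that is now raised."""


def process_title(title):
    """Hypothetical processing step; rejects blank titles for illustration."""
    if not title.strip():
        raise TitleError(f"cannot build a page from {title!r}")
    return title.strip()


def drain(work_queue):
    """Process every queued title, skipping items that raise TitleError."""
    processed, skipped = [], []
    while work_queue:
        title = work_queue.popleft()
        try:
            processed.append(process_title(title))
        except TitleError:
            # Log the failure and move on to the next item in the queue.
            logger.exception("Skipping %r", title)
            skipped.append(title)
    return processed, skipped
```

Catching only the specific exception keeps unrelated bugs visible instead of silently swallowing them.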
Sun, Mar 8
Fri, Mar 6
Thu, Mar 5
Mar 2 2020
I just discovered that rcwatcher.py crashed at some point within the past couple of days. Interesting.
Traceback (most recent call last):
  File "rcwatcher.py", line 65, in <module>
    main()
  File "rcwatcher.py", line 57, in main
    run_watcher()
  File "rcwatcher.py", line 41, in run_watcher
    for change in rc:
  File "/usr/local/lib/python3.8/site-packages/pywikibot/comms/eventstreams.py", line 291, in __iter__
    self.source = EventSource(**self.sse_kwargs)
  File "/home/thesanddoctor/sseclient/sseclient.py", line 48, in __init__
    self._connect()
  File "/home/thesanddoctor/sseclient/sseclient.py", line 63, in _connect
    self.resp.raise_for_status()
  File "/home/ccc/.local/lib/python3.8/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://stream.wikimedia.org/v2/stream/recentchange
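One possible way to keep the watcher alive through a 429 like the one above is to retry the connection with exponential backoff. This is only a sketch under assumptions: `StreamHTTPError` is a hypothetical stand-in for requests.exceptions.HTTPError (where the status code would be read from `err.response.status_code` instead), `connect` is whatever callable opens the stream, and the retry limits are arbitrary.

```python
import time


class StreamHTTPError(Exception):
    """Hypothetical stand-in for requests.exceptions.HTTPError."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


# Status codes treated as temporary (an assumption, not a library setting).
RETRYABLE = {429, 504}


def connect_with_backoff(connect, max_attempts=5, base_delay=1.0,
                         sleep=time.sleep):
    """Call connect() until it succeeds, backing off on 429/504 errors."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except StreamHTTPError as err:
            last_attempt = attempt == max_attempts - 1
            if err.status_code not in RETRYABLE or last_attempt:
                raise  # permanent error, or out of retries
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable only so the delays can be checked without actually waiting.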
Feb 28 2020
@Dvorapa I saw that myself & just updated my above comment prior to seeing your response. It appears to have slipped through; I will have to add a catch for that. Others still relevant though.
Since updating to the latest master version of sseclient (post-fix merges) more of the workers crashed than usual (4 of the 5). 3 of the 4 crashes were due to the same issue.
Feb 26 2020
@zhuyifei1999 requests has been updated & the workers/feeder all restarted. I have re-started test_rc.py and will post back here if anything crashes. If it is good in a few days/week or something like that I think we could consider this resolved. Thanks for your help so far!
Feb 25 2020
@zhuyifei1999 Yes, first and third are two separate customers. The first and second are working with the same customer. The third (code) is just a plain/direct printing of the file name straight from recent changes listener and trying to turn it into a file page (until it fails).
@zhuyifei1999 the first and second traceback are from "production" worker instances and pop items off of the same redis queue (all fed by a single instance of rcwatcher.py), thus they wouldn't get the same image. So it isn't feasible that they would crash all at once. They basically get images first-come, first-serve from recent changes.
@Dvorapa python 3(.8)
@Dvorapa all encoding is set to UTF-8.
@zhuyifei1999 For both of these, it is worth noting that I have not updated to the latest version with the change in behaviour that this task merged.
Feb 24 2020
Thanks @Yaron_Koren !
Feb 23 2020
@zhuyifei1999 Unknown at this point. Implemented and running alongside it now. If/when either it or any of the 5 workers crashes, I will report back here.
@zhuyifei1999 The only log currently available is as follows (and linked above):
Feb 22 2020
@Dvorapa Commons. The issue appears to happen at random. I have improved the ordering of my logs so next time it happens it should hopefully actually tell me the file name at issue (configured to log the file name before trying to make a FilePage object out of it, hopefully it will do that before crashing). Given that the files are only run from recent changes if they are new uploads, this isn't something easily repeatable and does appear to happen at random. I will update here when I have more logs. Thank you for your patch to make the exception catchable.
@Dvorapa It grabs the file from recent changes using site_rc_listener (script, ImageObj) and then sends it to rcworker (linked above) using redis. rcworker then creates a pwb FilePage object out of the title from the recent changes log and processes the file. Is site_rc_listener what must be giving it the invalid image titles? Something just doesn't add up here for me: it doesn't make sense that the script is being given invalid image titles by pwb's site_rc_listener.
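A rough sketch of that feeder/worker split, with the raw title recorded before any page object is built so that a crash still leaves the offending name in the log. Everything here is a stand-in: `queue.Queue` replaces redis, `FakeFilePage` is a hypothetical substitute for a pwb FilePage, and the `log_type`/`title` keys are assumed event fields rather than the real script's data model.

```python
import logging
import queue

logger = logging.getLogger("rc_pipeline_sketch")


class FakeFilePage:
    """Hypothetical stand-in for a pwb FilePage."""

    def __init__(self, title):
        if not title.startswith("File:"):
            raise ValueError(f"not a file title: {title!r}")
        self.title = title


def feed(changes, work_queue):
    """Feeder side: push the title of each new upload onto the queue."""
    for change in changes:
        if change.get("log_type") == "upload":
            work_queue.put(change["title"])


def work(work_queue, seen):
    """Worker side: record each raw title, then try to build the page."""
    pages = []
    while not work_queue.empty():
        title = work_queue.get()
        seen.append(title)  # recorded before construction can fail
        logger.info("Got title %r", title)
        pages.append(FakeFilePage(title))
    return pages
```

Because the title is recorded first, the last entry in `seen` (or in the log) identifies the title that triggered the failure.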
Merged. Thanks @Tgr !
With @Tgr 's help, a new patch set has been uploaded that is functional. Just awaiting review.
@Dvorapa But what would cause it to return it then when looking at images? I am sort of confused here.
@Yaron_Koren could you please take a look?
Feb 21 2020
@zhuyifei1999 Do you think that such a raise could be made? The problem that I see with both handlings though is that the titles are not "invalid" as they are the valid image titles on the wiki(?). I am also having this issue when it comes to my Commons Corruption Checking task.
Feb 16 2020
Feb 7 2020
Feb 6 2020
Jan 31 2020
@DannyS712 is there anything further needing doing here (specifically this subtask) or is this good to close?
Jan 26 2020
Stopped mine manually. Looks like it is down here too. British Columbia if it matters.
traceroute en.wikipedia.org
traceroute to dyna.wikimedia.org (22.214.171.124), 64 hops max, 52 byte packets
 1 192.168.1.254 (192.168.1.254) 5.488 ms 5.123 ms 1.664 ms
 2 10.31.128.1 (10.31.128.1) 1091.086 ms 910.696 ms 982.800 ms
 3 126.96.36.199 (188.8.131.52) 1120.761 ms 686.399 ms 999.062 ms
 4 184.108.40.206 (220.127.116.11) 1003.838 ms 63.769 ms 1001.126 ms
 5 ae7.cs2.sea1.us.zip.zayo.com (18.104.22.168) 999.608 ms 133.960 ms 908.272 ms
 6 ae3.cs2.sjc2.us.eth.zayo.com (22.214.171.124) 940.233 ms 980.503 ms 1124.357 ms
 7 ae27.cr2.sjc2.us.zip.zayo.com (126.96.36.199) 151.749 ms 36.743 ms 139.132 ms
 8 ae11.mpr4.sfo3.us.zip.zayo.com (188.8.131.52) 479.827 ms 938.800 ms 1046.789 ms
 9 * * *
10 * * *
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
31 * * *
32 * * *
33 * * *
34 * * *
35 * * *
36 * * *
37 * * *
38 * * *
39 * * *
40 * * *
41 * * *
42 * * *
43 * * *
44 * * *
45 * * *
46 * * *
47 * * *
48 * * *
49 * * *
50 * * *
Jan 20 2020
Jan 19 2020
Jan 17 2020
Oops. Didn't mean to reopen.
@Masumrezarock100 Just noticed that you moved this on my workboard. Thanks!! :)
Jan 16 2020
My apologies for the delayed response; I was offline most of the day (internet troubles and a snowstorm). I figured that this was quite a simple addition and did not realize that on-wiki consensus would be needed in order to implement it (I have never made nor handled such a request before). It would not negatively affect anyone at present.
Jan 15 2020
Jan 12 2020
Surprised I didn't close this sooner. Thanks @Peachey88 for updating the projects (which reminded me).
Jan 10 2020
Jan 6 2020
@Pwirth you're welcome! :)
Jan 4 2020
@Pwirth done. Apologies for the delay (timezones).
@Sophivorus thanks! I will make corrections to the others. I was going off of the documentation in the parent task, but shall make the patches.
Jan 2 2020
@Yaron_Koren Would you possibly be able to review this commit? Thanks!
@Pastakhov Could you please take a look at this patch? Thanks!
@Sophivorus Could you please take a look at this? Thanks!
Jan 1 2020
Approved and merge started.