Aug 7 2023
Thank you, @Krinkle. I'll look into this and let you know if I have questions!
Jun 6 2023
Is this still in progress?
May 18 2023
May 17 2023
May 3 2023
Currently, there isn’t any way for you to track the progress of the dumps because we don’t produce them locally; we fetch them via an API from WME. At the time of the run, WME gives us access to a list of wikis and namespaces. This information is retrieved on the fly rather than from a static list, so it could change on the next run.
May 2 2023
The 2023-04-20 run was not completed because of a token refresh failure we experienced (see https://phabricator.wikimedia.org/T335368). The current run is still in progress, so we can’t yet say whether any files are missing; we’ll have to wait for the dumps to complete.
Apr 25 2023
Apr 20 2023
We saw this when the job that dumps metadata for all revisions was running for this wiki. This is the only wiki impacted so far.
Mar 13 2023
One way to do this might be to write a script that adds checksums for dumps created this month, and to run that script daily for new dump files in some directory trees. Most dumps in https://dumps.wikimedia.org/other belong to other teams and are not maintained by us (Platform Engineering). They are also produced by different scripts, some on different hosts.
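A minimal sketch of what such a daily script could look like is below. It is only an illustration under stated assumptions, not the script we run: the function names, the `.sha256` suffix, the 31-day window, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import time
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_missing_checksums(root, max_age_days=31, suffix=".sha256"):
    """Write a checksum file next to each recent file that lacks one.

    Only files modified within the last `max_age_days` days are considered.
    The suffix and the age window are assumptions made for this sketch.
    Returns the list of checksum files created.
    """
    cutoff = time.time() - max_age_days * 86400
    created = []
    # sorted() materialises the listing before any new files are written.
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and checksum files themselves.
        if not path.is_file() or path.name.endswith(suffix):
            continue
        # Skip files older than the window.
        if path.stat().st_mtime < cutoff:
            continue
        checksum_path = path.with_name(path.name + suffix)
        # Skip files that already have a checksum, so daily reruns are cheap.
        if checksum_path.exists():
            continue
        checksum_path.write_text(f"{sha256_of(path)}  {path.name}\n")
        created.append(checksum_path)
    return created
```

Because files that already have a checksum are skipped, running this daily would only hash newly added dump files. It would still need to be pointed at each directory tree separately, since the dumps are produced by different scripts on different hosts.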
Mar 6 2023
Hello @jbond, the errors result from the manual rsync of data we were running on the new dumpsdata hosts. You might see them on dumpsdata1007, which started its rsync today; the rest are completed. The errors only occur during the rsync, and they aren’t serious. Please ignore them :-)
Mar 3 2023
Closing this because the run took four days to complete, which is better than a week!
Mar 1 2023
Feb 17 2023
Hello @EBernhardson, the dumps were completed within a few days, and the resources used on the host are within limits. This is better and faster than when they took more than a week to complete. Thank you!
The dumps run is done and we didn’t get any errors. Thank you @Ladsgroup
Feb 16 2023
Hello @hashar, are you still seeing these errors, and when did they start?
Feb 15 2023
Hello, just checking in on the status of the OS installation on dumpsdata1006. When will it be ready for use?
So far, we haven’t had any errors; we'll be certain when the full run completes in a day or two, and we can update the task by then.
Feb 13 2023
We'll be closing this now because everything looks okay, and we didn’t get any errors over the weekend, but we are not sure what fixed this. It could be the last update @Rgaudin made by increasing the max connection or specifying a dedicated lock file for our module.
Feb 10 2023
Hello @Rgaudin, we haven’t had any errors since the last one, which is a good thing :-) Let's see how things are over the weekend, and thanks for helping with this.
Yeah, we still see them, and we got this on Feb 2 at 8:40 am UTC.
We got an error yesterday, 9th Feb, from clouddumps1002 at 5 pm UTC.
The error we got was:
Error while running rsync, check the logs...
Please can we get the logs around the time we got these errors?
Feb 9 2023
Hello @Rgaudin, we have a slightly different log from Feb 6 at 12:15 to Feb 8 at 08:15, and it looks like most of the missing parts are the runs where we got an error. Could we please get the missing logs, where there may have been a connection error or other errors?
Feb 8 2023
@ArielGlenn Based on the memory and CPU usage during the last run (please see the image above), we can decompress blocks using multiple threads: the last run used less than 25% CPU, and memory usage was also low. I think we are good to go!
The job starts on the 26th of every month, and the last run was completed on the 2nd of February. I have attached the memory and CPU usage before and after the job was completed.
Feb 3 2023
Hey WMCS folks, we've been getting the errors again after the last patch. We decided to do some testing on the script just after the loop completes and discovered some processes lingering around in TIME-WAIT and one in FIN-WAIT1. On the remote end, these may not be deemed open connections and so might have no effect on the connection problem.
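For anyone who wants to repeat that check, here is a rough sketch of how the lingering connections could be tallied by state. It is not the test we actually ran; it assumes the text comes from something like `ss -tan`, where the first column of each row is the TCP state, and the function name is made up for illustration.

```python
from collections import Counter


def count_tcp_states(ss_output):
    """Count connections per TCP state in `ss -tan`-style output.

    Assumes the first whitespace-separated field of each row is the state.
    """
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, whose first field is "State".
        if not fields or fields[0] == "State":
            continue
        counts[fields[0]] += 1
    return counts
```

On a live host the input could be captured with `subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout` right after the loop completes, and the counts compared between runs that fail and runs that succeed.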
Jan 25 2023
We’ve been seeing a lot of these recently.
Jan 9 2023
Dec 14 2022
Dec 5 2022
Oct 21 2022
In response to my previous comment on the runtime, the last run started on Oct 11th and ended on Oct 21st. It took 10 days in total to complete.
Oct 14 2022
Oct 12 2022
Oct 7 2022
Sep 15 2022
Aug 26 2022
Aug 16 2022
The first sql/xml dumps run (1st of every month) is a full run; this means it contains all historical data and takes longer to dump. Currently the wikidata dumps are still ongoing, and we expect them to complete tomorrow. This gives us a margin of a few days in case there are errors causing us to delay or rerun some parts.
Aug 5 2022
Jul 8 2022
Jun 29 2022
We have deployed a patch so that we'll get an email notification the next time something like this happens.
Jun 10 2022
Jun 8 2022
Hello Ryan, could someone on your team please make a Gerrit patch with the necessary changes so that we can +2 and deploy? Here is the page that has to be modified.
Apr 25 2022
Apr 12 2022
Apr 5 2022
Apr 1 2022
Mar 28 2022
Mar 21 2022
Mar 4 2022
Feb 25 2022