Hello,
In https://dumps.wikimedia.org/wikidatawiki/entities/ , this week's directory (20250630) is empty.
Could you check why, so that next week's dump has a chance of being generated?
Thanks
| Melderick | |
| Jul 5 2025, 3:53 PM |
@Melderick: Thanks for reporting this. For future reference, please use the bug report form (linked from the top of the task creation page) to create a bug report. Thanks!
Hello,
Apologies for this. The dumps have been created, but they have been inadvertently published to the wrong location.
This is related to our recent work on T352650: WE 5.4 KR - Hypothesis 5.4.4 - Q3 FY24/25 - Migrate current-generation dumps to run on kubernetes. The wikibase dumps were handled as part of T394389: Migrate the additional dump types from snapshot1016 to Airflow.
I believe that the wrong path was configured here: https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/test_k8s/dags/dumps/mediawiki_wikibase_dumps.py?ref_type=heads#L233
...and the result is that instead of your files appearing where you expected: https://dumps.wikimedia.org/wikidatawiki/entities/
...they are instead here: https://dumps.wikimedia.org/other/wikidatawiki/
My next step will be to validate this assumption and correct the destination path, then move the dump files to their correct location.
btullis opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1535
Dumps_v1: Add missing path element in the wikibase sync destination
(Turns out, image suggestions, and thus Growth and Apps, don't actually depend on Wikidata dumps anymore since T394757? Removing our tag again.)
The wikidatawiki/entities link works because of this symlink:
btullis@clouddumps1002:/srv/dumps/xmldatadumps/public$ ls -l wikidatawiki/
total 7700
drwxr-xr-x 2 dumpsgen dumpsgen 1236992 May 20 09:52 20250401
drwxr-xr-x 2 dumpsgen dumpsgen  208896 Jun  1 09:39 20250420
drwxr-xr-x 2 dumpsgen dumpsgen 1728512 Jun 21 09:34 20250501
drwxr-xr-x 2 dumpsgen dumpsgen  204800 May 26 02:35 20250520
drwxr-xr-x 2 dumpsgen dumpsgen 1777664 Jun 20 10:27 20250601
drwxr-xr-x 2 dumpsgen dumpsgen  217088 Jun 30 15:15 20250620
drwxr-xr-x 2 dumpsgen dumpsgen  180224 Jul  7 09:32 20250701
lrwxrwxrwx 1 root     root          30 Sep 22  2015 entities -> ../other/wikibase/wikidatawiki
drwxrwxr-x 2 dumpsgen dumpsgen 2306048 Jul  7 00:17 latest
We can see that the link points to ../other/wikibase/wikidatawiki and this supports my previous assumption that the path for generated files is incorrect. We need to add that missing wikibase/ path element.
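As a rough illustration of the mismatch, the following Python sketch resolves the symlink target from the listing above and compares it with the directory where the files actually landed. It assumes that https://dumps.wikimedia.org/ is served from /srv/dumps/xmldatadumps/public; the variable names are illustrative and are not taken from the Airflow DAG.

```python
import posixpath

# Assumption: this directory is what https://dumps.wikimedia.org/ serves.
PUBLIC_ROOT = "/srv/dumps/xmldatadumps/public"

# Symlink from the listing: wikidatawiki/entities -> ../other/wikibase/wikidatawiki
link_parent = posixpath.join(PUBLIC_ROOT, "wikidatawiki")
link_target = "../other/wikibase/wikidatawiki"

# Lexically resolve the relative target against the directory holding the link.
# (On a real filesystem, os.path.realpath would follow the link instead.)
expected_dir = posixpath.normpath(posixpath.join(link_parent, link_target))

# Where the files were published: https://dumps.wikimedia.org/other/wikidatawiki/
actual_dir = posixpath.join(PUBLIC_ROOT, "other", "wikidatawiki")

print("expected:", expected_dir)
print("actual:  ", actual_dir)
print("match:   ", expected_dir == actual_dir)
```

The two paths differ only by the wikibase/ element, which is why the entities link pointed at an empty directory.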
btullis merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1535
Dumps_v1: Add missing path element in the wikibase sync destination
OK, the code is fixed.
I have manually triggered a sync of the commonswiki wikibase dumps, which were affected by the same issue.
If this works as expected and fills up https://dumps.wikimedia.org/commonswiki/entities/20250630/ then I will do the same for the wikidatawiki entities.
The manual sync run has now completed.
I think that we can now mark this as resolved.
@BTullis Thank you for the quick fix.
I will keep an eye on this week's dumps. They usually start showing up by Wednesday evening.
Confirming that this week's dumps are present in https://dumps.wikimedia.org/wikidatawiki/entities/
The bzip2 one looks far too small, so I created a new bug: T399119
It's the same thing again this week: https://dumps.wikimedia.org/wikidatawiki/entities
The latest JSON dumps are there (latest-all.json.bz2 and latest-all.json.gz), but the NT and TTL dumps are still those from a week ago.
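One way to spot this kind of partial update is to compare the timestamps of the latest-* files and flag any that lag well behind the newest one. The sketch below is only an illustration: the function name, the two-day threshold, the NT and TTL file names, and all timestamps are assumptions made up for the example, not values read from the server.

```python
from datetime import datetime, timedelta


def stale_files(mtimes, max_lag=timedelta(days=2)):
    """Return the names whose timestamp lags the newest file by more than max_lag."""
    newest = max(mtimes.values())
    return sorted(name for name, ts in mtimes.items() if newest - ts > max_lag)


# Hypothetical timestamps for illustration only. The NT and TTL file names
# are assumed to follow the same latest-all.* pattern as the JSON dumps.
example = {
    "latest-all.json.bz2": datetime(2025, 7, 16, 20, 0),
    "latest-all.json.gz": datetime(2025, 7, 16, 18, 0),
    "latest-all.nt.bz2": datetime(2025, 7, 10, 22, 0),
    "latest-all.ttl.bz2": datetime(2025, 7, 10, 21, 0),
}

print(stale_files(example))
```

With real data, the timestamps could come from the directory listing or from the files' modification times on disk.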