
all-titles-in-ns0 is missing from latest dump
Closed, Duplicate, Public

Description

Event Timeline

This is a repro of T413767.

TL;DR: the HTML page advertises the files as available before the rsync kicks in, so you get 404s.

The dumpstatus.json file does show the correct state:

curl -s https://dumps.wikimedia.org/enwiki/20260301/dumpstatus.json | jq . | grep -B 1 -A 5 pagetitles
    },
    "pagetitlesdump": {
      "status": "waiting",
      "updated": "",
      "files": {
        "enwiki-20260301-all-titles-in-ns0.gz": {}
      }
--
    },
    "allpagetitlesdump": {
      "status": "waiting",
      "updated": "",
      "files": {
        "enwiki-20260301-all-titles.gz": {}
      }