The first run of an rclone-based replacement for swiftrepl (cf. T299125) failed because of I/O errors. Specifically, it found 27,143 objects that appear in eqiad Swift container listings (and not in codfw container listings) but do not in fact exist: attempting to HEAD or GET them returns HTTP 404 (Not Found).
It seems likely that this is only a subset of the problematic objects: any such objects listed only in codfw, or in both codfw and eqiad, won't have been discovered.
It would be good to know:
- Can we delete the corresponding container entries? [so e.g. rclone will no longer error out, and so our container listings are accurate]
- Do (any of) these correspond to objects that MW still thinks exist? (A very limited sample of searching the wikis for filenames suggests not.)
  - If so, can we restore them from backup?
- Can we find what MW thinks happened to these objects?
- Are we making more such bad objects? If so, how/why? T289996 is probably relevant here
A small set of example object names and types is in P43158 (NDA). To produce a full list, I recorded rclone output to /home/mvernon/logoutput on ms-be1069. I verified that there are no object names with newlines in (with grep -c and wc), and then produced a list of the bad objects:
sed -ne 's/^.* ERROR : \(.*\): Failed to copy: failed to open source object: Object Not Found/\1/p' <logoutput >sadobjects
As expected, sadobjects has 27,143 lines. Since it might be useful, I've grouped the objects by top-level container (de-sharded) to show roughly how they are distributed:
mvernon@ms-be1069:~$ cut -f 1 -d '/' sadobjects | sed -e 's/\...$//' | sort | uniq -c | sort -bgr
  16297 wikipedia-commons-local-public
   3265 wikipedia-en-local-public
   1165 wikipedia-it-local-public
   1075 wikipedia-ja-local-public
    989 wikipedia-az-local-public
    905 wikipedia-commons-local-transcoded
    605 wikipedia-ru-local-public
    502 wikipedia-bn-local-public
    312 wikipedia-uk-local-public
    312 wikipedia-de-local-public
    293 wikipedia-id-local-public
    251 wikipedia-fr-local-public
    134 wikipedia-commons-local-deleted
    125 wikipedia-ko-local-public
    120 wikipedia-zh-local-public
    113 wikipedia-sr-local-public
     77 wikipedia-pnb-local-public
     69 wikipedia-hu-local-public
     68 wikipedia-th-local-public
     63 wikipedia-tr-local-public
     57 wikipedia-lv-local-public
     52 wikipedia-fi-local-public
     45 wikipedia-ca-local-public
     39 wikipedia-he-local-public
     22 wikipedia-en-local-transcoded
     21 wikipedia-ro-local-public
     16 wikipedia-ru-local-transcoded
     13 wikipedia-sh-local-public
     12 wikiquote-hu-local-public
     12 wikipedia-it-local-transcoded
     10 wikivoyage-zh-local-public
     10 wikipedia-test-local-public
      8 wikipedia-ta-local-transcoded
      8 wikipedia-pt-local-deleted
      8 wikipedia-ka-local-public
      7 wikipedia-hy-local-public
      5 wikipedia-hr-local-public
      4 wikipedia-bcl-local-public
      4 wikimedia-id-internal-local-public
      4 wikibooks-si-local-public
      3 wikisource-fr-local-public
      3 wikiquote-ja-local-public
      3 wikipedia-th-local-transcoded
      3 wikipedia-jv-local-public
      3 wikipedia-de-local-deleted
      3 wikipedia-commons-gwtoolset-metadata
      3 wikipedia-ar-local-public
      2 wikiversity-en-local-public
      2 wikipedia-wa-local-public
      2 wikipedia-id-local-deleted
      2 wikipedia-hi-local-public
      2 wikipedia-eo-local-public
      2 wikipedia-en-local-deleted
      2 wikipedia-az-local-deleted
      1 wikisource-jv-local-public
      1 wikisource-it-local-public
      1 wikisource-es-local-public
      1 wikisource-bn-local-public
      1 wikiquote-it-local-public
      1 wikipedia-wuu-local-public
      1 wikipedia-uk-local-transcoded
      1 wikipedia-mr-local-public
      1 wikipedia-lb-local-public
      1 wikipedia-kk-local-public
      1 wikipedia-ca-local-deleted
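For anyone wanting to sanity-check the two pipelines above without access to the real log, here is a minimal sketch that runs them end to end on a tiny sample. The timestamps, object names and shard suffixes in sample_log are invented for illustration and only assume the log-line shape that the sed expression matches; the function names extract_missing and deshard_counts are likewise hypothetical wrappers around the commands quoted above.

```shell
# Hypothetical rclone log lines; timestamps and object names are invented.
sample_log='2022/12/14 10:00:01 ERROR : wikipedia-commons-local-public.a3/a/a3/One.jpg: Failed to copy: failed to open source object: Object Not Found
2022/12/14 10:00:02 INFO  : wikipedia-en-local-public.1f/1/1f/Fine.png: Copied (new)
2022/12/14 10:00:03 ERROR : wikipedia-commons-local-public.0f/0/0f/Two.jpg: Failed to copy: failed to open source object: Object Not Found
2022/12/14 10:00:04 ERROR : wikipedia-en-local-public.1f/1/1f/Three.png: Failed to copy: failed to open source object: Object Not Found'

# Keep only the object path from "Object Not Found" errors;
# lines that do not match are dropped by -n.
extract_missing() {
  sed -ne 's/^.* ERROR : \(.*\): Failed to copy: failed to open source object: Object Not Found/\1/p'
}

# Take the container (first path component), strip a trailing
# dot-plus-two-characters shard suffix, then count objects per
# container with the largest count first.
deshard_counts() {
  cut -f 1 -d '/' | sed -e 's/\...$//' | sort | uniq -c | sort -bgr
}

printf '%s\n' "$sample_log" | extract_missing | deshard_counts
```

On this sample, the INFO line is ignored, the two commons shards collapse into a single wikipedia-commons-local-public entry with a count of 2, and wikipedia-en-local-public is listed with a count of 1. Containers without a shard suffix would only be left unchanged if their names do not happen to end in a dot followed by two characters.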