Deleted files are addressed in the filearchive table and in backend storage via a unique fa_storage_key derived from the SHA-1 hash of the file contents. Given practical prefix-based collision attacks on SHA-1, it would be possible to craft sets of image files with identical SHA-1 hashes, upload them, and then create confusion as to which file comes back after deletion/undeletion.
- add fa_sha256 column and store SHA-256 value in there
- add an index on fa_sha256, truncated at 10 characters like the SHA-1 index (see also T51190)
- add sha256 hash option to ArchivedFile object
- use the fa_sha256 value in the fa_storage_key of newly deleted files
- provide duplicate file lookup via SHA-256 instead of SHA-1 (cf. T74070 -- ApiArchiveFile)
- add comments marking some old assumptions, such as those in LocalRepo::getHashFromKey
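
The changes listed above can be illustrated with a minimal sketch in Python (not the MediaWiki code itself). The function names, the `<hash>.<extension>` key layout, and the use of a hex digest are all assumptions made for illustration; the real storage key encoding and the behaviour of LocalRepo::getHashFromKey may differ.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as 64 hex characters."""
    return hashlib.sha256(data).hexdigest()


def storage_key(data: bytes, extension: str) -> str:
    """Build a hypothetical storage key of the form '<sha256>.<extension>'.

    Assumption: the key layout mirrors a '<hash>.<ext>' convention; the
    actual fa_storage_key encoding is not specified in this description.
    """
    return f"{sha256_hex(data)}.{extension.lower()}"


def hash_from_key(key: str) -> str:
    """Recover the hash portion of a key by splitting at the first dot.

    This is the kind of assumption worth marking with a comment: it only
    works while every key starts with a hash that contains no dot.
    """
    return key.split(".", 1)[0]


def find_duplicates(archive: dict, candidate: bytes) -> list:
    """Return names of archived files whose SHA-256 matches the candidate.

    'archive' is a hypothetical in-memory stand-in for the filearchive
    table, mapping archived file names to their contents.
    """
    wanted = sha256_hex(candidate)
    return sorted(name for name, data in archive.items()
                  if sha256_hex(data) == wanted)


if __name__ == "__main__":
    archive = {"A.png": b"same bytes", "B.png": b"same bytes", "C.png": b"other"}
    key = storage_key(b"same bytes", "PNG")
    print(key)
    print(hash_from_key(key) == sha256_hex(b"same bytes"))
    print(find_duplicates(archive, b"same bytes"))
```

Because the key embeds the full SHA-256 digest, two files can share a key only if their SHA-256 values collide, which is the property the SHA-1 based key no longer reliably provides.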