Migrate filearchive's fa_storage_key from SHA-1 to SHA-256
Open, Low, Public

Description

Deleted files are addressed in the filearchive table and in backend storage via a unique fa_storage_key derived from the SHA-1 hash of the file contents. Now that practical attacks can create SHA-1 collisions based on file prefixes, it would be possible to craft sets of image files with identical SHA-1 hashes, upload them, and then create confusion as to which file comes back after deletion and undeletion.

Recommended update:

  • add an fa_sha256 column and store the SHA-256 value in it
    • add an index on fa_sha256, truncated to 10 characters like the sha1 index (see also T51190)
  • add a SHA-256 hash option to the ArchivedFile object
  • use the fa_sha256 value in the fa_storage_key of newly deleted files
  • provide duplicate file lookup via SHA-256 instead of SHA-1 (cf. T74070, ApiArchiveFile)
  • add comments marking old SHA-1 assumptions, such as in LocalRepo::getHashFromKey
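
A minimal sketch of what a hash-derived storage key could look like under either algorithm, to make the key-length difference concrete. This is Python for illustration only, not MediaWiki code. It assumes the key has the form "<base-36 hash>.<extension>" with the hash zero-padded to a fixed width; the function names (to_base36, storage_key, hash_from_key) are hypothetical and do not correspond to existing methods.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

# Fixed widths needed to hold any digest of that size in base 36
# (160 bits fit in 31 digits, 256 bits fit in 50 digits).
BASE36_WIDTH = {"sha1": 31, "sha256": 50}


def to_base36(value: int, width: int) -> str:
    """Encode a non-negative integer in base 36, zero-padded to `width`."""
    digits = []
    while value:
        value, rem = divmod(value, 36)
        digits.append(BASE36_DIGITS[rem])
    return "".join(reversed(digits)).rjust(width, "0")


def storage_key(content: bytes, extension: str, algo: str = "sha256") -> str:
    """Build a key of the assumed form '<base-36 hash>.<extension>'."""
    digest = hashlib.new(algo, content).digest()
    hash_part = to_base36(int.from_bytes(digest, "big"), BASE36_WIDTH[algo])
    return f"{hash_part}.{extension}"


def hash_from_key(key: str) -> str:
    """Recover the hash part of a key; the caller must know which algorithm
    produced it, for example from its length (31 vs. 50 characters here)."""
    return key.split(".", 1)[0]
```

Under these assumptions a SHA-256 key is 19 characters longer than a SHA-1 key, so any code that infers the hash from a fixed key length, and the width of the key column itself, would need checking before the switch.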

Event Timeline

brion created this task. Feb 24 2017, 7:10 PM
Restricted Application added projects: Multimedia, Commons. Feb 24 2017, 7:10 PM
Ltrlg added a subscriber: Ltrlg. Feb 25 2017, 12:07 AM
Reedy moved this task from Unsorted to Add / Create on the Schema-change board. Apr 26 2017, 2:58 PM
Reedy updated the task description. Apr 26 2017, 3:30 PM
MarkTraceur triaged this task as Low priority. Jul 10 2017, 3:21 PM
MarkTraceur moved this task from Untriaged to Triaged on the Multimedia board.