After some rewarding initial experiments with standard difference hashes and perceptual hashes, I'm creating this task to start a wider discussion on whether and how image hashes could be implemented globally on Commons.
The benefit of an image hash is that very near duplicates can be found, for example where the EXIF has changed, where images have been altered through saturation changes or other minor visual enhancements, or where the same image exists on Commons in different resolutions. A globally available and searchable image hash would help combat copyright violations, and a system for finding close image hashes would make it possible to cluster related images intelligently. For experiments in finding duplicates and close matches, refer to https://commons.wikimedia.org/wiki/User:Fae/Imagehash.
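To illustrate what "close" could mean here, the following is a minimal sketch that compares two 64-bit hashes by Hamming distance, the number of bit positions in which they differ. The function names, the sample hash values and the `max_distance` threshold are all assumptions made for illustration; a suitable threshold for Commons would have to come out of testing.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions in which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(hash_a: int, hash_b: int, max_distance: int = 4) -> bool:
    """Treat two hashes as a likely match when few bits differ.

    max_distance is an illustrative assumption, not a tested value.
    """
    return hamming_distance(hash_a, hash_b) <= max_distance


# Made-up hash values, not taken from real Commons files.
original = 0xF0E1D2C3B4A59687
resaved = original ^ 0b101  # the same hash with two bits flipped
unrelated = 0x0123456789ABCDEF

print(hamming_distance(original, resaved))
print(is_near_duplicate(original, resaved))
print(is_near_duplicate(original, unrelated))
```

Comparing by distance rather than by equality is what separates this from an SHA1 lookup: a cryptographic hash changes completely when a single byte changes, whereas visually similar images tend to produce hashes only a few bits apart.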
Using my local machine, it takes around 2 seconds to both download a 320px thumbnail and create the 64-bit hash. When run on the servers it should be an order of magnitude faster, making it realistic to generate hashes in real time, at the same point that the SHA1 values are created for Commons files.
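As a rough sketch of how a 64-bit difference hash can be built, the example below assumes the thumbnail has already been downloaded, converted to grayscale and shrunk to a grid 9 pixels wide and 8 pixels high; that step would normally be done with an imaging library and is left out here. The `dhash` function and the synthetic grids are hypothetical stand-ins, not the code used in the experiments.

```python
from typing import Sequence


def dhash(pixels: Sequence[Sequence[int]]) -> int:
    """Build a difference hash from rows of grayscale values.

    Each row of width w contributes w - 1 bits: a bit is 1 when a pixel
    is brighter than its right-hand neighbour. A grid of 8 rows by
    9 columns therefore yields a 64-bit value.
    """
    value = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            value = (value << 1) | (1 if left > right else 0)
    return value


# Synthetic 8-row, 9-column grid standing in for a downscaled thumbnail.
grid = [[(x * 37 + y * 11) % 200 for x in range(9)] for y in range(8)]

# The same grid with every pixel brightened by the same amount.
brighter = [[p + 20 for p in row] for row in grid]

print(f"{dhash(grid):016x}")
print(dhash(grid) == dhash(brighter))
```

Because only the relative brightness of neighbouring pixels is recorded, a uniform brightness shift leaves the hash unchanged in this sketch. Real edits and resampling are less tidy and would usually flip a few bits, which is why matching is done on distance rather than equality.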
There are implementation questions, such as which hash(es) to implement based on their benefits, and whether the standard 64-bit hash would be sufficient. Making the best choices would need more experiments and testing. This wider project should be a WMF-supported venture: though unpaid volunteers can do some interesting things, a comprehensive project needs coordination and a modest amount of investment to get right.