File extension letter case treatment should be unified on Commons
Closed, Resolved, Public

Description

Currently, the UploadWizard accepts an upper-case file name extension (e.g. .JPG) and neither changes it nor lets the user change it (see bug 34703). As a result, it is possible to upload two files whose names differ only in extension case, one ending in .jpg and the other in .JPG.

However, elsewhere on Commons, such as in the RenameLink gadget or the tool used by filemovers, .jpg and .JPG are considered the same, and the tools won't allow renaming between these two variants. This makes it impossible to restore consistency when a series of pictures has been uploaded with one case and only one or a few of them accidentally use the other. All these tools seem to share a common "file name sanitization" module, as reported at http://commons.wikimedia.org/wiki/User_talk:Rillke#RenameLink_forces_case_in_file_extension

As evidenced at https://commons.wikimedia.org/wiki/User_talk:Blahma#File:Brno.2C_T.C3.A1bor_15.jpg, file movers need to rename files manually if they are asked to change the letter case of a file extension, such as renaming "foo.JPG" to "foo.jpg".

File names should be treated equally everywhere on Commons, so if file name extensions get "normalized" by file moving tools, the same "sanitization" should be performed in the UploadWizard.


Version: unspecified
Severity: normal

bzimport set Reference to bz40326.
Blahma created this task. Via Legacy, Sep 18 2012, 10:09 AM
bzimport added a comment. Via Conduit, Sep 18 2012, 12:43 PM

Bryan.TongMinh wrote:

There are some checks in place, but they only work if the existing file has the normalized extension, i.e., a warning will be given if you try to upload X.JPG and X.jpg exists, but not the other way around.

It is probably a good idea to just give a warning if the same filename exists with any extension. That would need support in FileRepo to search files without extension. I'll have a look at it.
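A minimal sketch of the symmetric check proposed above, written in Python for illustration only. It is not the FileRepo implementation: `find_extension_collisions` and the in-memory list of existing names are hypothetical stand-ins for a repository lookup that matches on the name without its extension.

```python
import os


def find_extension_collisions(new_name, existing_names):
    """Return existing names that share new_name's base name.

    Hypothetical helper: the extension is ignored entirely, so X.JPG,
    X.jpg and X.JPEG all collide with each other in either direction.
    An exact match is skipped, on the assumption that overwriting an
    identical name is handled by a separate check.
    """
    new_base, _ = os.path.splitext(new_name)
    return [
        name
        for name in existing_names
        if name != new_name and os.path.splitext(name)[0] == new_base
    ]


existing = ["X.jpg", "Y.png", "X.JPEG"]
collisions = find_extension_collisions("X.JPG", existing)
if collisions:
    print("Warning: similarly named files exist:", ", ".join(collisions))
```

Because the comparison drops the extension on both sides, the warning no longer depends on which of the two files was uploaded first.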

bzimport added a comment. Via Conduit, Sep 18 2012, 1:16 PM

Bryan.TongMinh wrote:

I21eddc5d

Blahma added a comment. Via Conduit, Sep 18 2012, 3:51 PM

Thank you, Bryan, for your quick response and action. It would indeed be nice to see a warning when a similarly named file already exists.

However, this does not solve the problem completely, because it means that the Upload Wizard will go on stating that "file names differing only in extension case are possible", while the tools that might be used subsequently (file renaming) state "file name extensions must be normalized as if they were case-insensitive". This means that confusion will persist, unless a consensus on this is found across Commons. Perhaps we should invite more people into this discussion?

The easiest solution is, obviously, to equip the Upload Wizard with the same "sanitization" mechanism that is already used by the file mover, but I understand that this is not something that you are willing to do at the moment, am I right?

bzimport added a comment. Via Conduit, Sep 20 2012, 5:23 PM

Bryan.TongMinh wrote:

I'm not aware of any software restrictions on file moving that perform the normalizations you describe. As far as I am aware it is possible to move a file to A.JPG if the file A.jpg already exists.

Blahma added a comment. Via Conduit, Sep 20 2012, 11:10 PM

I have found the "cleanFileName" function from http://commons.wikimedia.org/wiki/MediaWiki:Gadget-AjaxQuickDelete.js being called as part of https://commons.wikimedia.org/wiki/MediaWiki:RenameRequest.js, which holds the code for the RenameLink gadget. Indeed, if you are not a file mover and you visit a file page and click the "Move" tab, the gadget's dialog appears. If you then change the value in the "Enter the new name" field to "A.JPG" and move focus to another field, cleanFileName gets called and the field's value is automatically normalized to "File:A.jpg".
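The observed behaviour can be approximated with the following sketch. It is written in Python for illustration and is not the gadget's actual JavaScript; `clean_file_name` is a hypothetical name, and the only rules it assumes are the two visible in the example above: the "File:" prefix is added and the extension is lower-cased.

```python
def clean_file_name(name, namespace="File"):
    """Approximate the normalization seen in the rename dialog.

    Assumption: only the text after the last dot is lower-cased and the
    namespace prefix is added if missing. The real cleanFileName may
    apply further rules that are not modelled here.
    """
    prefix = namespace + ":"
    if name.startswith(prefix):
        name = name[len(prefix):]
    base, dot, ext = name.rpartition(".")
    if not dot or not base:
        # No extension to normalize; only the prefix is applied.
        return prefix + name
    return f"{prefix}{base}.{ext.lower()}"


print(clean_file_name("A.JPG"))
print(clean_file_name("File:A.jpg"))
```

Under these assumptions both calls yield the same title, which would explain why a rename from "A.JPG" to "A.jpg" cannot be requested through the dialog.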

Yes, I could insert the Rename template manually, but the gadget seems to be more efficient.

Also, I am not a filemover, so I cannot check this myself, but User:Taketa has suggested at http://commons.wikimedia.org/w/index.php?title=User_talk%3ABlahma&diff=77628385&oldid=77600880 that the same normalization occurs in the actual filemoving interface (and is the reason .JPG and .jpg are considered "identical"). Could someone please recheck this?

Ciencia_Al_Poder added a comment. Via Conduit, Dec 22 2012, 3:44 PM

(In reply to comment #2)

I21eddc5d

Assigning bug to author (of gerrit change 24124), +patch-in-gerrit

Nischayn22 added a comment. Via Conduit, Mar 10 2013, 8:12 PM

Is this fixed? The patch got merged.

Bawolff added a comment. Via Conduit, May 10 2013, 1:36 AM

(In reply to comment #1)

There are some checks in place, but they only work if the existing file has the normalized extension, i.e., a warning will be given if you try to upload X.JPG and X.jpg exists, but not the other way around.

It is probably a good idea to just give a warning if the same filename exists with any extension. That would need support in FileRepo to search files without extension. I'll have a look at it.

People seem to want to be able to do that though. See bug 46741

MarkTraceur added a comment. Via Conduit, Sep 9 2014, 5:41 PM

Looks like this is just about done - I don't think it's necessary to change the filenames to have the same extensions in UW. I got the warning reliably by uploading a .jpg and .JPEG with the same name.

Gilles added a project: Multimedia. Via Web, Dec 4 2014, 9:24 AM
Gilles raised the priority of this task from "Normal" to "Unbreak Now!". Via Web, Dec 4 2014, 10:11 AM
Gilles moved this task to Closed on the Multimedia workboard.
Gilles lowered the priority of this task from "Unbreak Now!" to "Normal". Via Conduit, Dec 4 2014, 11:20 AM
