Uploading file with a slash in name removes all letters preceding it
Closed, Declined (Public)

Description

Author: M8R-udfkkf

Description:
If you use a "/" in the filename for an API upload, such as:

Content-Disposition: form-data; name="filename"
Content-Type: text/plain

Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12/4/74 - HEW and Labor(Gerald Ford Library)(1554461).pdf

The response cuts out everything before the final slash:

<api>
<upload result="Success" filename="74_-_HEW_and_Labor(Gerald_Ford_Library)(1554461).pdf">

<warnings badfilename="74_-_HEW_and_Labor(Gerald_Ford_Library)(1554461).pdf" exists="74_-_HEW_and_Labor(Gerald_Ford_Library)(1554461).pdf" />

The full filename isn't even given back in the badfilename.


Version: unspecified
Severity: normal

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz41190.
bzimport created this task. Via Legacy, Oct 18 2012, 9:57 PM
Aklapper added a comment. Via Conduit, Oct 18 2012, 11:00 PM

Does this happen on Wikipedia, or on your own MediaWiki installation (if so: which version?)?

Aklapper added a comment. Via Conduit, Oct 18 2012, 11:26 PM

Smallman: I also wonder about the exact steps to reproduce this ("api upload" is a bit vague as there are several ways to upload data), and which filesystem you use locally to upload from.

bzimport added a comment. Via Conduit, Oct 18 2012, 11:27 PM

sumanah wrote:

It's especially helpful for us to know what version of MediaWiki this problem is affecting (see the Special:Version page, such as https://www.mediawiki.org/wiki/Special:Version ).

bzimport added a comment. Via Conduit, Oct 19 2012, 12:33 AM

M8R-udfkkf wrote:

I should be more specific. It's for my Wikimedia Commons upload bot:
https://commons.wikimedia.org/wiki/Special:Contributions/Smallbot

Aklapper added a comment. Via Conduit, Oct 19 2012, 12:36 AM

Smallman: Still it's not yet clear to me how to reproduce this exactly.
Elaborating very welcome!

bzimport added a comment. Via Conduit, Oct 19 2012, 12:40 AM

M8R-udfkkf wrote:

Here's the request/response recorded by Fiddler.

The binary portion is truncated (3/4 cut out) so that the attachment fits in 2 MB, but the binary portion doesn't really matter for this bug.

attachment filename.7z ignored as obsolete

bzimport added a comment. Via Conduit, Oct 19 2012, 12:44 AM

M8R-udfkkf wrote:

To reproduce, upload a file with the filename "Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12/4/74(Gerald Ford Library)(1554461).pdf" to the Commons API using a multipart POST (not chunked) with ignorewarnings set. The file will come out as "File:74 - HEW and Labor(Gerald Ford Library)(1554461).pdf".

Let me know if you need more details.

Umherirrender added a comment. Via Conduit, Oct 19 2012, 4:56 PM

Without ignorewarnings= you will get a badfilename warning, so WORKSFORME.

The GUI will also hint at the name change when the "ignore warnings" checkbox is left unchecked.

Chad added a comment. Via Conduit, Oct 19 2012, 5:09 PM

The content of attachment 11206 has been deleted by

Chad H. <innocentkiller@gmail.com>

who provided the following reason:

Contains private info (eg: session cookies)

The token used to delete this attachment was generated at 2012-10-19 17:08:59 UTC.

Ciencia_Al_Poder added a comment. Via Conduit, Oct 19 2012, 6:06 PM

Created attachment 11208
Simple HTML form to upload a file through the api (targets Wikimedia Commons) to demonstrate the bug

I've created a simple form that targets the Wikimedia Commons API and provides all the fields required to upload a file for testing this bug.

STEPS TO REPRODUCE:

Each parameter is an input. You need to fill in the token (the HTML includes a handy link to get one) and the file you want to upload. I've prefilled the target title with the same title provided by Smallman but with a PNG extension, so choose any PNG file to upload (it's easier to create a PNG file than a PDF one).

I've included the "stash" parameter to prevent the file from actually being uploaded to Commons. It's sufficient to see the results, so please leave this parameter checked ;)

ACTUAL RESULTS

This is the response with ignorewarnings unchecked:

<upload result="Warning" filekey="10xx8haqtku8.von8i.46251.png" sessionkey="10xx8haqtku8.von8i.46251.png">
  <warnings badfilename="74(Gerald_Ford_Library)(1554461).png" />
</upload>

This is the response with ignorewarnings checked:

<upload result="Success" filekey="10xx8s87hzog.f41ttt.46251.png" sessionkey="10xx8s87hzog.f41ttt.46251.png">
  <warnings badfilename="74(Gerald_Ford_Library)(1554461).png" />
  <imageinfo timestamp="2012-10-19T18:01:14Z" user="" userid="" anon="" size="80" width="11" height="11" parsedcomment="" comment="" url="http://commons.wikimedia.org/wiki/Special:UploadStash/file/10xx8s87hzog.f41ttt.46251.png" descriptionurl="http://commons.wikimedia.org/wiki/Special:UploadStash/file/10xx8s87hzog.f41ttt.46251.png" sha1="23d8e32905b6d3f4a2b89124c60db0c4bf64ac4d" mime="image/png" mediatype="UNKNOWN" bitdepth="0">
    <metadata>
      <metadata name="frameCount" value="0" />
      <metadata name="loopCount" value="1" />
      <metadata name="duration" value="0" />
      <metadata name="bitDepth" value="8" />
      <metadata name="colorType" value="truecolour" />
      <metadata name="metadata">
        <value>
          <metadata name="_MW_PNG_VERSION" value="1" />
        </value>
      </metadata>
    </metadata>
  </imageinfo>
</upload>

As you can see, <warnings badfilename="74(Gerald_Ford_Library)(1554461).png" /> doesn't match the original filename, "Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12/4/74(Gerald Ford Library)(1554461).png". I would expect the slashes to be converted to underscores or some other character; instead, the filename gets truncated.

Attached: uploadtest.html
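For bot authors hitting this, one way to notice the silent rename is to inspect the badfilename warning in the API response. The following is only a minimal sketch using Python's standard library; the helper name find_badfilename is hypothetical, and the sample is the "Warning" response quoted in this comment.

```python
import xml.etree.ElementTree as ET

# Response copied from the "ignorewarnings unchecked" case above.
SAMPLE = (
    '<upload result="Warning" filekey="10xx8haqtku8.von8i.46251.png" '
    'sessionkey="10xx8haqtku8.von8i.46251.png">'
    '<warnings badfilename="74(Gerald_Ford_Library)(1554461).png" />'
    '</upload>'
)


def find_badfilename(response_xml):
    """Return the badfilename warning value, or None if there is none.

    Hypothetical helper: accepts either a bare <upload> element or one
    wrapped in an outer element such as <api>.
    """
    root = ET.fromstring(response_xml)
    upload = root if root.tag == "upload" else root.find(".//upload")
    if upload is None:
        return None
    warnings = upload.find("warnings")
    if warnings is None:
        return None
    return warnings.get("badfilename")


if __name__ == "__main__":
    # Shows the name the server would use instead of the requested one.
    print(find_badfilename(SAMPLE))
```

A client could compare this value with the name it requested and stop before publishing a file under the truncated title.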

bzimport added a comment. Via Conduit, Oct 24 2012, 2:18 AM

sumanah wrote:

Jesús Martínez Novo -- wow! Thank you for that very complete set of steps to reproduce, including the form! I'm asking a few more people to take a look at this.

How often does the average media file have a slash in its filename, though? Is this a tiny case? Can we check for files on Commons that already have slashes in their filenames, if we permit that?

Anomie added a comment. Via Conduit, Oct 24 2012, 4:07 AM

This isn't really an API bug, as it happens via Special:Upload as well. The root cause is that wfStripIllegalFilenameChars explicitly strips anything up to and including the last "/" character.

But I'm not entirely sure this is a bug at all, rather than a case of "slashes are not allowed in filenames". If anything I suppose it could be changed to treat slashes as it does other characters (which would result in a filename of "Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12-4-74(Gerald Ford Library)(1554461).png" for the example here), but before changing behavior that goes back in one form or another to at least *2003*[1] I'd want to get input from people more familiar with the file handling code.

[1] https://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/specials/SpecialUpload.php?revision=1284&view=markup#l52
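The two behaviors contrasted in this comment can be illustrated side by side. This is a rough Python approximation, not MediaWiki's PHP code: both function names are hypothetical, and only the "/" handling described here is modeled (not the other character replacements or the space-to-underscore conversion seen in the API output).

```python
def strip_to_last_slash(name):
    """Current behavior as described: drop everything up to the last "/".

    rpartition returns ("", "", name) when no slash is present, so a
    slash-free name passes through unchanged.
    """
    return name.rpartition("/")[2]


def replace_slashes(name):
    """Alternative floated above: treat "/" like other replaced characters."""
    return name.replace("/", "-")


requested = (
    "Gerald Ford Papers- Final Issues for Decision, "
    "Army Corps of Engineers- 12/4/74(Gerald Ford Library)(1554461).png"
)

if __name__ == "__main__":
    print(strip_to_last_slash(requested))
    print(replace_slashes(requested))
```

The first call yields "74(Gerald Ford Library)(1554461).png", matching the truncation reported in this task; the second yields the "12-4-74" form mentioned as the possible alternative.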

bzimport added a comment. Via Conduit, Oct 25 2012, 1:50 PM

Bryan.TongMinh wrote:

The idea behind the current behavior is that if people submit the full path as the file name, e.g., C:\Documents\Picture.jpg, MediaWiki correctly chooses "Picture.jpg" as the file name. As we see in this bug, there are also cases where the (back)slashes should be replaced by other characters. I'm not sure which case is more prevalent, but I would be in favor of keeping the current behavior (i.e., close as WONTFIX).

bzimport added a comment. Via Conduit, Nov 1 2012, 12:43 AM

M8R-udfkkf wrote:

@Bryan, if people submit the full path, then you could check whether a drive letter is included. If it's not, then replace the slashes with dashes. Anyhow, the file path shouldn't be submitted, only the filename.

At the very minimum, the proper filename should be returned for badfilename.

Anomie added a comment. Via Conduit, Nov 1 2012, 1:23 PM

(In reply to comment #14)

> @Bryan, if people submit the full path, then you could check if a drive letter
> is included.

That's very Windows-centric. Users of other operating systems, such as OS X, don't have drive letters but might still submit the full path.

> Anyhow, the filepath shouldn't be submitted...only the filename.

Obviously. But users do strange things sometimes.

> At the very minimum, the proper filename should be returned for
> badfilename.

Define "proper". You know the name you passed in, and it returns to you the name it would use. The former is not proper or it wouldn't be a problem, while you seem to consider the latter improper too.
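The drive-letter heuristic debated in this exchange, and the objection to it, can be made concrete with a short sketch. It is an illustration only: normalize_upload_name is a hypothetical name, and nothing here reflects code that exists in MediaWiki.

```python
import re


def normalize_upload_name(name):
    """Sketch of the proposed heuristic (hypothetical, not MediaWiki code).

    If the name starts with a Windows drive letter, treat it as a path and
    keep only the last component; otherwise replace slashes with dashes.
    """
    if re.match(r"[A-Za-z]:[\\/]", name):
        return re.split(r"[\\/]", name)[-1]
    return re.sub(r"[\\/]", "-", name)


if __name__ == "__main__":
    # Windows path: the heuristic recovers the intended file name.
    print(normalize_upload_name(r"C:\Documents\Picture.jpg"))
    # Path without a drive letter: the objection raised above. The whole
    # path is kept, with dashes in place of the separators.
    print(normalize_upload_name("/Users/me/Picture.jpg"))
```

The second call returns "-Users-me-Picture.jpg" rather than "Picture.jpg", which is the failure mode pointed out for systems without drive letters.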

bzimport added a comment. Via Conduit, Nov 1 2012, 10:58 PM

M8R-udfkkf wrote:

@Brad Jorsch
You're right that many OSs (including many Linux distros) don't use a drive letter. Also...you can't underestimate the ingenuity of the users. As such, you can close as WONTFIX.

I'll add a section for this bug at
http://www.mediawiki.org/wiki/API:Upload

Anomie added a comment. Via Conduit, Nov 2 2012, 1:03 PM

Ok, closing as WONTFIX.

Gilles added a project: Multimedia. Via Web, Dec 4 2014, 10:17 AM
Gilles raised the priority of this task from "High" to "Unbreak Now!". Via Web, Dec 4 2014, 10:21 AM
Gilles moved this task to Closed on the Multimedia workboard.
Gilles lowered the priority of this task from "Unbreak Now!" to "High". Via Conduit, Dec 4 2014, 11:21 AM