Uploading file with a slash in name removes all letters preceding it
Closed, Declined
Public

Description

Author: M8R-udfkkf

Description:
If you use a "/" in the filename for an API upload, such as:

Content-Disposition: form-data; name="filename"
Content-Type: text/plain

Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12/4/74 - HEW and Labor(Gerald Ford Library)(1554461).pdf

The response cuts out everything before the final slash:

<api>
<upload result="Success" filename="74_-_HEW_and_Labor(Gerald_Ford_Library)(1554461).pdf">

<warnings badfilename="74_-_HEW_and_Labor(Gerald_Ford_Library)(1554461).pdf" exists="74_-_HEW_and_Labor(Gerald_Ford_Library)(1554461).pdf" />

The full filename isn't even given back in the badfilename warning.


Version: unspecified
Severity: normal

Details

Reference
bz41190
bzimport set Reference to bz41190.
bzimport added a subscriber: Unknown Object (MLST).
bzimport created this task. Oct 18 2012, 9:57 PM

Does this happen on Wikipedia, or on your own MediaWiki installation (if so: which version?)?

Smallman: I also wonder about the exact steps to reproduce this ("api upload" is a bit vague as there are several ways to upload data), and which filesystem you use locally to upload from.

sumanah wrote:

It's especially helpful for us to know what version of MediaWiki this problem is affecting (see the Special:Version page, such as https://www.mediawiki.org/wiki/Special:Version ).

M8R-udfkkf wrote:

I should be more specific. It's for my Wikimedia Commons upload bot:
https://commons.wikimedia.org/wiki/Special:Contributions/Smallbot

Smallman: Still it's not yet clear to me how to reproduce this exactly.
Elaborating very welcome!

M8R-udfkkf wrote:

Here's the request/response recorded by Fiddler.

The binary portion is mangled (3/4 cut out) to make the upload fit in 2 MB, but the binary portion doesn't really matter for this bug.

attachment filename.7z ignored as obsolete

M8R-udfkkf wrote:

To reproduce, upload a file with the filename "Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12/4/74(Gerald Ford Library)(1554461).pdf" to the Commons API using a multipart POST (not chunked) with ignorewarnings set. The file will come out as "File:74 - HEW and Labor(Gerald Ford Library)(1554461).pdf".

Let me know if you need more details.

Without ignorewarnings= you will get a badtitlename error, so WORKSFORME.

The GUI gives you a hint about the name change when the "ignore warnings" checkbox is left unchecked.

demon added a comment. Oct 19 2012, 5:09 PM

The content of attachment 11206 has been deleted by

Chad H. <innocentkiller@gmail.com>

who provided the following reason:

Contains private info (eg: session cookies)

The token used to delete this attachment was generated at 2012-10-19 17:08:59 UTC.

Created attachment 11208
Simple HTML form to upload a file through the api (targets Wikimedia Commons) to demonstrate the bug

I've created a simple form that targets the Wikimedia Commons API and provides all the fields required to upload a file for testing this bug.

STEPS TO REPRODUCE:

Each parameter is an input. You need to fill in the token (the HTML page includes a handy link to get one) and choose the file you want to upload. I've prefilled the target title with the same title provided by Smallman, but with a PNG extension, so pick any PNG file for the upload (it's easier to create a PNG file than a PDF one).

I've included the "stash" parameter to prevent the file from actually being uploaded to Commons. It's sufficient to see the results, so please leave this parameter checked ;)

ACTUAL RESULTS

This is the response with ignorewarnings unchecked:

<upload result="Warning" filekey="10xx8haqtku8.von8i.46251.png" sessionkey="10xx8haqtku8.von8i.46251.png">
  <warnings badfilename="74(Gerald_Ford_Library)(1554461).png" />
</upload>

This is the response with ignorewarnings checked:

<upload result="Success" filekey="10xx8s87hzog.f41ttt.46251.png" sessionkey="10xx8s87hzog.f41ttt.46251.png">
  <warnings badfilename="74(Gerald_Ford_Library)(1554461).png" />
  <imageinfo timestamp="2012-10-19T18:01:14Z" user="" userid="" anon="" size="80" width="11" height="11" parsedcomment="" comment="" url="http://commons.wikimedia.org/wiki/Special:UploadStash/file/10xx8s87hzog.f41ttt.46251.png" descriptionurl="http://commons.wikimedia.org/wiki/Special:UploadStash/file/10xx8s87hzog.f41ttt.46251.png" sha1="23d8e32905b6d3f4a2b89124c60db0c4bf64ac4d" mime="image/png" mediatype="UNKNOWN" bitdepth="0">
    <metadata>
      <metadata name="frameCount" value="0" />
      <metadata name="loopCount" value="1" />
      <metadata name="duration" value="0" />
      <metadata name="bitDepth" value="8" />
      <metadata name="colorType" value="truecolour" />
      <metadata name="metadata">
        <value>
          <metadata name="_MW_PNG_VERSION" value="1" />
        </value>
      </metadata>
    </metadata>
  </imageinfo>
</upload>

As you can see, <warnings badfilename="74(Gerald_Ford_Library)(1554461).png" /> doesn't match the original filename, which should be "Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12/4/74(Gerald Ford Library)(1554461).png". The slashes could perhaps be converted to underscores or some other character, but instead the filename gets truncated.

Attached: uploadtest.html

sumanah wrote:

Jesús Martínez Novo -- wow! Thank you for that very complete set of steps to reproduce, including the form! I'm asking a few more people to take a look at this.

How often does the average media file have a slash in its filename, though? Is this a tiny case? Can we check for files on Commons that already have slashes in their filenames, if we permit that?

This isn't really an API bug, as it happens via Special:Upload as well. The root cause is that wfStripIllegalFilenameChars explicitly strips anything up to and including the last "/" character.
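
The stripping described above can be illustrated with a minimal Python sketch. This is only an emulation of the observed behavior, not MediaWiki's actual PHP code; the function name strip_to_basename is hypothetical, and the space-to-underscore step is an assumption added so the result matches the badfilename value in the API response quoted earlier.

```python
import re


def strip_to_basename(name: str) -> str:
    """Hypothetical emulation: keep only what follows the last slash or backslash."""
    # Treat the submitted name as a path and drop every directory-like component.
    base = re.split(r"[/\\]", name)[-1]
    # Assumption: spaces are shown as underscores, as in the API response.
    return base.replace(" ", "_")


submitted = (
    "Gerald Ford Papers- Final Issues for Decision, "
    "Army Corps of Engineers- 12/4/74(Gerald Ford Library)(1554461).png"
)
print(strip_to_basename(submitted))  # 74(Gerald_Ford_Library)(1554461).png
```

The date "12/4/74" is read as two directory components, so only "74(...)" survives.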

But I'm not entirely sure this is a bug at all, rather than a case of "slashes are not allowed in filenames". If anything I suppose it could be changed to treat slashes as it does other characters (which would result in a filename of "Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12-4-74(Gerald Ford Library)(1554461).png" for the example here), but before changing behavior that goes back in one form or another to at least *2003*[1] I'd want to get input from people more familiar with the file handling code.

[1] https://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/specials/SpecialUpload.php?revision=1284&view=markup#l52
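
The alternative floated above, treating slashes like any other illegal character, might look roughly like this in Python. It is a sketch of the idea only: replace_slashes is a hypothetical name, and the choice of "-" as the replacement simply follows the example filename given in the comment.

```python
def replace_slashes(name: str, replacement: str = "-") -> str:
    """Hypothetical alternative: substitute slashes instead of truncating at them."""
    # Both separators are replaced, so nothing before them is discarded.
    return name.replace("/", replacement).replace("\\", replacement)


submitted = (
    "Gerald Ford Papers- Final Issues for Decision, "
    "Army Corps of Engineers- 12/4/74(Gerald Ford Library)(1554461).png"
)
print(replace_slashes(submitted))
# Gerald Ford Papers- Final Issues for Decision, Army Corps of Engineers- 12-4-74(Gerald Ford Library)(1554461).png
```

The trade-off is that a submitted full path would no longer be reduced to its last component; its separators would be folded into the name instead.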

Bryan.TongMinh wrote:

The idea behind the current behavior is that if people submit the full path name as the file name, e.g., C:\Documents\Picture.jpg, MediaWiki correctly chooses "Picture.jpg" as the file name. As we see in this bug, there are also cases where the (back)slashes should be replaced by other characters. I'm not sure which case is most prevalent, but I would be in favor of keeping the current behavior (i.e., close as WONTFIX).

M8R-udfkkf wrote:

@Bryan, if people submit the full path, then you could check if a drive letter is included. If it's not, then replace with dashes. Anyhow, the filepath shouldn't be submitted...only the filename.

At the very minimum, the proper filename should be returned for badfilename.
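
The drive-letter check suggested in this comment could be sketched as follows. All names here (DRIVE_PREFIX, normalize_upload_name) are hypothetical, and the sketch only covers Windows-style prefixes such as "C:\"; it is an illustration of the proposal, not anything MediaWiki implements.

```python
import re

# Assumption: a full Windows path starts with a letter, a colon, and a separator.
DRIVE_PREFIX = re.compile(r"^[A-Za-z]:[\\/]")


def normalize_upload_name(name: str) -> str:
    """Hypothetical heuristic: strip the path only when a drive letter is present."""
    if DRIVE_PREFIX.match(name):
        # Looks like a full Windows path: keep the last component only.
        return re.split(r"[/\\]", name)[-1]
    # Otherwise assume the slashes belong to the name and replace them with dashes.
    return name.replace("/", "-").replace("\\", "-")


print(normalize_upload_name(r"C:\Documents\Picture.jpg"))
# Picture.jpg
print(normalize_upload_name("Army Corps of Engineers- 12/4/74.pdf"))
# Army Corps of Engineers- 12-4-74.pdf
print(normalize_upload_name("/Users/me/Pictures/Picture.jpg"))
# -Users-me-Pictures-Picture.jpg
```

The last call shows the heuristic's limit: an absolute path without a drive letter is not recognized as a path, so its directories end up in the file name.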

Anomie added a comment. Nov 1 2012, 1:23 PM

(In reply to comment #14)

@Bryan, if people submit the full path, then you could check if a drive letter
is included.

That's very Windows-centric. Users of other operating systems, such as OS X, don't have drive letters but might still submit the full path.

Anyhow, the filepath shouldn't be submitted...only the filename.

Obviously. But users do strange things sometimes.

At the very minimum, the proper filename should be returned for
badfilename.

Define "proper". You know the name you passed in, and it returns to you the name it would use. The former is not proper or it wouldn't be a problem, while you seem to consider the latter improper too.

M8R-udfkkf wrote:

@Brad Jorsch
You're right that many OSs (including many Linux distros) don't use a drive letter. Also...you can't underestimate the ingenuity of the users. As such, you can close as WONTFIX.

I'll add a section for this bug at
http://www.mediawiki.org/wiki/API:Upload

Anomie added a comment. Nov 2 2012, 1:03 PM

Ok, closing as WONTFIX.

Gilles moved this task from Untriaged to Done on the Multimedia board. Dec 4 2014, 10:21 AM
Gilles raised the priority of this task from "High" to "Unbreak Now!".
Gilles lowered the priority of this task from "Unbreak Now!" to "High". Dec 4 2014, 11:21 AM
