Page MenuHomePhabricator

Add Content Disposition to XML export (proposed patch included)
Closed, ResolvedPublic

Assigned To
None
Authored By
bzimport
Aug 17 2005, 8:47 AM
Referenced Files
F2219: LanguageFr.diff
Nov 21 2014, 8:45 PM
F2218: DefaultSettings.diff
Nov 21 2014, 8:45 PM
F2217: SpecialExport.diff
Nov 21 2014, 8:45 PM
F2216: Language.diff
Nov 21 2014, 8:45 PM
F2212: SpecialExport.php
Nov 21 2014, 8:45 PM
F2214: Defines.diff
Nov 21 2014, 8:45 PM
F2211: special-export-fr.png
Nov 21 2014, 8:45 PM
F2210: special-export-en.png
Nov 21 2014, 8:45 PM

Description

Author: bugzilla.wikimedia

Description:
This is a follow up of bug report 3159 (I erroneously commited as a bug report
to the SpecialImport functionnality).

Indeed the solution is in 'SpecialExport' and the problem is not in 'SpecialImport'.

In order to facilitate the saving ox XML files, exported from mw, in an
appropriate way
(which is not an immediate operation even on good browsers like firefox,
and even for not-too-stupid users...), we should rely on the Content-Disposition
HTTP
header (at least, optionaly).

Everything, including the comments, are provided in my new version of file
SpecialExport.php.
(Just watch all lines around those including the pattern "// !" in this file)
Or, of course, in the SpecialExport.diff file.

I'll immediately upload these 2 files.


Version: 1.11.x
Severity: enhancement

Details

Reference
bz3173

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 8:45 PM
bzimport set Reference to bz3173.

bugzilla.wikimedia wrote:

the full SpecialExport.php file proposal (a diff file will also be uploaded)

attachment SpecialExport.php ignored as obsolete

bugzilla.wikimedia wrote:

the diff file (-Naurb) for the proposed SpecialExport.php patch

attachment SpecialExport.diff ignored as obsolete

This sounds like it would break the ability to view the XML within Mozilla/Firefox
(which is very, very helpful for debugging, experimenting, and exploring the system).

I'd prefer a fix in for the corruption bug in the offending browser(s). Can you narrow
it down to a particular application, version, or setting?

bugzilla.wikimedia wrote:

partial re-design of Special Export

my re-design of SpecialExport.php
+ to take care of the content-disposition problem
+ Brion's remarks to allow viewing in a browser
+ very small bug fixed (marked !FIXED)
+ restructuring the file
(modifs are easely found by a grep on "
!" comments)

attachment SpecialExport.php ignored as obsolete

bugzilla.wikimedia wrote:

screenshot (english)

screenshot of the new design output
(in english,
however the browser is in french :
firefox let you choice between viewing or downloading)

(GNU/Linux Ubuntu 5.04 , mediawiki 1.5.104.4 (patch of 1.5beta4) , firefox
1.0.6)

Attached:

special-export-en.png (453×647 px, 49 KB)

bugzilla.wikimedia wrote:

screenshot in french

(GNU/Linux Ubuntu 5.04 , mediawiki 1.5.104.4 (patch of 1.5beta4) , firefox
1.0.6)

Attached:

special-export-fr.png (410×743 px, 50 KB)

bugzilla.wikimedia wrote:

diff -Naurb

this diff file is rather large,
(thus I also uploaded the final .php file)

attachment SpecialExport.diff ignored as obsolete

bugzilla.wikimedia wrote:

partial re-design of SpecialExport.php

(removing a very old (?) phpDoc comment)

attachment SpecialExport.php ignored as obsolete

bugzilla.wikimedia wrote:

diif -Naurb

quite large (see php file)

attachment SpecialExport.diff ignored as obsolete

bugzilla.wikimedia wrote:

haDoc comments

haDoc comments help the reader to see where a file was hacked
and/or how the file is structured.

this is NOT a substitute for phpDoc, nor for diff !

I join my haDoc file so that you may easely retrieve what
has happened with the file.

See http://meta.wikimedia.org/wiki/HaDoc to eventually read more about haDoc
comments.

attachment SpecialExport.php.hadoc ignored as obsolete

bugzilla.wikimedia wrote:

diff -Naurb

attachment SpecialExport.diff ignored as obsolete

bugzilla.wikimedia wrote:

hacker documentation

attachment SpecialExport.php.hadoc ignored as obsolete

bugzilla.wikimedia wrote:

the SpecialExport.php including eveything (but using internally my conventions)

attachment SpecialExport.php ignored as obsolete

bugzilla.wikimedia wrote:

just FYI : the script producing the haDoc (hacker documentation) file

attachment haDoc.sh ignored as obsolete

This patch contains a lot of extraneous comments, config variables don't match
convention, code style doesn't match convention, comment doesn't match phpdoc
convention. Use of a float to control behavior is very unusual and not recommended.

Usually when one submits a patch to an existing project, one attempts to match the
code style of that code. One in particular doesn't rewrite the entire file in some
totally different style, renaming all the variables and functions and changing all
the indentation and whitespace -- it makes it harder to review, and would make
maintenance of the code much much harder as well.

It will absolutely not be accepted in this state; please rewrite following the
existing coding style if you'd like others to actually be able to review the patch
behind all the noise.

Note that the xmldispositionfilename parameter appears to be placed into an HTTP
header and HTML output without escaping; this may be a security risk.

zigger wrote:

(Please add wikibugs-l@wikipedia.org to the CC list when assigning a bug.)

bugzilla.wikimedia wrote:

the patched SpecialExport.php file

this is the full file.
standard conventions are now strictly respected.

However, the file also includes a few comments to the attention
of developpers (they are clearly indicated as such) : these comments
are intended to explain the choices and should be removed before distribution.

I have removed my haDoc comments (which were not 'noise' ;-( ...)
and my own naming conventions (underscore for private, lowercase for instance
methods, uppercase for static) as requested.

Nevertheless, WITHIN the newly created functions and methods, local variables
are named according to the hungarian prefix convention (yes, I am a very very
old
computer engineer from the old C time and the first UNIX systems - and it's
probably much too late to change me...). Because, this is only used WITHIN
functions, nothing forbid to write a boolean $zCurOnly instead of $curonly.

Attached:

bugzilla.wikimedia wrote:

diff -Naurb for Defines.php

a few constants must be made available for DefaultSettings, LocalSettings,
and SpecialExport.

Mind that I replaced my fuzzy parameters (real between 0 and 1)
by a few unsigned short values.

Attached:

bugzilla.wikimedia wrote:

diff -Naurb for Language.php

2 prompt messages ha

Attached:

bugzilla.wikimedia wrote:

diff -Naurb for LanguageFr.php

2 prompt messages have been added

+ the submit button needs a specific message
(the reason is explained as a comment in the SpecialExport.php file)

Mind the diff files may not be very accurate regarding line numbers
because I just extracted the lines related to SpecialExport

(my Language.php and LanguageFr.php already include many other changes
which are not related to SpecialExport)

attachment LanguageFr.diff ignored as obsolete

bugzilla.wikimedia wrote:

diff -Naurb for SpecialExport.php

The essential is the content disposition management.
Mind that the patch also include a few corrections
for very minor bugs.
These fixes are documented in the file.

Attached:

bugzilla.wikimedia wrote:

diff -Naurb for DefaultSettings.php

again, line numbering may have been distrurbed by other patches...

Attached:

bugzilla.wikimedia wrote:

Brion wrote :

''Note that the xmldispositionfilename parameter appears to be placed into an HTTP
header and HTML output without escaping; this may be a security risk.''

The filename is not escaped because it is NOT the filename provided by the user.
Instead, the filename *suggested* by the user is passed thru a function to generate
a valid Unix file name.
Unless I made an error, this function should return a value without risk.
If I made an error, the function the solution will be to correct this function
(but not to escape).

Mind that this function (and her sister wfRemoveAccents) may eventually be moved
to GlobalFunctions.php if one needs them to generate filenames elsewhere.

bugzilla.wikimedia wrote:

diff -Naurb for LanguageFr.php (+ correction of an existing typo)

Attached:

New patch still does not follow coding conventions.

wfRemoveAccents() looks unhelpful and far from complete if it were
desireable. wfUnixFileName() looks extremely unhelpful, and will make
completely illegible useless names for many languages. Note that failing to
escape output because at the moment it comes from a function which will
produce 'safe' output is an unsafe practice. Many valid Unix filenames would
produce JavaScript injections if output directly into HTML.

None of this addresses the original question, either: what browsers are
affected by this problem to begin with and can they be fixed?

  • Bug 10603 has been marked as a duplicate of this bug. ***

dev wrote:

Here's my patch:

  • SpecialExport-0.php 2007-07-16 18:11:59.000000000 +0200

+++ SpecialExport.php 2007-07-16 18:10:22.000000000 +0200
@@ -151,6 +151,7 @@ function wfSpecialExport( $page = '' ) {

// This should provide safer streaming for pages with history
wfResetOutputBuffers();
header( "Content-type: application/xml; charset=utf-8");

+ header("Content-Disposition: attachment" );

$pages = explode( "\n", $page );

$db = wfGetDB( DB_SLAVE );

robchur wrote:

Fixed in r24172. I didn't use the patches above.

dev wrote:

verified. Thanks a lot !