Can't upload PDF / ODF Hybrid
Open, Public

Description

Author: fun-stuff

The new LibreOffice supports exporting PDFs in a hybrid ODF / PDF file format. When trying to upload such a file MediaWiki reports that the ZIP file is ambiguous or has been damaged. (I have a German installation so I can't tell you the exact error message.)

I already removed ZIP files from the MediaWiki file blacklist and added PDF, ODT and ZIP to the allowed file extensions.

I think includes/ZIPDirectoryReader.php checks the file and rejects it because it doesn't know the new file format yet.

The new PDF / ODF hybrid format makes documents easy for everyone to open while keeping them editable, which might also be a great thing for Wikipedia. Therefore, this is a major bug for me.
Please fix this and thanks for the great software.

Tobias


Version: 1.18.x
Severity: normal

bzimport added a project: MediaWiki-Uploading. Via Conduit, Nov 21 2014, 11:29 PM
bzimport set Reference to bz28188.
bzimport created this task. Via Legacy, Mar 22 2011, 3:55 PM
MaxSem added a comment. Via Conduit, Mar 22 2011, 3:57 PM

Can you attach a sample file or provide a link to it?

bzimport added a comment. Via Conduit, Mar 22 2011, 5:55 PM

fun-stuff wrote:

Test PDF/ODF (ODT) Hybrid Document

Can be edited with LibreOffice 3.3 Writer and viewed with any PDF viewer. But it cannot be uploaded in MediaWiki 1.18alpha.

Attached: test_pdf_odf_hybrid.pdf

MaxSem added a comment. Via Conduit, Mar 22 2011, 6:32 PM

The cause is "ZipDirectoryReader: Fatal error: trailing bytes after the end of the file comment".

In simple words, we expect zip files to be... zip files and not contain something scary. We need to hack our detector to handle zips embedded in something known.
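To illustrate what "trailing bytes after the end of the file comment" means, here is a minimal Python sketch. It is not MediaWiki's ZipDirectoryReader code; the function names are hypothetical and it simplifies heavily (no ZIP64, no central directory validation). It looks for the last ZIP end-of-central-directory record, reads the comment length stored in it, and reports how many bytes follow the point where the ZIP should end.

```python
import io
import struct
import zipfile
from typing import Optional

EOCD_SIGNATURE = b"PK\x05\x06"  # marker of the end-of-central-directory record
EOCD_FIXED_SIZE = 22            # size of that record without its comment


def trailing_bytes_after_zip(data: bytes) -> Optional[int]:
    """Count the bytes that follow the last end-of-central-directory record.

    Returns None if no complete record is found and 0 for a clean ZIP.
    A negative value means the comment is truncated.
    """
    pos = data.rfind(EOCD_SIGNATURE)
    if pos == -1 or pos + EOCD_FIXED_SIZE > len(data):
        return None
    # The last two bytes of the fixed part hold the comment length.
    (comment_len,) = struct.unpack_from("<H", data, pos + 20)
    end_of_zip = pos + EOCD_FIXED_SIZE + comment_len
    return len(data) - end_of_zip


def make_zip() -> bytes:
    """Build a tiny in-memory ZIP to experiment with (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", "<doc/>")
    return buf.getvalue()
```

A plain ZIP yields 0. The same ZIP followed by other data, as happens when it sits inside a larger container such as a PDF, yields a positive count, which is the situation a strict reader reports as an error.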

bzimport added a comment. Via Conduit, Mar 23 2011, 11:46 AM

fun-stuff wrote:

Thanks for the clarification and for taking care of the problem so quickly. I hope you can fix this bug in the near future.

MarkAHershberger added a comment. Via Conduit, Apr 26 2011, 3:34 AM

From comments in triage:

"Workaround: 'don't save your PDF that way'. (Problem with workaround: if someone else made the file, you might not know how to re-save it.)"

So, we thought about dealing with it: "This presents same security threats as a PDF file.... need to check security model, probable threats."

"Our security checks are working as intended by detecting that the files have been smashed together unexpectedly. Might be possible to tweak it to consider 'oh that's ok' but not sure how much we want to. If not careful might accidentally allow all sorts of evil appended to a PDF file."

bzimport added a comment. Via Conduit, Apr 26 2011, 9:14 AM

fun-stuff wrote:

Thanks for the comments.

I can imagine that deciding whether this is an 'OK' PDF file saved as hybrid ODF or not is difficult to code. However, I think it would be a great loss if this wasn't implemented as this format is so versatile.

bzimport added a comment. Via Conduit, Feb 21 2013, 9:17 AM

dovijacobs wrote:

Hi, I asked about this problem here (and was referred to this bug):
http://commons.wikimedia.org/wiki/Commons:Village_pump#Uploading_embedded_PDFs_created_through_LibreOffice

The embedded PDF is an extremely useful file format, and one of the best features in the open source LibreOffice project. It is becoming extremely popular and is already being used in hundreds of millions of files around the world.

Therefore, I'd like to reiterate the comment before mine, which was made nearly two years ago: "I think it would be a great loss if this wasn't implemented as this format is so versatile."

If that was true two years ago, it is far more true today. I hope it can be made a basic part of PDF support in Wikimedia projects.

bzimport added a comment. Via Conduit, Mar 29 2013, 1:42 PM

dovijacobs wrote:

In the meantime I've been uploading classic texts and educational materials at Internet Archive instead of at the Commons:
http://commons.wikimedia.org/wiki/Category:Talmud_(digital_text_vowelized_and_formatted)

This is extremely inconvenient for proper use at Wikimedia projects. I hope this will be taken care of eventually.

Gilles added a project: Multimedia. Via Web, Nov 24 2014, 3:42 PM
