Integrate book reader into MultimediaViewer
Open, Low, Public

Description

Books and other paged media can be uploaded as DjVu files [1] (and maybe in the future as ePUB, although that is currently not supported for security reasons - bug 17858); it would be nice if MultimediaViewer could display such files.

[1] https://commons.wikimedia.org/wiki/Help:DjVu


Version: unspecified
Severity: enhancement

Details

Reference
bz58033

Event Timeline

bzimport raised the priority of this task from to Low. Nov 22 2014, 2:20 AM
bzimport added a project: MediaViewer.
bzimport set Reference to bz58033.
bzimport added a subscriber: Unknown Object (MLST).
Tgr created this task. Dec 5 2013, 12:15 PM

According to Fabrice Florin, it is possible that community developers can implement special handlers for showing complex formats like djvu or epub in Media Viewer, using hooks or plugins to be provided next year.

This would open the possibility to have an integrated djvu or epub reader directly into the Media Viewer. See recent epub.js
http://fchasen.github.io/epub.js/

And an example: http://book.23rdcenturyromance.com/

ePub files are currently not stored in Commons, but they are generated in Wikisource. A new epub export extension could have in cache the generated ePub files, and this could be served to the Media Viewer in Wikipedia or Wikisource.

An author page in Wikipedia/Wikisource could have a media gallery of books pulled from Wikidata (edition items by author in current language). On clicking on the cover it would display the epub or djvu version, whatever is available. From this point it should be possible to invite readers to transcribe the book in Wikisource if the book is only available as djvu.

(In reply to comment #1)

According to Fabrice Florin, it is possible that community developers can
implement special handlers for showing complex formats like djvu or epub in
Media Viewer, using hooks or plugins to be provided next year.

That seems like an odd approach to take. Code to view special media formats should live in those formats' respective MediaHandler extensions.

(In reply to comment #0)

Books and other paged media can be uploaded as DjVu files [1] (and maybe in
the future as ePUB, although that is currently not supported for security
reasons - bug 17858); it would be nice if MultimediaViewer could display such
files.
[1] https://commons.wikimedia.org/wiki/Help:DjVu

Don't forget PDF :)

Tgr added a comment. Dec 5 2013, 6:18 PM

(In reply to comment #2)

That seems like an odd approach to take. Code to view special media formats
should live in those formats' respective MediaHandler extensions.

David probably meant the javascript code to customize MediaViewer for certain formats; paged media requires a slightly different UI from images, and other formats might need even more customization (3D stuff, for example).

Tpt added a comment. Dec 5 2013, 8:34 PM

I believe that it's MultimediaViewer that should contain a generic viewer for paged media (like djvu, pdf and paged tiff), because it's a generic kind of media that has an implementation in MediaWiki core. Specific tweaks to support given media types should then be done in the MediaHandler extensions (except for DjVu, which is a format supported by MediaWiki core).
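
As a rough illustration of that split (generic paged-media behaviour in one place, format-specific tweaks registered separately), here is a minimal TypeScript sketch. Every name in it (`PagedViewerOptions`, `registerFormatTweaks`, `optionsFor`) is hypothetical and not part of MultimediaViewer or any MediaHandler extension, and the option values are only examples.

```typescript
// Hypothetical sketch: none of these names exist in MultimediaViewer.
interface PagedViewerOptions {
  allowSpread: boolean; // two-page view
  allowZoom: boolean;
  textSearch: boolean; // search within an embedded text layer
}

// Defaults that a generic paged-media viewer could apply to every format.
const genericPagedDefaults: PagedViewerOptions = {
  allowSpread: true,
  allowZoom: true,
  textSearch: false,
};

// Per-format overrides, keyed by MIME type, supplied by each format's handler.
const formatTweaks = new Map<string, Partial<PagedViewerOptions>>();

function registerFormatTweaks(
  mimeType: string,
  tweaks: Partial<PagedViewerOptions>
): void {
  formatTweaks.set(mimeType, tweaks);
}

// Merge the generic defaults with whatever the format registered, if anything.
function optionsFor(mimeType: string): PagedViewerOptions {
  return { ...genericPagedDefaults, ...(formatTweaks.get(mimeType) ?? {}) };
}

// Example registrations (the values are assumptions, for illustration only).
registerFormatTweaks("image/vnd.djvu", { textSearch: true });
registerFormatTweaks("image/tiff", { allowSpread: false });
```

A format with no registration, such as `application/pdf` here, simply gets the generic defaults, so the generic viewer never needs to know about individual formats.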

Why would we want to upload epub or djvu text files (independent of their scans) back to Commons? I don't see the point.

Half of the benefit of dynamic production of epub files is that they will be quick and easy to do, and will always reflect the most recent state of the proofread process. If we have epubs at Commons, they will then have to be version controlled to a proofread status. If we need access to epub at Commons, it would seem wiser to have a dynamic process to generate the respective WS work.

If we are talking djvu, then we are talking image and text, as there is no point without find 'in situ'.

(In reply to comment #4)

I believe that it's MultimediaViewer that should contain a generic viewer for
paged media (like djvu, pdf and paged tiff), because it's a generic kind of
media that has an implementation in MediaWiki core. Specific tweaks to support
given media types should then be done in the MediaHandler extensions (except
for DjVu, which is a format supported by MediaWiki core).

Yes I agree for paged media - other things like video might more appropriately live in TMH possibly, etc. I was more trying to emphasize that it should be part of either core or an extension. It should *not* be some hack in a local MediaWiki:Common.js

(In reply to comment #5)

If we are talking djvu, then we are talking image and text, as there is no
point without find 'in situ'.

To clarify, I am talking about finding text in relation to its position in the work: getting to the right page and then locating the text on it, i.e. 2-D position awareness.

(In reply to comment #5)

Why would we want to upload epub or djvu text files (independent of their
scans) back to Commons? I don't see the point.

This discussion is not about uploading epub files to Commons, but about adding epub/djvu viewing capabilities to the Media Viewer.

The epub files would be generated dynamically as they are now (or cached if there are no changes), but the user would be able to read them online (instead of downloading them and using an external viewer).

(In reply to comment #8)

(In reply to comment #5)

Why would we want to upload epub or djvu text files (independent of their
scans) back to Commons? I don't see the point.

This discussion is not about uploading epub files to Commons, but about
adding epub/djvu viewing capabilities to the Media Viewer.
The epub files would be generated dynamically as they are now (or cached if
there are no changes), but the user would be able to read them online
(instead of downloading them and using an external viewer).

Comment 0 is just about uploaded paged formats.
Triggering MediaViewer on dynamically generated files for download seems like it would be an entirely separate can of worms.

It would be great if paged objects (e.g. books) in various carriers (PDF, DjVu, or pages in a category) could be viewed in a book reader similar to these:

  1. World Digital Library (click the image to open reader): http://www.wdl.org/en/item/3039/#item_type=book&institution=national-library-of-sweden
  2. Internet archive book reader: https://archive.org/stream/fiveweeksinballo00vernuoft#page/n1/mode/2up

To be useful these features are likely to be required:

  1. Zoom in page to see details or images.
  2. View page spread to see illustrations and images that cover two pages
  3. Turn page (obviously)
  4. Make link to a specific page (and page region) to be able to make precise references
  5. Customize viewer for specific types of paged media (e.g. a three page map where zooming is necessary but where page spreads may be irrelevant).
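
To make features 2 to 4 a little more concrete, here is a minimal TypeScript sketch of two-page spreads, page turning, and a link to a specific page and region. It is only one possible shape: the `#page=...&region=x,y,width,height` fragment format, the convention that page 1 is shown alone as a cover, and all function names are assumptions made for illustration, not an existing MediaViewer or book reader API.

```typescript
// Hypothetical sketch: names and the URL fragment format are assumptions.
interface PageRegion {
  // Fractions of the page size in the range 0..1, so links survive zooming.
  x: number;
  y: number;
  width: number;
  height: number;
}

interface PageTarget {
  page: number; // 1-based page number
  region?: PageRegion;
}

// Pages shown together in two-page mode: page 1 alone, then (2,3), (4,5), ...
function spreadFor(page: number, pageCount: number): number[] {
  if (!Number.isInteger(page) || page < 1 || page > pageCount) {
    throw new RangeError(`page ${page} is outside 1..${pageCount}`);
  }
  if (page === 1) return [1];
  const left = page % 2 === 0 ? page : page - 1;
  return left + 1 <= pageCount ? [left, left + 1] : [left];
}

// Turn the page forward; stays on the last spread at the end of the book.
function nextSpread(current: number[], pageCount: number): number[] {
  const last = current[current.length - 1];
  return last < pageCount ? spreadFor(last + 1, pageCount) : current;
}

// Build a fragment such as "#page=12&region=0.1,0.2,0.5,0.25".
function encodePageTarget(target: PageTarget): string {
  let fragment = `#page=${target.page}`;
  if (target.region) {
    const { x, y, width, height } = target.region;
    fragment += `&region=${x},${y},${width},${height}`;
  }
  return fragment;
}

// Parse a fragment back; returns null when no valid page number is present.
function decodePageTarget(fragment: string): PageTarget | null {
  const fields = new Map<string, string>();
  for (const pair of fragment.replace(/^#/, "").split("&")) {
    const [key, value] = pair.split("=");
    if (key && value !== undefined) fields.set(key, value);
  }
  const page = Number(fields.get("page"));
  if (!Number.isInteger(page) || page < 1) return null;
  const target: PageTarget = { page };
  const regionText = fields.get("region");
  if (regionText !== undefined) {
    const parts = regionText.split(",").map(Number);
    const valid =
      parts.length === 4 &&
      parts.every((n) => Number.isFinite(n) && n >= 0 && n <= 1);
    if (valid) {
      const [x, y, width, height] = parts;
      target.region = { x, y, width, height };
    }
  }
  return target;
}
```

Storing the region as fractions of the page rather than pixels is a design choice in this sketch: it keeps a reference valid regardless of the zoom level or the resolution at which the page is rendered.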
Restricted Application added subscribers: Matanya, Aklapper. Aug 29 2015, 8:14 AM

@Peterkz these are works from archive.org, where they have an inbuilt reader, so not sure duplicating that is really a priority.

The Wikisources are focusing on reproducing the texts with corrected text (from the scans) and cleaned-up images. For that reason the WSes rate this as a low priority, especially seeing that djvu readers with search capability are available; without text search the proposed viewer is quite limited in capability.

Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 4 2015, 6:11 PM

Mass-removing the Multimedia tag from MediaViewer tasks, as this is now being worked on by the Reading department, not Editing's Multimedia team.

Yann added a subscriber: Yann. Mar 3 2016, 9:38 AM
Tgr removed a subscriber: Tgr. Tue, Jul 9, 6:05 PM