**Author:** `alex.brollo`
**Description:**
DjVu files, the main image+text multipage file format used by Wikisource projects, have a fairly robust text layer, but MediaWiki can't fully access it (except for rough extraction of plain text). Some new API actions, added both to Commons and to Wikisource projects' API, would allow retrieval of more complex text layers and/or interesting [meta]data from the whole file or for selected pages.
While read functions are safe, write functions can be destructive, even though they could be very useful to advanced users, so I think the first step should be to implement only read-only functions.
* djvutext (to read structured text in Lisp-like syntax) and
* djvutoxml (to extract structured text in XML)
would, IMHO, be the first two routines to implement.
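
As a rough illustration of what a client could do with the Lisp-like output, here is a minimal Python sketch that parses one page's text layer into nested dictionaries. It assumes the zone layout printed by DjVuLibre's `djvused` (`print-txt`), i.e. `(type xmin ymin xmax ymax ...)` with either child zones or a quoted string inside. The sample data and the names `parse_text_layer` and `words` are made up for the example; nothing here is an existing MediaWiki API.

```python
import re

# Tokens: parentheses, double-quoted strings (with backslash escapes), bare atoms.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def parse_zone(tokens, pos=0):
    """Parse one zone starting at tokens[pos]; return (zone, next_pos)."""
    if tokens[pos] != "(":
        raise ValueError("expected '(' at token %d" % pos)
    kind = tokens[pos + 1]
    bbox = tuple(int(t) for t in tokens[pos + 2:pos + 6])
    pos += 6
    zone = {"type": kind, "bbox": bbox, "text": None, "children": []}
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse_zone(tokens, pos)
            zone["children"].append(child)
        elif tokens[pos].startswith('"'):
            # Only simple \" and \\ escapes are undone here;
            # octal escapes (assumed possible in real output) are not handled.
            zone["text"] = re.sub(r"\\(.)", r"\1", tokens[pos][1:-1])
            pos += 1
        else:
            raise ValueError("unexpected token %r" % tokens[pos])
    return zone, pos + 1


def parse_text_layer(sexpr):
    """Parse the text layer of a single page given as an s-expression string."""
    zone, _ = parse_zone(TOKEN.findall(sexpr))
    return zone


def words(zone):
    """Yield (text, bbox) for every word zone, in document order."""
    if zone["type"] == "word":
        yield zone["text"], zone["bbox"]
    for child in zone["children"]:
        yield from words(child)


# Hypothetical sample, shaped like the assumed djvused output.
SAMPLE = """(page 0 0 2550 3300
 (line 300 3000 900 3060
  (word 300 3000 520 3060 "Hello")
  (word 560 3000 900 3060 "world")))"""

if __name__ == "__main__":
    for text, bbox in words(parse_text_layer(SAMPLE)):
        print(text, bbox)
```

Keeping the bounding boxes next to each word is the part that plain-text extraction loses, which is why a structured read action would be useful.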
--------------------------
**Version**: unspecified
**Severity**: enhancement