
Integrate Gallica (Bibliothèque Nationale de France)
Open, Needs Triage, Public

Description

Gallica (https://gallica.bnf.fr) is the French National Library's website for facsimiles of books, newspapers, and other documents.

Most of the documents there are in the public domain, but not all of them.

Gallica also implements IIIF, which could be used to download high-quality images of individual pages (maybe for DjVu?). However, it enforces quite strict rate limits on that endpoint.
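
To make the rate-limit concern concrete, here is a minimal sketch of how a client might space out IIIF page requests. The path layout follows the general IIIF Image API pattern (identifier/region/size/rotation/quality.format). The base URL, the identifier form, the quality value and the delay are all placeholders chosen for illustration; none of them are confirmed Gallica values, and Gallica's actual limits are not stated in this task.

```python
import time


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation=0, quality="default", fmt="jpg"):
    """Build an IIIF Image API style URL.

    Layout: base/identifier/region/size/rotation/quality.format
    `base` and `identifier` are supplied by the caller; the defaults for the
    other parts are assumptions and may need adjusting for Gallica.
    """
    return f"{base.rstrip('/')}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds; the right value is unknown here
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last = None                 # time of the previous call, if any

    def wait(self):
        # Sleep only for the part of the interval that has not yet elapsed.
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

A downloader would call `throttle.wait()` before each page request and then fetch the URL returned by `iiif_image_url(...)`. Injecting `clock` and `sleep` keeps the throttle testable without real delays.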

Event Timeline

Adding more context for this task

Gallica is the French National Library's website for copies of books, newspapers, and digital records.

  • Gallica Book Catalog OPDS API
    • Exposes books in EPUB format
  • Gallica Document API
    • Fetches metadata about any item on the Gallica website
  • Gallica Search API (use if required)
  • Book page URL (Example book) is of the form https://gallica.bnf.fr/ark:/:param1/:arkNumberOfTheBook?rk=<queryParam>
  • Requires a check to verify that the newspaper/manuscript is in the public domain before initiating the download
    • Hit the /services/OAIRecord?ark=<arkNumberOfTheBook> endpoint to retrieve technical metadata
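
The URL handling and the public-domain check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the host is taken from the task description, and the rights check assumes the OAIRecord response contains Dublin Core `dc:rights` elements whose text mentions the public domain. That last assumption should be verified against a real response before relying on it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urlsplit

# Standard Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


def ark_number_from_page_url(page_url):
    """Extract :arkNumberOfTheBook from https://host/ark:/:param1/:arkNumberOfTheBook?rk=..."""
    parts = [p for p in urlsplit(page_url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "ark:":
        raise ValueError(f"not a Gallica-style ARK URL: {page_url!r}")
    return parts[2]


def oai_record_url(ark_number, host="https://gallica.bnf.fr"):
    """Build the /services/OAIRecord?ark=<arkNumberOfTheBook> URL."""
    return f"{host}/services/OAIRecord?{urlencode({'ark': ark_number})}"


def looks_public_domain(oai_xml):
    """Return True if any dc:rights element mentions the public domain.

    ASSUMPTION: the metadata exposes rights as dc:rights text such as
    "public domain" or "domaine public"; not confirmed in this task.
    """
    root = ET.fromstring(oai_xml)
    for element in root.iter(f"{{{DC_NS}}}rights"):
        text = (element.text or "").strip().lower()
        if "public domain" in text or "domaine public" in text:
            return True
    return False
```

The download step would then run only when `looks_public_domain(...)` returns True for the fetched record. Fetching is left out on purpose so the sketch makes no network calls.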