API: image thumb-url for ProofreadPages
Open, Low, Public, Feature

Description

Would it be possible to expose the thumb-url below in some API (height/width info not necessary)?
This is the image that appears in https://en.wikisource.org/wiki/Page:The_First_Church_of_Christ,_Scientist,_and_Miscellany.djvu/43
It would be nice to get its link via an API instead of, e.g., scraping the page or trying to reconstruct it from the title.

Thanks.

<div class="prp-page-image">
    <img alt="" src="//upload.wikimedia.org/wikipedia/commons/thumb/d/dd/The_First_Church_of_Christ%2C_Scientist%2C_and_Miscellany.djvu/page43-1024px-The_First_Church_of_Christ%2C_Scientist%2C_and_Miscellany.djvu.jpg" data-file-width="2693" data-file-height="3985" height="1655" width="1024">
</div>

Event Timeline

Restricted Application added a subscriber: Aklapper.

@Mpaa: not sure who the recipient of this task is, but we do not think it is Analytics; removing Pageview API, which deals with pageviews (not files).

This is already possible using existing MediaWiki APIs, but it requires some extra processing. Most of the information in a thumbnail URL is also provided in the page name; the exceptions are whether the file is local or on Commons, and the exact hash-based path to the file. Thumbnails for specific pages of a file must also always specify the width. Unfortunately, the Images generator combined with the Imageinfo API doesn't understand multi-page files, so you'll have to extract the page number from the title.
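For the "extract it from the title" part, here is a minimal Python sketch. It assumes titles shaped like `Page:<file name>/<page number>` as in the example above; `parse_page_title` and `PageRef` are hypothetical names for illustration, and the `File:` prefix follows the File:<BASEPAGENAME> convention mentioned in this thread. Wikis with a localized Page namespace, or pages not backed by a multi-page file, would need extra handling.

```python
from typing import NamedTuple


class PageRef(NamedTuple):
    """File title and page number recovered from a Page: title (hypothetical helper type)."""
    file_title: str
    page: int


def parse_page_title(title: str) -> PageRef:
    """Split 'Page:<file name>/<page number>' into ('File:<file name>', page number)."""
    # Drop the namespace prefix (everything up to the first colon).
    _namespace, colon, rest = title.partition(":")
    if not colon:
        raise ValueError(f"no namespace prefix in title: {title!r}")
    # The page number is the last subpage component.
    base, slash, page = rest.rpartition("/")
    if not slash or not page.isdecimal():
        raise ValueError(f"no trailing page number in title: {title!r}")
    return PageRef(file_title="File:" + base, page=int(page))
```

Called on the title from the task description, this gives the file title and the page number 43, which are the two inputs the Imageinfo query needs.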

If you want the maximum width thumbnail, you first have to get the width, using a query like this. The file page should always be File:<BASEPAGENAME>, but you could use the Images API to check.

Once you have the desired width, you can then use this query to get the thumbnail. iiurlparam is set to page<PAGE>-<WIDTH>px.
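A rough Python sketch of the thumbnail query described above, in case it helps. The function names are hypothetical; the parameters are the standard `prop=imageinfo` ones. I also set `iiurlwidth` to the same width, on the assumption that the API wants it to be consistent with `iiurlparam`. The helpers only build the parameters and read the response, so you can send the request with whatever HTTP client you use (e.g. to `https://en.wikisource.org/w/api.php`).

```python
from typing import Any, Dict, Optional


def thumb_query_params(file_title: str, page: int, width: int) -> Dict[str, str]:
    """Build action=query parameters asking for a thumbnail of one page of a file."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",  # 'pages' comes back as a list rather than a dict
        "prop": "imageinfo",
        "titles": file_title,
        "iiprop": "url",
        "iiurlwidth": str(width),  # assumption: kept consistent with iiurlparam
        "iiurlparam": f"page{page}-{width}px",
    }


def extract_thumb_url(response: Dict[str, Any]) -> Optional[str]:
    """Return the first 'thumburl' in a formatversion=2 response, or None if absent."""
    for page in response.get("query", {}).get("pages", []):
        for info in page.get("imageinfo", []):
            if "thumburl" in info:
                return info["thumburl"]
    return None
```

For the page in the task description, that would be `thumb_query_params("File:The_First_Church_of_Christ,_Scientist,_and_Miscellany.djvu", 43, 1024)`, then `extract_thumb_url()` on the decoded JSON response.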

@Mpaa what's the use case here? There are two ways (non-exclusive) to go about this:

  • Embed the useful data in the pages JS
  • Provide an API to get the info from the server asynchronously

Maybe we should do both?

Change 740375 had a related patch set uploaded (by Inductiveload; author: Inductiveload):

[mediawiki/extensions/ProofreadPage@master] Add image URLs to JS config variables

https://gerrit.wikimedia.org/r/740375

Change 740532 had a related patch set uploaded (by Inductiveload; author: Inductiveload):

[mediawiki/extensions/ProofreadPage@master] WIP: Add image URLs to proofread API

https://gerrit.wikimedia.org/r/740532

Change 740375 merged by jenkins-bot:

[mediawiki/extensions/ProofreadPage@master] Add image URLs to JS config variables

https://gerrit.wikimedia.org/r/740375

The use case is to send the url of the image to an external service for OCR.

@Mpaa sure, but in what environment? The URLs for both the thumbnail and the full-size image are now available via the JS config variables, so you can get them from a gadget or script right now.

If you want to get the URL from, say, a Pywikibot script, that needs the other patch, which adds this to the action API, to be sorted out first (it is blocked on a test problem).

Aklapper changed the subtype of this task from "Task" to "Feature Request".