Add Bulk OCR button on Index namespace for each pagelist
Open, Needs Triage · Public

Description

Add a button labelled ‘Bulk OCR’ to Index namespace pages, directly above each pagelist section, that lets users OCR all the pages of that book in one go.

Details and specifics

  • This button and subsequent UI elements should only be visible to admin accounts on the wiki
  • One button per pagelist (possibly by targeting the .ws-index-pagelist-container element)
  • Clicking the button should trigger a series of sequential requests to the OCR API at ocr.wmcloud.org. Use Google as the default engine.
    • Only perform OCR requests for pages that are empty (pages marked without any colour on the Index page)
  • The response text obtained from the API should be inserted into the relevant page’s text layer
    • In case any page returns empty text, no text should be inserted into the page’s text layer
    • In case of error at any point, the whole sequence of operations should stop, preventing further transcription
  • If there was a successful transcription and the text was inserted, the status of the page should be changed to Not proofread (similar to pages marked with red on the Index page)
  • It would be helpful to have warning text close to the button that says ‘This feature is under active development and needs to be used with extreme caution.’ This message should probably be carried over into subsequent iterations until the entire flow is built.
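
The control flow described in the list above can be sketched roughly as follows. This is only an illustration of the sequencing rules (skip non-empty pages, insert nothing for empty OCR text, stop at the first error, mark saved pages as Not proofread). It is written in Python for brevity and is not the extension's actual code. The names Page, BulkOcrResult, bulk_ocr, fetch_ocr and save_page are hypothetical, the OCR call and the page save are passed in as callables rather than tied to any real API, and treating whitespace-only text as empty is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional

# Status label taken from the task text; how it maps onto the wiki's
# page-status machinery is left open here.
NOT_PROOFREAD = "Not proofread"


@dataclass
class Page:
    """Hypothetical stand-in for one entry of a pagelist."""
    title: str
    is_empty: bool  # True when the page has no colour on the Index page


@dataclass
class BulkOcrResult:
    """Hypothetical summary of one Bulk OCR run."""
    transcribed: List[str] = field(default_factory=list)  # text was inserted
    skipped: List[str] = field(default_factory=list)      # OCR returned no text
    failed_on: Optional[str] = None                       # page where the run stopped


def bulk_ocr(
    pages: Iterable[Page],
    fetch_ocr: Callable[[str, str], str],        # (title, engine) -> OCR text
    save_page: Callable[[str, str, str], None],  # (title, text, status)
    engine: str = "google",
) -> BulkOcrResult:
    result = BulkOcrResult()
    for page in pages:  # one page at a time, never in parallel
        if not page.is_empty:
            continue  # only pages without any colour are processed
        try:
            text = fetch_ocr(page.title, engine)
            if not text.strip():
                # Assumption: whitespace-only output counts as empty text.
                result.skipped.append(page.title)
                continue
            save_page(page.title, text, NOT_PROOFREAD)
        except Exception:
            # Any error stops the whole sequence.
            result.failed_on = page.title
            break
        result.transcribed.append(page.title)
    return result
```

Passing the OCR request and the save step in as callables keeps the sequencing logic testable without network access. A caller could use the returned summary to show which pages were transcribed and where the run stopped.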

Possible OOUI components

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change #1153779 had a related patch set uploaded (by Osuji pius; author: Osuji pius):

[mediawiki/extensions/Wikisource@master] bulkocr: add button to index namespace above pagelist

https://gerrit.wikimedia.org/r/1153779

Change #1153779 had a related patch set uploaded (by Osuji pius; author: Osuji pius):

[mediawiki/extensions/Wikisource@master] Add Bulk OCR to Index namespace

https://gerrit.wikimedia.org/r/1153779

Change #1153779 merged by jenkins-bot:

[mediawiki/extensions/Wikisource@master] Add Bulk OCR to Index namespace

https://gerrit.wikimedia.org/r/1153779

Hi, I would like to work on this task as a starting contribution. I am exploring the Bulk OCR feature and would like to implement the button.

Please guide me if there are any specific requirements.

Thank you!