
Wikisource OCR: Move Wikimedia OCR gadget to Wikisource extension
Closed, Invalid | Public | 5 Estimated Story Points

Description

As a Wikisource user, I want the Wikimedia OCR gadget to be moved to the Wikisource extension, so that all Wikisource users can view, access, and use the Wikimedia OCR tool with no special installation process required.

Resources:

Acceptance Criteria:

  • Move Wikimedia OCR gadget to the Wikisource extension
  • The API endpoint for the tool should be configurable (e.g. $wgWikisourceOcrEndpoint) and default to https://ws-google-ocr.toolforge.org/api.php (the current value), so that it can be customized for Beta Wikisource and for the new tool's URL (when that is available).
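
The second criterion could be sketched roughly as follows. This is only an illustration, assuming the endpoint reaches the frontend as a plain settings object: `OcrSettings` and `resolveOcrEndpoint` are hypothetical names, and only the default URL and the `$wgWikisourceOcrEndpoint` variable name come from this task.

```typescript
// Default taken from the task description (the gadget's current value).
const DEFAULT_OCR_ENDPOINT = "https://ws-google-ocr.toolforge.org/api.php";

// Hypothetical shape of the site configuration exposed to the frontend.
interface OcrSettings {
  // Mirrors the proposed $wgWikisourceOcrEndpoint; absent means "use the default".
  wikisourceOcrEndpoint?: string;
}

// Returns the configured endpoint, falling back to the default when it is unset or blank.
function resolveOcrEndpoint(settings: OcrSettings): string {
  const configured = settings.wikisourceOcrEndpoint?.trim();
  return configured ? configured : DEFAULT_OCR_ENDPOINT;
}
```

A wiki such as Beta Wikisource would then only need to set the one value (for instance a placeholder like `https://ocr.example.org/api.php`), while every other wiki keeps the default.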

Event Timeline

ifried renamed this task from Wikisource OCR: Move OCR gadgets to Wikisource extension [placeholder] to Wikisource OCR: Move OCR gadgets to Wikisource extension. Feb 25 2021, 4:03 PM
ifried updated the task description.
ifried renamed this task from Wikisource OCR: Move OCR gadgets to Wikisource extension to Wikisource OCR: Move Wikimedia OCR gadget to Wikisource extension. Feb 25 2021, 6:21 PM
ifried renamed this task from Wikisource OCR: Move Wikimedia OCR gadget to Wikisource extension to Wikisource OCR: Move Wikimedia OCR gadget to Proofread extension. Feb 25 2021, 6:24 PM
ifried updated the task description.
ifried renamed this task from Wikisource OCR: Move Wikimedia OCR gadget to Proofread extension to Wikisource OCR: Move Wikimedia OCR gadget to Proofread Page extension. Feb 25 2021, 6:26 PM
ifried updated the task description.

We discussed this in estimation today. We need to figure out how to handle wikis that do not find the OCR tool useful and would not want it turned on by default. Would it be a preference? Would it be something we only do for some wikis? To be discussed in greater detail during our next estimation.

Samwilson subscribed.

This will introduce a new dependency to ProofreadPage, and it needs to be optional (for third-party users of the extension). It could be done as a configuration variable, e.g. $wgProofreadPageOcrEndpoint = 'https://ocr.wmcloud.org/'; which could then be enabled as required for specific Wikimedia wikis. It might also be a user preference, so individual users could turn off the OCR button.
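
A minimal sketch of how those two switches could combine, assuming the feature stays off unless a wiki sets an endpoint. `OcrContext`, `userDisabledOcr`, and `shouldShowOcrButton` are hypothetical names for illustration, not existing ProofreadPage settings or APIs.

```typescript
// Hypothetical inputs; neither field is an existing ProofreadPage setting.
interface OcrContext {
  // Mirrors the suggested $wgProofreadPageOcrEndpoint.
  // null or an empty string means the wiki has not enabled OCR.
  endpoint: string | null;
  // Hypothetical user preference; true means the user turned the OCR button off.
  userDisabledOcr: boolean;
}

// The button is shown only when the site has configured an endpoint
// and the individual user has not opted out.
function shouldShowOcrButton(ctx: OcrContext): boolean {
  const siteEnabled = ctx.endpoint !== null && ctx.endpoint.trim() !== "";
  return siteEnabled && !ctx.userDisabledOcr;
}
```

Under this assumption, third-party installs that never set the variable see no OCR button, and a Wikimedia wiki that sets it still lets each user hide the button.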

Here are all the gadgets called ocr or GoogleOCR (there might be other names for similar functionality):

Some points that were brought up tonight while discussing this ticket:

  • This may be better done later in the project, after we have revamped Wikimedia OCR with improvements that justify making it the default
    • Example of improvement: adding OCR engines found in Basic OCR & IndicOCR
  • If we did this work, it would be visible to all MediaWiki users, including third-party users, so there should ideally be a way to turn it on or off (so that, if they use their own OCR service, they don't need to deal with ours)
ifried renamed this task from Wikisource OCR: Move Wikimedia OCR gadget to Proofread Page extension to Wikisource OCR: Move Wikimedia OCR gadget to Wikisource extension. Apr 8 2021, 5:11 PM
ifried updated the task description.
ARamirez_WMF set the point value for this task to 5. Apr 8 2021, 5:20 PM
ARamirez_WMF moved this task from Needs Discussion to Up Next (May 6-17) on the Community-Tech board.

Design work is under way for this task, so the functionality might change significantly. T280848 has been created to track the work once the design has been completed.