Wikisource: Create Toolforge Staging Tool for Wikimedia OCR
Closed, Resolved · Public · 3 Estimated Story Points

Description

Acceptance Criteria:

Event Timeline

Restricted Application added a subscriber: Aklapper. Mar 25 2021, 5:37 PM
ARamirez_WMF set the point value for this task to 3. Mar 25 2021, 5:44 PM
ifried renamed this task from Create Toolforge Staging Tool for Wikimedia OCR to Wikisource OCR: Create Toolforge Staging Tool for Wikimedia OCR. Mar 25 2021, 5:47 PM
ifried renamed this task from Wikisource OCR: Create Toolforge Staging Tool for Wikimedia OCR to Wikisource: Create Toolforge Staging Tool for Wikimedia OCR.

I didn't look closely enough, but now realise that this probably depends on T278438: Wikisource OCR: determine staging tool for wikimedia ocr [4H]. However, I already created the ocr-test tool: https://toolsadmin.wikimedia.org/tools/id/ocr-test

I'll document the plan in the subtask though.

Mentioned in SAL (#wikimedia-cloud) [2021-03-30T01:35:55Z] <wm-bot> <samwilson> T278461. Test site is up and running at https://ocr-test.toolforge.org/ .

Sorry for the confusion! This task is for creating the staging tool on Toolforge, while T278438 is about how we will connect to it for testing purposes (staging version of the gadget, secret URL parameter to use the staging tool, or maybe after moving the gadget to the Wikisource extension we have a config variable pointing to staging URL only on the Beta cluster and/or CommTech wiki, etc.).

dom_walden added a subscriber: dom_walden.

I have briefly tested https://ocr-test.toolforge.org, and it seems to work fine.

To test:

  1. Find an image on https://commons.wikimedia.org (e.g. https://commons.wikimedia.org/wiki/File:Chesham_Cemetery_%E2%80%93_20200803_121424_(50332043101).jpg)
  2. Click the image to find the https://upload.wikimedia.org URL (e.g. https://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/Chesham_Cemetery_%E2%80%93_20200803_121424_%2850332043101%29.jpg/800px-Chesham_Cemetery_%E2%80%93_20200803_121424_%2850332043101%29.jpg)
  3. Enter the upload.wikimedia.org URL into https://ocr-test.toolforge.org

I also tried to set up the OCR gadget on beta (go to https://en.wikisource.beta.wmflabs.org/wiki/Special:Preferences#mw-prefsection-gadgets and check gadget-GoogleOCR), but it does not work at the moment, I think due to the error "Image URL must begin with 'https://upload.wikimedia.org/'" (I think this is what MusikAnimal was talking about above).
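For anyone repeating the test steps, here is a rough Python sketch of the prefix check implied by that error message. It is only an illustration: the function name `is_accepted_image_url` is hypothetical, and the tool's real validation may differ. The only assumption is that it requires the URL to start with the prefix quoted in the error.

```python
# Prefix taken from the error message quoted above.
REQUIRED_PREFIX = "https://upload.wikimedia.org/"


def is_accepted_image_url(url: str) -> bool:
    """Return True if the URL starts with the required upload prefix.

    Hypothetical helper: it mirrors the wording of the error message,
    not the tool's actual code.
    """
    return url.strip().startswith(REQUIRED_PREFIX)


if __name__ == "__main__":
    # The Commons file page from step 1 (not an upload URL).
    page_url = (
        "https://commons.wikimedia.org/wiki/"
        "File:Chesham_Cemetery_%E2%80%93_20200803_121424_(50332043101).jpg"
    )
    # The upload URL from step 2.
    upload_url = (
        "https://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/"
        "Chesham_Cemetery_%E2%80%93_20200803_121424_%2850332043101%29.jpg/"
        "800px-Chesham_Cemetery_%E2%80%93_20200803_121424_%2850332043101%29.jpg"
    )
    for candidate in (page_url, upload_url):
        status = "accepted" if is_accepted_image_url(candidate) else "rejected"
        print(f"{status}: {candidate}")
```

Under that assumption, any image URL served from a host other than upload.wikimedia.org would be rejected, which would be consistent with the gadget failing on beta.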

I have tested https://ocr-test.toolforge.org, and it is working as expected. I'm marking this work as Done.