Create a ProofreadPage wikitext editor user script for Wikisource which uses Google Vision API to do OCR
Closed, Resolved | Public | 5 Estimated Story Points

Description

Create a wikitext editor user script that adds an OCR button in the Wikitext editor. On clicking the button, the script queries the Google Vision API (via a wrapper API on Tool Labs) with the book image shown on that page (see T140037#2528369 and previous comments there). It then populates the content div with the text returned by the API.

Note that it needs to grab the content language (mw.config.get( 'wgContentLanguage' )) and pass that to the API.
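
The request described above might be assembled roughly as follows. This is only a minimal sketch: the base URL and the `image` and `lang` parameter names are assumptions made for illustration, not the actual interface of the Tool Labs wrapper. In the user script, the language would come from mw.config.get( 'wgContentLanguage' ).

```javascript
/**
 * Build the URL for an OCR request.
 *
 * @param {string} apiBase Base URL of the wrapper API (hypothetical).
 * @param {string} imageUrl URL of the page scan to send for OCR.
 * @param {string} lang Content language code, e.g. from wgContentLanguage.
 * @return {string} Full request URL with encoded query parameters.
 */
function buildOcrRequestUrl( apiBase, imageUrl, lang ) {
	var params = new URLSearchParams();

	// Parameter names are assumed; adjust to whatever the wrapper expects.
	params.set( 'image', imageUrl );
	params.set( 'lang', lang );
	return apiBase + '?' + params.toString();
}
```

Keeping the URL construction in its own function means the encoding of the image URL can be checked without loading an edit page.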

See existing gadget that uses a different OCR service at https://en.wikisource.org/wiki/MediaWiki:Gadget-ocr.js.

Make sure this user script works in both the standard and enhanced edit toolbars. (See code in customiseToolbar() in https://en.wikisource.org/wiki/MediaWiki:Gadget-ocr.js.)
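
Supporting both toolbars starts with working out which one the user has enabled. The sketch below assumes the enhanced toolbar is controlled by the `usebetatoolbar` user preference; the function name and the injected `getOption` callback are hypothetical, and registering the button with each toolbar is deliberately left out.

```javascript
/**
 * Decide which toolbar the OCR button should be added to.
 *
 * @param {Function} getOption Returns a user preference by name,
 *  for example a wrapper around mw.user.options.get.
 * @return {string} Either 'enhanced' or 'standard'.
 */
function detectToolbar( getOption ) {
	// The preference may arrive as a number or as a numeric string.
	var enhanced = Number( getOption( 'usebetatoolbar' ) ) === 1;

	return enhanced ? 'enhanced' : 'standard';
}
```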

Event Timeline

kaldari renamed this task from Create a wikitext editor gadget for Wikisource OCR to Create a wikitext editor user script for Wikisource uses Google Vision API to do OCR. Aug 11 2016, 10:57 PM
kaldari updated the task description.

When you say "wikitext editor", which of them do you mean? I'm assuming the modified WikiEditor integrated into ProofreadPage?

Yep, modified wikitext editor on ProofreadPage.

Jdforrester-WMF renamed this task from Create a wikitext editor user script for Wikisource uses Google Vision API to do OCR to Create a ProofreadPage wikitext editor user script for Wikisource which uses Google Vision API to do OCR. Aug 15 2016, 4:24 PM
Jdforrester-WMF added a project: ProofreadPage.
DannyH set the point value for this task to 5. Aug 30 2016, 5:43 PM
kaldari updated the task description.

I've started a draft of this script at https://en.wikisource.org/wiki/User:Samwilson/GoogleOcr.js (with a different coloured icon to the existing OCR button).

Samwilson moved this task from Ready to In Development on the Community-Tech-Sprint board.

Is there any established method of indicating progress for toolbar buttons, for processes that take some time? At the moment, my script is using e.g. mw.notify( mw.message( 'google-ocr-request-sent' ) ); and I'm going to make it disable both the button and the main text box for the duration of the request. But perhaps there's already a system for this sort of thing?
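
One way to disable the button and text box for the duration of the request is sketched below. The helper name `withBusyState` is hypothetical and this is not the code in the draft script; it only assumes that each control exposes a boolean `disabled` property, as DOM buttons and textareas do.

```javascript
/**
 * Disable some controls while an asynchronous task runs.
 *
 * @param {Object[]} controls Objects with a boolean `disabled` property.
 * @param {Function} task Function that returns a Promise.
 * @return {Promise} Settles the same way as the Promise returned by the task.
 */
function withBusyState( controls, task ) {
	var setDisabled = function ( value ) {
		controls.forEach( function ( control ) {
			control.disabled = value;
		} );
	};

	setDisabled( true );
	// Re-enable the controls whether the task succeeds or fails.
	return Promise.resolve()
		.then( task )
		.finally( function () {
			setDisabled( false );
		} );
}
```

In the script, the call to mw.notify could be made just before invoking such a helper, with the OCR request as the task.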

@Samwilson: Not that I know of. Typically toolbar button functions are more or less instant, so there probably isn't anything built in.

What are your thoughts on the placement of the button? The existing gadget adds the OCR button in the main toolbar, but it looks like your script adds it to the Proofread tools submenu. Do you think folks will find it there?

@kaldari: Okay, well I'll carry on with a custom spinning gif and message box. :) Seems to work. @Bodhisattwa mentioned that it'd be cool to have a progress bar, and I quite agree, but I think it's a bit too tricky for now. :(

And yeah, I wasn't sure about the toolbar placement. I feel like it belongs more in the 'proofread tools' submenu, but yeah perhaps it's a bit hard to find. I also figured that although people are unlikely to have both gadgets loaded at the same time, if they do then that first bit of the toolbar might get stretched a bit far with these double-width icons.

Am I on the right track with the system messages? I haven't created any yet (on EN or elsewhere), just listing them at Wikisource:Google OCR. Is it worth adding default (English) messages, so at least something is displayed?
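
The fallback idea could look something like the sketch below. The key 'google-ocr-request-sent' is the one used in the script; the English wording, the function name, and the shape of the `siteMessages` object are assumptions for illustration only.

```javascript
// Hypothetical built-in English defaults, used when a wiki has not
// created the corresponding system message.
var defaultMessages = {
	'google-ocr-request-sent': 'OCR request sent; waiting for a response.'
};

/**
 * Look up a message, preferring the wiki's own text over the default.
 *
 * @param {string} key Message key.
 * @param {Object} siteMessages Messages defined on the wiki, keyed by name.
 * @return {string} The wiki's text, the English default, or a visible
 *  placeholder containing the key if neither exists.
 */
function getMessageText( key, siteMessages ) {
	var has = Object.prototype.hasOwnProperty;

	if ( has.call( siteMessages, key ) ) {
		return siteMessages[ key ];
	}
	if ( has.call( defaultMessages, key ) ) {
		return defaultMessages[ key ];
	}
	return '<' + key + '>';
}
```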

@Samwilson: Ultimately, any given wiki should only have 1 of the 2 gadgets available (basically wikisource languages that are supported by Google OCR API, but not phetools/Tesseract), so I don't think having 2 buttons on the same toolbar will be an issue. Personally, I would favor having the new gadget also be in the main toolbar so that it is easy to find and the placement is consistent between wikis.

@kaldari: cool, good point. I've moved the button back to the main section. The message-loading is working correctly too.

The script now lives at https://wikisource.org/wiki/Wikisource:Google_OCR/script.js (and its previous locations are redirects or have been fixed to load it from there). I've run it through jshint and it only complains about the multi-line string concatenation... any suggestions for better form there?

Or other things that look bad? Like inline CSS; should that be pulled out into a stylesheet?

@Samwilson: There are a couple of coding conventions that I would normally hold you to if it were MediaWiki code, but since it's only a gadget I don't think we have to be too strict. I'll leave them up to you.

First is that we normally declare all function variables at the beginning of the function:
https://www.mediawiki.org/wiki/Manual:Coding_conventions/JavaScript#Declarations

Second is that we normally document parameters and return values of functions:
https://www.mediawiki.org/wiki/Manual:Coding_conventions/JavaScript#Documentation_comments
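
As an illustration of both conventions, here is a small made-up function (not taken from the script) with its variables declared at the top and its parameter and return value documented:

```javascript
/**
 * Strip trailing whitespace from every line of OCR output.
 *
 * @param {string} text Raw text returned by the OCR service.
 * @return {string} Text with trailing spaces and tabs removed from each line.
 */
function trimOcrLines( text ) {
	var lines, i, cleaned;

	lines = text.split( '\n' );
	cleaned = [];
	for ( i = 0; i < lines.length; i++ ) {
		cleaned.push( lines[ i ].replace( /[ \t]+$/, '' ) );
	}
	return cleaned.join( '\n' );
}
```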

I wouldn't worry about the multi-line strings or inline styles (which are minimal).

Thanks @kaldari :-)

I've fixed documentation and variable declarations up a bit, and switched to constructing HTML with jQuery.