
Add 'Indic OCR' (Google Drive API) to Wikimedia OCR tool
Open, Needs Triage, Public

Description

The Indic OCR tool uses the Google Drive API for running OCR. This OCR engine is different to Google's Cloud Vision API, and gives better results for some sorts of text documents.
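For reference, OCR through the Drive API (v3) is normally triggered by uploading an image with a Google Docs target MIME type plus an `ocrLanguage` hint, then exporting the resulting document as plain text. The following is only a minimal sketch of those two requests, not how indic-ocr or Wikimedia OCR actually implement it: the function names, the `ocr-input` file name, and the token value are hypothetical, and nothing is sent over the network.

```python
import json
import uuid
from urllib.parse import urlencode
from urllib.request import Request

UPLOAD_URL = "https://www.googleapis.com/upload/drive/v3/files"
EXPORT_URL = "https://www.googleapis.com/drive/v3/files/{file_id}/export"
GOOGLE_DOC_MIME = "application/vnd.google-apps.document"


def build_ocr_upload(image_bytes, image_mime, lang, token, boundary=None):
    """Build (but do not send) a multipart upload that asks Drive to
    convert an image into a Google Doc, which is what runs the OCR."""
    boundary = boundary or uuid.uuid4().hex
    # Requesting a Google Docs MIME type for an image upload triggers conversion.
    # "ocr-input" is a placeholder file name (assumption).
    metadata = {"name": "ocr-input", "mimeType": GOOGLE_DOC_MIME}
    body = b"".join([
        f"--{boundary}\r\nContent-Type: application/json; charset=UTF-8\r\n\r\n".encode(),
        json.dumps(metadata).encode(),
        f"\r\n--{boundary}\r\nContent-Type: {image_mime}\r\n\r\n".encode(),
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    # ocrLanguage is a language hint (ISO 639-1 code), e.g. "hi" for Hindi.
    query = urlencode({"uploadType": "multipart", "ocrLanguage": lang})
    return Request(
        f"{UPLOAD_URL}?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/related; boundary={boundary}",
        },
    )


def build_text_export(file_id, token):
    """Build (but do not send) the request that exports the converted
    document as plain text, i.e. the OCR result."""
    query = urlencode({"mimeType": "text/plain"})
    return Request(
        EXPORT_URL.format(file_id=file_id) + "?" + query,
        headers={"Authorization": f"Bearer {token}"},
    )
```

A caller would send the first request with a valid OAuth token, read the file ID from the JSON response, send the second request to fetch the text, and would probably want to delete the temporary document afterwards.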

In 2016 we (Community-Tech) were under the impression that we were not allowed to access Google Drive programmatically, but I can't now find a canonical source for why that is. Perhaps it used to be part of the terms but no longer is? Also, we were told by Google that the Drive OCR was soon (i.e. in 2017) going to be integrated with Cloud Vision and that there'd only be one OCR API. That seems never to have happened.

I suggest we either:

  1. add Drive API as an option to the 'Transcribe text' dropdown menu and to the OCR tool; or
  2. make it easy for the indic-ocr tool to do this (in case the WMF is for some reason not allowed to do it itself, although the terms of service don't suggest that this is the case).

The primary benefit of doing this would be that all OCR tools would be available via the same UI, and things would be less confusing. It would also be possible to use the existing cropping functionality to OCR only part of a page via Drive (which isn't possible with indic-ocr at the moment).

Event Timeline

Restricted Application added a subscriber: Aklapper.

There are some existing Toolforge tools that interface with Google Drive already, so I think we're okay there. I agree it would be nice to add this to Wikimedia OCR. I'd be happy to help with code review, too.