For a long time, Indic-language Wikisource projects depended entirely on manual proofreading, which wasted a lot of time and energy. Recently Google released OCR software for more than 20 Indic languages. This software is far better and more accurate than previous OCR tools, but using it has limitations. Uploading the same large file twice (once to Google for OCR and once to Commons) is not practical for most contributors, because Internet connections in India are often slow. What I suggest is to develop a tool that can feed PDF or DjVu files already uploaded to Commons directly to Google OCR, so that uploading them twice can be avoided. -- Bodhisattwa (talk) 13:50, 10 November 2015 (UTC)
This card tracks a proposal from the 2015 Community Wishlist Survey: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey
This proposal received 39 support votes, and was ranked #25 out of 107 proposals. https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Wikisource#Tool_to_use_Google_OCRs_in_Indic_language_Wikisource
'''Update'''
- OCR4Wikisource Python script (uses the Google Drive API; works only on Linux)
- ws-google-ocr tool on Tool Labs (uses the Cloud Vision API)
- Google OCR JavaScript in Multilingual Wikisource (uses the Cloud Vision API)
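To illustrate the idea behind the proposal (passing a file that is already hosted online to the OCR service by URL instead of uploading it a second time), here is a minimal sketch of a request body for the Cloud Vision API `images:annotate` endpoint. It is not taken from any of the tools listed above. The function name `build_ocr_request`, the example image URL, and the choice of the `DOCUMENT_TEXT_DETECTION` feature are assumptions made for illustration, and the sketch only builds the JSON body; it does not send it.

```python
import json

# Public REST endpoint of the Cloud Vision API for image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_url, language_hint):
    """Build the JSON body for one OCR request (hypothetical helper).

    image_url is assumed to be a direct, publicly reachable URL of a
    single page image, so the file does not need to be uploaded again.
    language_hint is a language code such as "bn" for Bengali.
    """
    return {
        "requests": [
            {
                # Point the service at the remote image by URL.
                "image": {"source": {"imageUri": image_url}},
                # Ask for dense-text OCR (assumed choice for book scans).
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                # Hint the expected language of the text.
                "imageContext": {"languageHints": [language_hint]},
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder URL, not a real Commons file.
    body = build_ocr_request("https://example.org/scan-page-1.jpg", "bn")
    print(json.dumps(body, indent=2))
```

Sending this body would require an authenticated POST to `VISION_ENDPOINT`, which is left out here. PDF and DjVu files would also need to be rendered to per-page images first, a step this sketch does not cover.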