Feature summary (what you would like to be able to do and where):
Currently, there is no bulk OCR tool with which we can easily OCR all the pages of one or more texts. OCR4wikisource stopped working a long time ago.
I think it would be good to have an option to OCR all the pages in a given Index from the Index page itself.
Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):
While the current OCR feature works in many cases, every single page still has to be OCRed individually before it can be proofread, which adds extra work for each page.
Benefits (why should this be implemented?):
This feature will be especially useful during campaigns or activities in which newbies are starting to edit Wikisource. They would not need to worry about choosing the correct engine or language and could focus on the key task, that is, proofreading. This should speed up proofreading by removing the repetitive step of running OCR on every page.
How should this be implemented?:
- To ensure that recent changes are not flooded with these edits, this feature can be limited to approved bot accounts.
- For quality control, there could be a pop-up that shows a few sample pages, each with its image and OCR output, before the tool runs through all the pages in an Index.
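
The two points above could fit together roughly as in the sketch below. This is only an illustration of the flow, not a proposed implementation: `pick_preview_pages`, `bulk_ocr`, `run_ocr`, `confirm` and `is_approved_bot` are all hypothetical names, the OCR call and the pop-up are passed in as plain functions, and the default of three sample pages is an assumption.

```python
from typing import Callable, Dict, List


def pick_preview_pages(page_count: int, sample_size: int = 3) -> List[int]:
    """Pick up to sample_size 1-based page numbers spread evenly over the Index."""
    if page_count <= 0 or sample_size <= 0:
        return []
    if sample_size >= page_count:
        return list(range(1, page_count + 1))
    if sample_size == 1:
        return [1]
    # Integer arithmetic keeps the first and last page in the sample.
    return [1 + (i * (page_count - 1)) // (sample_size - 1)
            for i in range(sample_size)]


def bulk_ocr(
    page_count: int,
    run_ocr: Callable[[int], str],              # hypothetical: OCR one page, return its text
    confirm: Callable[[Dict[int, str]], bool],  # hypothetical: stands in for the preview pop-up
    is_approved_bot: bool,                      # assumption: caller has already checked the account
    sample_size: int = 3,
) -> Dict[int, str]:
    """Preview a few pages, then OCR the whole Index only if the preview is accepted."""
    if not is_approved_bot:
        raise PermissionError("bulk OCR is limited to approved bot accounts")

    # Step 1: OCR only the sample pages and show them for quality control.
    preview = {n: run_ocr(n) for n in pick_preview_pages(page_count, sample_size)}
    if not confirm(preview):
        return {}

    # Step 2: OCR the remaining pages, reusing the text already produced for the sample.
    results: Dict[int, str] = {}
    for n in range(1, page_count + 1):
        results[n] = preview[n] if n in preview else run_ocr(n)
    return results
```

For a 10-page Index with the default sample size, the preview would cover pages 1, 5 and 10. If the user rejects the preview, nothing else is OCRed and no edits would be made.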