Steps to replicate the issue (include links if applicable):
- Find an index page on English Wikisource (for example, the index for the Local Government Act 1972) and open a page (e.g. page 1 of the Act).
- Select Tesseract, Google or Transkribus as the OCR engine.
- Click "transcribe text" in the upper right-hand corner.
What happens?:
- No text is generated, and an error message pops up that depends on the selected engine:
  - Tesseract: Error from the OCR tool: Image retrieval failed: HTTP/2 429 returned for <JPG link of the PDF page>
  - Google: Error from the OCR tool: The Google service returned an error: We can not access the URL currently. Please download the content and pass it in.
  - Transkribus: Error from the OCR tool: Error Code '500' :: Unable to complete request, try again!
- In some cases the text is generated properly, but only after multiple clicks.
What should have happened instead?:
The OCR tools should work within one or two clicks and generate the text of the source PDF page.
Other information (browser name/version, screenshots, etc.):
- Might be related to T332125, T337495 and T296912.
- Mentioned on the English Wikisource Scriptorium. Xover suggests that this may be caused by an IP-related problem, and reports that European users are seriously affected.
- My IP is in Hong Kong.
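- Since repeated clicks sometimes succeed, the HTTP 429 case looks like it could be handled by retrying with a delay. The following is only a rough sketch of that idea, not how the OCR tool actually fetches images: `fetch_with_backoff`, its parameters and the simulated responses are all hypothetical names made up for illustration.

```python
import time
from typing import Callable, Tuple

# Hypothetical helper: not part of the OCR tool or any Wikimedia library.
def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes, int]:
    """Call fetch() until it stops returning HTTP 429 or attempts run out.

    fetch must return (status_code, body). The delay doubles after each
    429 response: base_delay, 2 * base_delay, 4 * base_delay, ...
    Returns (status_code, body, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        # Stop on anything other than 429, or when out of attempts.
        if status != 429 or attempt == max_attempts:
            return status, body, attempt
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


if __name__ == "__main__":
    # Simulated responses (assumption): two rate-limited replies, then success.
    simulated = iter([(429, b""), (429, b""), (200, b"image bytes")])
    status, body, attempts = fetch_with_backoff(
        lambda: next(simulated),
        sleep=lambda seconds: None,  # skip real waiting in the demo
    )
    print(status, attempts)
```

  In a real client, `fetch` would wrap the actual image request, and a `Retry-After` header from the server, if present, would probably be a better guide than a fixed doubling delay.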