User Details
- User Since
- Nov 26 2014, 3:27 AM (511 w, 4 d)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- 維基小霸王 [ Global Accounts ]
Mar 25 2024
Mar 19 2024
It has been a month and there are still no updates to the vision release notes.
Feb 26 2024
The documentation states:
Feb 23 2024
Please reproduce the bug by running the following code in the debug console on Commons:
Jan 29 2024
Any response?
Jan 24 2024
} elseif ( $isEncoding && ctype_digit( $k ) ) {
	// json_decode currently doesn't return integer keys for {}
	$isSequence = $next++ === (int)$k;
} else {
Jan 19 2024
Great! Hope it will solve the problem.
Does anyone know how to get Google to change this?
Jan 8 2024
Dec 1 2023
Aug 30 2023
The solution would be easy. Just write a bot that downloads a PDF from Commons and converts it to JPG files locally. Upload every JPG to Google, get the OCRed text, and have the bot put the text on Wikisource.
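The steps above could be sketched roughly as follows. This is only a minimal outline under stated assumptions, not a working bot: it assumes Poppler's `pdftoppm` as the local PDF-to-JPG converter (the comment does not name a tool), and `ocr` and `publish` are hypothetical callables standing in for the Google OCR request and the Wikisource edit.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def pdftoppm_command(pdf: Path, prefix: Path, dpi: int = 300) -> List[str]:
    """Build a Poppler `pdftoppm` call that renders every PDF page to a JPEG.

    Assumption: Poppler is the converter; the command is only built here,
    so the caller decides how to run it (for example with subprocess.run).
    """
    return ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf), str(prefix)]


def run_bot(
    page_images: Iterable[bytes],
    ocr: Callable[[bytes], str],
    publish: Callable[[int, str], None],
) -> int:
    """OCR each page image in order and publish its text.

    `ocr` and `publish` are hypothetical placeholders: `ocr` takes the raw
    JPEG bytes and returns text, `publish` receives the 1-based page number
    and the text. Returns the number of pages processed.
    """
    count = 0
    for number, image in enumerate(page_images, start=1):
        publish(number, ocr(image))
        count += 1
    return count
```

Passing the OCR and upload steps in as functions keeps the loop independent of any particular Google or MediaWiki client library, so each part can be swapped or tested separately.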
Jul 3 2023
Jun 26 2023
A limitation of Google OCR has been found: it cannot recognize punctuation marks placed outside the vertical text lines. This was a common typesetting practice during the Chinese Republican era. For example, in this image, no punctuation marks were recognized. Are there any options available on Google to recognize them?
I think the problem lies with Wikimedia Commons being slow to respond. If someone manually opens a rarely accessed book in a browser and picks a random page to view, the server might not display it immediately; it may take some time. The server should extract the pages from PDF files, cache them as image files, and then display them. For OCR, there should be a dedicated tool that downloads the entire PDF file, converts it to images, and then sends them to Google for OCR processing.
Jun 5 2023
I want to run many things at the same time. Thanks.
May 31 2023
Yes. We need many technical contributors on zhws.
May 30 2023
Google actually OCRs every image PDF it indexes. See the cached pages for
Google OCR cannot recognize out-of-line punctuation in Chinese vertical text.
Dec 6 2022
@fnegri Done.
Now tw/pdf is under 100G. I shall close this issue.
May 7 2022
Sep 20 2020
Can anyone add it please?
Sep 3 2020
Aug 31 2020
Aug 22 2020
Jun 5 2020
A function that removes the space produced by line breaks is still badly needed on Chinese Wikisource. Line breaks are kept to help proofreading.
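To illustrate the behavior being requested, here is a minimal sketch that joins lines without inserting a space when both sides of the line break are CJK characters. It is not how MediaWiki or any extension implements this; the function name and the character ranges are assumptions chosen for the example, and the ranges are not exhaustive.

```python
import re

# Han ideographs plus CJK and full-width punctuation.
# Illustrative ranges only; rarer extension blocks are not covered.
CJK = r"\u3000-\u303f\u3400-\u4dbf\u4e00-\u9fff\uff00-\uffef"

# A single line break with a CJK character directly before and after it.
_CJK_BREAK = re.compile(rf"(?<=[{CJK}])\n(?=[{CJK}])")


def join_cjk_lines(text: str) -> str:
    """Drop line breaks between two CJK characters.

    Blank lines (paragraph breaks) and breaks next to Latin text
    are left untouched.
    """
    return _CJK_BREAK.sub("", text)
```

With this rule the source text can keep one printed line per line for proofreading, while the joined output shows no gap between the characters.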
Dec 28 2019
Dec 2 2019
Nov 17 2019
Nov 16 2019
Sep 24 2019
Aug 9 2019
Jul 24 2019
Dec 5 2016
The space added between lines affects all pages regardless of namespace, so fixing this problem should involve changing something other than this proofreading extension.
Thank you.
Dec 4 2016
Why has there been no progress so far?
Nov 26 2014
Why has there been no progress so far?