Arcorann
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 10 2024, 10:52 AM (14 w, 1 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Arcorann [ Global Accounts ]

Recent Activity

May 29 2024

Arcorann added a comment to T339845: Investigate alternatives to ghostscript for PDF thumbnailing.

While we're here, can we also implement something that doesn't have so many image issues when thumbnailing PDFs? While proofreading for Wikisource I ran into yet another issue, in which the text simply fails to render on the image (https://commons.wikimedia.org/w/index.php?title=File%3AThe_sayings_of_Confucius%3B_a_new_translation_of_the_greater_part_of_the_Confucian_analects_(IA_sayingsofconfuci00confiala).pdf&page=28). I then went to the original PDF hosted on Commons and could easily read the text off from there. Having had to do the same for a number of other works because of blurred text, I really think we can do a lot better than what we have now.

May 29 2024, 2:24 AM · Thumbor

Apr 16 2024

Arcorann added a comment to T359703: Add a "Bulk OCR" feature to Index Pages on Wikisource.

I'd like to add a point to the "overhead" comments: while EditInSequence is intended to reduce much of that overhead, it's currently hampered by several bugs when using it to create pages (notably T340986, where the text layer doesn't appear, forcing the editor to OCR manually, and to a lesser extent T360282, where the index header/footer isn't loaded). I suspect caching OCR would be a nice option to add to EditInSequence, but these bugs ought to be fixed first IMO.

Apr 16 2024, 12:11 PM · Community-Tech, Wikimedia OCR