Arcorann
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 10 2024, 10:52 AM (48 w, 3 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Arcorann [ Global Accounts ]

Recent Activity

Dec 10 2024

Arcorann created T381858: Side page links do not update when font size is changed (Vector 2022).
Dec 10 2024, 11:14 AM · All-and-every-Wikisource, Desktop Improvements (Vector 2022)

Oct 4 2024

Arcorann added a comment to T375838: Uploading backlog stucked.

For Wikisource, use the DjVu option "from original scans (JP2)" instead. This is currently preferred to uploading as PDF because of the PDF-handling issues I described in T363619.

Oct 4 2024, 2:00 AM · IA Upload

Aug 26 2024

Arcorann added a comment to T363619: Remove option for PDF → DjVu conversion (phetools).

Regarding the comment that "the original PDFs can be uploaded directly": our handling of PDFs currently has enough issues that DjVu is still recommended over PDF on enWS. The most notable are bad text layer extraction (see T242169) and bad thumbnail generation (see e.g. T224355 and linked issues; also note the related issue T339845).

Aug 26 2024, 12:18 AM · IA Upload

May 29 2024

Arcorann added a comment to T339845: Investigate alternatives to ghostscript for PDF thumbnailing.

While we're here, can we also implement something that doesn't have so many image issues when thumbnailing PDFs? While proofreading for Wikisource I ran into yet another issue in which the text somehow just fails to render on the image (https://commons.wikimedia.org/w/index.php?title=File%3AThe_sayings_of_Confucius%3B_a_new_translation_of_the_greater_part_of_the_Confucian_analects_(IA_sayingsofconfuci00confiala).pdf&page=28), yet I could easily read the text off the original PDF hosted on Commons. I have had to do the same for a number of other works because of blurred text. I really think we can do a lot better than what we have now.

May 29 2024, 2:24 AM · Thumbor

Apr 16 2024

Arcorann added a comment to T359703: Add a "Bulk OCR" feature to Index Pages on Wikisource.

I'd like to add a point to the "overhead" comments -- while EditInSequence is intended to reduce much of that overhead, it's currently hampered by several bugs when using it to create pages (notably T340986, where the text layer doesn't appear, forcing the editor to OCR manually, and to a lesser extent T360282 where the index header/footer isn't loaded). I suspect caching OCR would be a nice option to add to EditInSequence, but these bugs ought to be fixed first IMO.

Apr 16 2024, 12:11 PM · Community-Tech, Wikimedia OCR