Wed, Mar 31
I'm not sure if that would work because that would require parsing the text in a new format that we would have to define -- for example, newlines might be significant in this format whereas they weren't previously. There are probably also multiple possible ways to resolve the file name -- could be on another wiki, for example. I think it makes more sense to just use the existing wikitext facility for making links. Someone could make a tool to generate the wikitext from a list like you gave, though.
Tue, Mar 30
Thanks for explaining the issues with PDF. I expected it would be something like that but it's good to see it all spelled out.
Feb 3 2021
Feb 1 2021
Jan 29 2021
Jan 11 2021
Just created a new task for this.
Jan 8 2021
Dec 31 2020
Joseagush explains the use case in the link to the talk page in the task description. The contributors in Bali are cataloging a large number of manuscripts and converting to PDF is an additional step which is not trivial for them and has resulted in decreased image resolution with their attempts so far. I'm not going to say there is no way to get around this but I also think it's important to make Wikisource accessible to smaller communities like this. They've made a lot of progress already in learning the technical aspects of Wikimedia sites. Just keep in mind that their capacity is still limited, so if they need to learn any new processes (such as img2pdf which would require them to use Python on Windows, use the command line, etc.) then we should be sure it's really needed.
Dec 29 2020
Dec 17 2020
Dec 15 2020
Oct 24 2020
Sep 17 2020
Sep 9 2020
I did a quick test and it seems to work. Haven't tested with a proper OAuth library.
Aug 20 2020
Is there a way to cross-link Wikisource pages across editions, similar to how Wikipedia links articles? Does that even make sense?
Aug 19 2020
Unfortunately I don't have a lot of time to work on it at the moment. There are at least two things that would be involved in porting it:
Aug 17 2020
Any chance we can get this merged soon?
I just tested this locally. The issue arises from ProofreadPage's code in getImageWidth. If the width is not set in the index, the default (self::DEFAULT_IMAGE_WIDTH) is 1024. This is presumably meant to avoid excessively large image files, but it's counterproductive in this case.
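The fallback described above can be sketched roughly as follows. This is a Python paraphrase for illustration only: the real code is PHP inside ProofreadPage, and `get_image_width` and its `index_width` parameter are hypothetical names. Only the default value of 1024 comes from the comment above.

```python
from typing import Optional

# Default cited above (self::DEFAULT_IMAGE_WIDTH in ProofreadPage).
DEFAULT_IMAGE_WIDTH = 1024


def get_image_width(index_width: Optional[int]) -> int:
    """Return the width set in the index, or the default when none is set.

    Hypothetical paraphrase; how the index actually stores the width is
    an assumption here.
    """
    if index_width is None:
        return DEFAULT_IMAGE_WIDTH
    return index_width


print(get_image_width(None))  # no width in the index: falls back to the default
print(get_image_width(2000))  # width set in the index: used as-is
```

Under this reading, any index that leaves the width unset gets the default applied, whatever the resolution of the underlying scan.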
Aug 16 2020
Looking into this a bit further, I'm starting to agree with @Xover that this is not a very good way of achieving what I want. Among the issues:
Aug 15 2020
Here is an example of how it looks on Palmleaf.org currently. It will not look exactly like this on Wikisource of course. Sections like Leaf 1, Leaf 2, Leaf 3 will correspond to pages in the Page namespace. The content prior to the "auto-transliteration" heading is what editors will type, and the transliteration will be added below. It should be interleaved like this, page by page, so that readers don't get lost. Given that, it makes sense to me to make it part of the parser output for each page. This makes transclusion work without further effort, and it means that editors can preview the output while editing the page (which, at least in the case of the Balinese work, definitely helps their proofreading efforts).
Aug 13 2020
This is related to something PanLex is currently doing in a gadget I've recently ported from Palmleaf.org. There's a community in Bali that's been doing Balinese palm-leaf manuscript transcription there and it's in the process of being moved to Wikisource. The manuscript scans all come from IA. I've already batch uploaded them to Commons using PDFs from IA.
Aug 12 2020
Aug 10 2020
I've just uploaded a new patchset that splits out ban-latn. There are now three codes:
Aug 8 2020
Thanks! Yes, I can see how this would be a useful feature to have. It's not currently needed for the Balinese work I am doing, but I can imagine it could be needed in the future for the annotation use case. Annotation/correction of manuscripts is certainly a thing people do.
Aug 6 2020
I've now had a chance to discuss this with Joseagush, one of the main coordinators of Balinese Wikipedia. His strong preference is for option 1: ban should be Latin-only and ban-bali should be for Balinese script. His argument is that most Balinese-language content online is in Latin script, and most Balinese people expect Latin and may not be comfortable with Balinese script. Another advantage is that this maintains compatibility with the coding used on ban.wikipedia.org, so that work on Balinese Wikisource (which will contain a good amount of Balinese script) will not unnecessarily interfere with existing Balinese content on Wikimedia. (Incidentally, this also seems to be how Javanese works: jv is Latin script and jv-java is Javanese script.)
Aug 5 2020
New patchset is uploaded now. There is currently no validation of the Index page's language code, but that is arguably the preferable behavior, because Wikisource editors can add sources in whatever language they want for whatever reason they want, and thousands of languages will not be in the list known to LanguageNameUtils::getLanguageNames. To allow this freedom I think the Index field should not be further validated. It isn't currently possible to set the page language to a language not in the Names.php list (I guess for good reason? don't know), but at least it can be recorded accurately in the Index this way.
Based on @Tpt's comments on Gerrit, it looks like it's a lot cleaner to override getPageLanguage in the PageContentHandler to return the language based on the current Index value, and not modify page_lang in the database. Working on an updated patchset now -- there are a couple other things to address beyond that.
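As a rough illustration of that approach, the lookup might behave like the sketch below. It is written in Python rather than PHP, and everything in it is an assumption made for the example: the `get_page_language` function, the `SITE_LANGUAGE` fallback, and the idea that a missing Index value falls back to the site language. It is not the actual PageContentHandler code.

```python
from typing import Optional

# Assumed fallback for pages whose Index has no language set (hypothetical).
SITE_LANGUAGE = "en"


def get_page_language(index_language: Optional[str]) -> str:
    """Compute the page language from the Index value at read time.

    Nothing is written back to the database; the language is derived
    from whatever the Index currently says.
    """
    if index_language:
        return index_language
    return SITE_LANGUAGE


print(get_page_language("ban-bali"))  # language comes from the Index
print(get_page_language(None))        # no Index value: assumed site default
```

The point of computing the value on read is that changing the Index takes effect immediately for its pages, with no stored page_lang rows to keep in sync.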
It's still possible to override the pagelang on individual pages in the Page namespace by using Special:PageLanguage as usual. Also, if you change the Index's langcode and the pagelang of anything in a corresponding Page doesn't match, it's left alone. Example:
Aug 4 2020
Jul 29 2020
I think I see a fix for this, but you'll have to test it on toolforge -- shall I submit another PR on GitHub?
Must be an issue with views/template.twig, where it does this:
Mar 9 2020
Mar 7 2020
I haven't designed a MediaWiki-specific input method for Balinese script. I'm not sure what should be done for T245360: Add input method for ban-Bali (Balinese in Bali script). I did design a Keyman keyboard. The Keyman stuff is highly context-sensitive and the rules are pretty complex. What's the format/language for input methods?
Mar 6 2020
Feb 24 2020
FYI the latest proposal has been submitted as a project grant: https://meta.wikimedia.org/wiki/Grants:Project/PanLex/Balinese_palm-leaf_transcription_platform_on_Wikisource
Feb 18 2020
Based on recent comments in the above-referenced GitHub issue for oauth2-server, it looks like current best practice is not to allow callback prefixes of any kind. That means it's just the OAuth consumer proposal form that needs to be updated.
The easily fixable issue I had was that I had registered the callback URI as http://localhost:3000 but the library I was using was passing it to the server as http://localhost:3000/, which was enough to cause a mismatch. I registered a new consumer with the slash on the callback URI and it works now.
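A minimal sketch of why the trailing slash matters, assuming the server compares the registered and supplied callback URIs by exact string equality (which is what the behavior above suggests, though it is not confirmed here). `callback_matches` is a hypothetical helper, not a function from any OAuth library.

```python
def callback_matches(registered: str, supplied: str) -> bool:
    """Hypothetical strict check: the two URIs must be identical strings."""
    return registered == supplied


registered = "http://localhost:3000"   # what was registered at first
supplied = "http://localhost:3000/"    # what the client library sent

print(callback_matches(registered, supplied))        # False: one character differs
print(callback_matches(registered + "/", supplied))  # True once the slash is registered
```

With a strict comparison like this, the fix has to happen on one side or the other: register the URI exactly as the library sends it, or configure the library to send exactly what was registered.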
I've resolved some of the issues and turned remaining issues into separate tasks.
Feb 17 2020
OK, I took several steps just now to investigate this and found the following:
Feb 14 2020
I just tried making a new OAuth 2.0 consumer called archiveleaf which is not owner-only. It's not approved yet (in fact I requested it not be approved since it's being tested) but my understanding is it should still allow me to authenticate as myself (user Lautgesetz). I get the same error.
Feb 10 2020
By the way, there's the possibility of using the ArchiveLeaf extension to work with palm-leaf manuscript collections in other languages. There are no solid plans for that currently but it's something PanLex is investigating. Just wanted to mention that it may be useful beyond this project.