Investigation card for wish #20: Create new Han characters with IDS extension for Wikisource
The main ticket is {T137786}.
The investigation: what still needs to be done? Several people have been working on it, so how can we help to get it finished?
See also https://meta.wikimedia.org/wiki/2016_Community_Wishlist_Survey/Categories/Wikisource#Create_new_Han_Characters_with_IDS_extension_for_WikiSource
----
== Investigation
**Background:**
- Written Chinese involves "inventing" new characters all the time by combining standard characters. For example, [[ https://tools.wmflabs.org/idsgen/%E2%BF%BA%E8%BE%B6%E2%BF%B1%E2%BF%B1%E7%A9%B4%E2%BF%B0%E6%9C%88%E2%BF%B0%E2%BF%B1%EF%95%9F%E2%BF%B2%E9%95%B7%E9%A6%AC%E9%95%B7%E5%88%82%E5%BF%83.png?%E5%AD%97%E9%AB%94=%E6%A5%B7%E9%AB%94 | this ]] ideograph is a combination of 10 individual Chinese characters. Together they make sense as a new phrase or sentence. Since such new ideographs are created all the time (like how new phrases are created using standard words), it's not feasible to add them all to Unicode.
- Adding [[https://en.wikipedia.org/wiki/Chinese_character_description_languages#Ideographic_Description_Sequences|IDS]] support to wikis means that rather than using manually-created images to represent these ideographs, contributors will be able to add them directly from the edit interface. IDS uses 12 special 'operator' Unicode characters (U+2FF0 to U+2FFB); each one prefixes the characters it combines and specifies how they are arranged.
- The [[https://www.mediawiki.org/wiki/Extension:Ids|IDS extension]] works by sending these character combinations (defined within an `<ids>…</ids>` element) to a web service that returns a PNG of the resulting ideograph.
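
As a rough illustration of the operator-prefix structure described above, here is a minimal Python sketch (not code from the extension or the web service). It checks that a sequence is well formed and builds an image URL. The function names `count_components` and `image_url` are hypothetical, and the URL pattern (percent-encoded sequence plus `.png`) is only inferred from the example link above, so treat it as an assumption about the test service.

```python
from urllib.parse import quote

# The 12 Ideographic Description Characters, U+2FF0..U+2FFB.
# U+2FF2 and U+2FF3 take three operands; the other ten take two.
IDS_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDS_ARITY["\u2ff2"] = 3
IDS_ARITY["\u2ff3"] = 3


def count_components(ids: str) -> int:
    """Return the number of non-operator characters in a well-formed IDS.

    Raises ValueError if operands are missing or extra characters trail
    after a complete sequence.
    """

    def parse(pos):
        # Parse one operand starting at pos; return (next position, leaf count).
        if pos >= len(ids):
            raise ValueError("sequence ends before all operands are supplied")
        ch = ids[pos]
        pos += 1
        if ch not in IDS_ARITY:
            return pos, 1  # an ordinary character is a single leaf
        leaves = 0
        for _ in range(IDS_ARITY[ch]):
            pos, found = parse(pos)
            leaves += found
        return pos, leaves

    end, leaves = parse(0)
    if end != len(ids):
        raise ValueError("trailing characters after a complete sequence")
    return leaves


def image_url(ids: str, base: str = "https://tools.wmflabs.org/idsgen/") -> str:
    """Build the PNG URL for a sequence (URL pattern is an assumption)."""
    return base + quote(ids, safe="") + ".png"
```

For instance, `count_components("⿰木木")` accepts the sequence because the left-right operator `⿰` is followed by exactly two operands, whereas `count_components("⿰木")` raises `ValueError`.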
**Current situation:**
- Active development is underway (most recent activity a couple of weeks ago) by users including @Shoichi and @awight
- The (GPL-2.0) extension code is at https://github.com/MGdesigner/Mediawiki-IDSextension — it's only about 25 lines of code
- The (Java, AGPL-3.0) web service tool is at https://github.com/sih4sing5hong5/han3_ji7_tsoo1_kian3 (27 open issues, but most are from within about the last 18 months)
- A [[https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM|fork]] of this is maintained by Wikimedia Taiwan, but it's currently [[https://github.com/sih4sing5hong5/han3_ji7_tsoo1_kian3/compare/master...Wikimedia-TW:master|the same as]] upstream
- Much of han3_ji7_tsoo1_kian3 is in Chinese, which is a barrier to non-Sinophone developers — however, it [[https://phabricator.wikimedia.org/T148693#2837458|sounds like]] the upstream author has agreed that things should be translated to English instead
- A test installation of the web service is running at https://tools.wmflabs.org/idsgen/
- The `mediawiki/extensions/Ids` repository has been [[https://www.mediawiki.org/wiki/Git/New_repositories/Requests|requested]] by @awight
- A test wiki has been set up at http://ids-testing.wmflabs.org/wiki/
**Still required:**
- [ ] Move the extension repository into Diffusion
- [x] Set up an IDS Phabricator project — #IDS-extension
- [ ] Security (and code style etc.) review of the extension
- [ ] Make the web service endpoint configurable in the extension (it is hardcoded at the moment)
- [ ] Translate han3_ji7_tsoo1_kian3 into English
- [ ] Security (etc.) review of han3_ji7_tsoo1_kian3
- [ ] {T137786}
- [ ] {T148693}
- [ ] Add a caching layer as part of the extension (in case the Tool Labs service goes down)
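
To make the configurable-endpoint and caching items above more concrete, here is a minimal Python sketch of the idea. It is not the extension's code (the extension itself is a MediaWiki extension), and every name in it (`CachedIdsRenderer`, `http_fetch`, the cache directory layout) is hypothetical. It assumes the service returns a PNG at the endpoint URL followed by the percent-encoded sequence and `.png`, as the example link in the background section suggests.

```python
import hashlib
from pathlib import Path
from typing import Callable
from urllib.parse import quote
from urllib.request import urlopen


def http_fetch(url: str) -> bytes:
    """Default fetcher: download the bytes at url."""
    with urlopen(url, timeout=10) as response:
        return response.read()


class CachedIdsRenderer:
    """Fetch IDS images from a configurable endpoint, caching them on disk."""

    def __init__(self, endpoint: str, cache_dir: str,
                 fetch: Callable[[str], bytes] = http_fetch):
        # The endpoint is a constructor argument rather than a hardcoded URL.
        self.endpoint = endpoint.rstrip("/") + "/"
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def _cache_path(self, ids: str) -> Path:
        # Hash the sequence so the file name is short and filesystem-safe.
        digest = hashlib.sha256(ids.encode("utf-8")).hexdigest()
        return self.cache_dir / (digest + ".png")

    def render(self, ids: str) -> bytes:
        """Return PNG bytes for ids, contacting the service only on a cache miss."""
        path = self._cache_path(ids)
        if path.exists():
            return path.read_bytes()
        png = self.fetch(self.endpoint + quote(ids, safe="") + ".png")
        path.write_bytes(png)
        return png
```

With a cache-first lookup like this, ideographs that have already been rendered keep working if the service is unreachable. Sequences that have never been requested would still fail, so the extension would need a fallback for those.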