
Deploy IDS rendering engine to production
Open, Normal, Public

Description

We're hoping to put an ideographic description sequence (IDS) rendering engine somewhere in production (Ganeti?) to serve requests from Extension:Ids. Currently, transcluding non-Unicode Kanji and other unsupported glyphs requires the author to generate an image of the character using a third-party system and upload the image to Wikimedia Commons. Once Extension:Ids and this backend are integrated, editors will be able to embed the IDS source directly in articles.

Wikimedia Taiwan maintains a fork of the rendering engine here:
https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM

This is a Java application which can be run inside of a Jetty container. We'll need to do code and security review of this repo. It's currently running on wmflabs as a proof of concept.

Some obstacles so far:

  • Discuss how to support a new Java application.
  • Discuss who will support it. @awight is happy to provide volunteer developer time.
  • Probably ask the upstream to translate variable names so that code review is easier.

I'm imagining that the architecture should be similar to Extension:Math, or are there other recommendations?

Event Timeline

awight created this task. Oct 19 2016, 10:33 PM

I am also pasting the testing wiki as a reference. Most amusingly, there are also some special emojis made with IDS there. :)

@Aklapper: Thanks for the pointer! I've linked this with the tracking task, but balked at removing the Extension:Ids parent because this task is a hard dependency before we can deploy that extension, i.e. the dependency seems to make a lot of sense from the other direction.

MaxSem added a subscriber: MaxSem. Nov 30 2016, 2:29 AM

Uh, this renderer has not only Chinese documentation and comments, but even identifiers are in Chinese in some places. To me, this means that (almost?) nobody at the Foundation will be able to debug and fix it if something goes wrong. At least, we have zero Chinese-speaking ops, afaik.

> but even identifiers are in Chinese in some places

There were actually plans to use Chinese class & filenames in the java servlet (halted due to concerns over filesystem compatibility)... This system was and is mainly intended to solve pan-Chinese (zho) problems.

Perhaps @Shoichi can help negotiate on the source language issue. Unnecessary forks don't sound good given that there are some upstream plans on refactoring, which can be a good opportunity for WMF ops to fix many issues.

My plan is to add English code comments, and also to include comments translating the function and variable names that use Han characters. I sent the proposal to the author in December, and he agreed. I will set up a small translation team to carry this out, so that people at the Foundation can also understand the code.

Shoichi claimed this task. Edited Dec 25 2016, 4:02 AM

han3_ji7_tsoo1_kian3_WM: network code translation for safety code review.

I organized a five-person translation team on 12/22. My plan is to translate the network-processing code first. (Translation of the graphics layout and compositing code has lower priority.)

Shoichi added a comment. Edited Dec 27 2016, 5:52 PM

@awight @Niharika @MaxSem: Hi, my translation team has started work. Our work is in
https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM/tree/code_vf_H2E
We forked because some members are not used to working with git (they are pure Wikipedians, not tech people).

This branch is synced to the latest upstream master. After the job is done, we'll push back to upstream (the author agreed).
We will translate the network service with high priority, so we are starting from the folder .

Who will review the code? Does anything else need to be translated into English? Please let me know.

Hi @Shoichi is the translation work currently in progress?

> Hi @Shoichi is the translation work currently in progress?

Yes, it is in progress. Sorry, there was a vacation, and it is a busy time in Taiwan right now, because we still celebrate the traditional Spring Festival, which falls on 1/27~2/1 this year. Many companies work hard during January (not December). Some members of my team are affected, and I also need to teach two members how to use git. >_<

Another thing:
I have finished adding translation comments in src/main/java/idsrend/services/IDSrendServlet.java, the starting point of han3_ji7_tsoo1_kian3.
For comparison, here is the original IDSrendServlet.java

I think a reviewer can join now.

Please help me check whether the code is now easy to understand. If it is not clear enough, I will improve it. This file will be the reference example for our other team members.

For better readability, is it possible for you to use a consistent syntax regarding translations? I see a few are used:
/** 組宋體用的工具 / The Tool for compositing Song font . */ (X / x)
//連線 means connetcion (X means x)
//活字加粗 = Make a sparate movable type bolder. (X = x)


Regarding the first type (translating comments itself), I'm not sure about others, but I would personally prefer it's split into two lines:

/** 組宋體用的工具 */
/** The Tool for compositing Song font. */

This ensures the comments have a consistent alignment


Do you have a coding convention? We usually put a space before and after // (CC/PHP#Spaces).

There are also quite a lot of punctuation errors (should be space after comma and period, none before) and typos that make reading harder.

Thanks, I pushed my new version.

> For better readability, is it possible for you to use a consistent syntax regarding translations? I see a few are used:
> /** 組宋體用的工具 / The Tool for compositing Song font . */ (X / x)
> //連線 means connetcion (X means x)
> //活字加粗 = Make a sparate movable type bolder. (X = x)
>
> Regarding the first type (translating comments itself), I'm not sure about others, but I would personally prefer it's split into two lines:
>
> /** 組宋體用的工具 */
> /** The Tool for compositing Song font. */

For the first type, I changed to your suggestion, but I use the C style

/**/

so that translation comments can be identified quickly, while

/**..... */

is the original.

For the second and third types, I unified them into the second type only.

As for the coding convention of translation comments, I have now also changed to putting a space before and after // .

Arthur2e5 added a comment. Edited Jan 7 2017, 5:11 PM

Regarding marking translated comments, consider using something like /*e to
replace /**, and //e to replace ///. This would allow easy English javadoc
generation with some quick sed-work.
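
To illustrate the marker idea, here is a minimal Java sketch of the conversion step that the "quick sed-work" would perform: it rewrites the `/*e` and `//e` markers back into `/**` and `///` so that the English comments are the ones a documentation tool picks up. The class name `CommentMarkerSketch` and the method `toJavadoc` are hypothetical and not part of the repository. The sketch is deliberately naive: it works on raw text, so it would also rewrite markers that happen to appear inside string literals.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of han3_ji7_tsoo1_kian3. */
final class CommentMarkerSketch {
    // "/*e" followed by whitespace or end of input marks an English block comment.
    private static final Pattern BLOCK_MARKER = Pattern.compile("/\\*e(?=\\s|$)");
    // "//e" followed by whitespace or end of input marks an English line comment.
    private static final Pattern LINE_MARKER = Pattern.compile("//e(?=\\s|$)");

    /** Rewrites the English markers into documentation-comment openers. */
    static String toJavadoc(String source) {
        String withBlocks = BLOCK_MARKER.matcher(source).replaceAll("/**");
        return LINE_MARKER.matcher(withBlocks).replaceAll("///");
    }

    public static void main(String[] args) {
        System.out.println(toJavadoc("/*e\n this is a function\n*/"));
        System.out.println(toJavadoc("//e this is a function"));
    }
}
```

The lookahead keeps ordinary comments that merely start with the letter "e" (for example `//example`) untouched, because the marker must be followed by whitespace.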

> Regarding marking translated comments, consider using something like /*e to
> replace /**, and //e to replace ///. This would allow easy English javadoc
> generation with some quick sed-work.

About replacing /**, do you mean like this:

/*e    
 this is a function
*/

And about //e, like this?

//e this is a function,
Arthur2e5 added a comment. Edited Jan 12 2017, 5:43 PM

Yes.


2017-01-14, too lazy to add a comment:

Well, that's just an example. Any character works.

This comment was removed by Shoichi.

> Yes.
>
> 2017-01-14, too lazy to add a comment:
> Well, that's just an example. Any character works.

I got you. 'e' means English. We will change to this style.

Sorry for the delay: because our New Year falls on 1/28, everyone in Taiwan is very busy with their own work, so this took many days. >_<

We have added English translation comments to the service core:

https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM/blob/code_vf_H2E/src/main/java/idsrend/services/HttpserverJetty.java
HttpserverJetty.java is the starting point for Jetty "application style" launching.

https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM/blob/code_vf_H2E/src/main/java/idsrend/services/IDSrendServlet.java
IDSrendServlet.java is the starting point for any standard servlet container (for instance, Tomcat).

https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM/blob/code_vf_H2E/src/main/java/idsrend/services/IDSrendService.java
IDSrendService.java: service core

https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM/blob/code_vf_H2E/src/main/java/idsrend/parser/IDSParser.java
IDSParser.java: IDS string parser

These files make up the web service core. The others cover graphics rendering and database query tools (which have their own Java application entry points, need to be launched as normal Java applications, and currently cannot be used as a web service).
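
For reviewers unfamiliar with IDS strings, the following is a minimal Java sketch of what an IDS parser has to check first: an IDS is written in prefix notation, where an ideographic description character (for example U+2FF0, "left to right") is followed by its components, so U+2FF0 U+6C35 U+6BCF (⿰氵每) describes 海. This is not the logic of IDSParser.java; the class `IdsSketch` and its methods are hypothetical, and the sketch only knows the operators in U+2FF0..U+2FFB (later Unicode versions define additional ones, which it ignores).

```java
/** Hypothetical helper for illustration only; not taken from IDSParser.java. */
final class IdsSketch {
    /** Returns how many components an operator takes, or 0 if cp is not an operator. */
    static int arity(int cp) {
        if (cp == 0x2FF2 || cp == 0x2FF3) {
            return 3; // the two three-part operators
        }
        if (cp >= 0x2FF0 && cp <= 0x2FFB) {
            return 2; // the remaining operators in U+2FF0..U+2FFB are binary
        }
        return 0; // anything else is treated as a plain component
    }

    /** Returns true if ids is exactly one complete prefix-notation sequence. */
    static boolean isWellFormed(String ids) {
        int needed = 1; // number of components still expected
        int seen = 0;
        for (int cp : ids.codePoints().toArray()) {
            if (needed == 0) {
                return false; // extra characters after a complete sequence
            }
            // An operator fills one slot and opens arity(cp) new ones;
            // a plain component just fills one slot.
            needed += arity(cp) - 1;
            seen++;
        }
        return seen > 0 && needed == 0;
    }

    public static void main(String[] args) {
        // U+2FF0 U+6C35 U+6BCF: a left-to-right composition of two components.
        System.out.println(isWellFormed("\u2FF0\u6C35\u6BCF"));
    }
}
```

A check like this is also relevant to the security review, since the service receives the IDS string from untrusted requests; a real parser would additionally need to bound input length and nesting depth.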

For the security review, I guess this may be enough. If more source code needs translation comments (or any information is not clear enough), we will continue after our New Year (after 2/3).

Shoichi reassigned this task from Shoichi to awight. Feb 5 2017, 3:50 PM

Hello awight, regarding the code review (IDS render server), do you know who can do it? My translation team has finished some of the work. I want to know whether it is enough.

About the cache: after discussion with the upstream author, we think putting the cache on the production server side is better than putting it on the wiki sites' side. No matter how many sites connect to the server, they share the same cache.

> No matter how many sites connect to the server, they share the same cache.

Makes sense, as the content of these files is supposed to depend only on the URL anyway. For the same reason I did an HTTP header PR to tell the browsers to keep glyphs for as long as they want...
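
The two caching ideas discussed here can be sketched in a few lines of Java, under stated assumptions: a single server-side cache keyed by the IDS string (so every wiki that talks to the server shares it), and a `Cache-Control` header value that lets browsers keep a glyph for a long time. The class `GlyphCacheSketch`, its methods, and the one-year lifetime are hypothetical; the header values actually set by the PR mentioned above are not stated in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical server-side glyph cache; not taken from the rendering engine. */
final class GlyphCacheSketch {
    private final ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Function<String, byte[]> renderer;

    /** The renderer stands in for the real IDS-to-image step. */
    GlyphCacheSketch(Function<String, byte[]> renderer) {
        this.renderer = renderer;
    }

    /** Renders each distinct IDS string at most once and reuses the result afterwards. */
    byte[] glyphFor(String ids) {
        return cache.computeIfAbsent(ids, renderer);
    }

    /** Builds a Cache-Control value for content that never changes for a given URL. */
    static String cacheControlHeader(long maxAgeSeconds) {
        return "public, max-age=" + maxAgeSeconds + ", immutable";
    }

    public static void main(String[] args) {
        // A fake renderer that just returns the UTF-8 bytes of the request.
        GlyphCacheSketch sketch =
                new GlyphCacheSketch(ids -> ids.getBytes(StandardCharsets.UTF_8));
        byte[] first = sketch.glyphFor("\u2FF0\u6C35\u6BCF");
        byte[] second = sketch.glyphFor("\u2FF0\u6C35\u6BCF");
        System.out.println(first == second); // the second call is served from the cache
        System.out.println(cacheControlHeader(31_536_000L));
    }
}
```

Because the rendered image depends only on the request, neither cache needs an invalidation path in this sketch; a production setup would still want a size bound on the server-side map.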

P.S.: Shoichi, I am seeing some heavy spamming on http://ids-testing.wmflabs.org/wiki/Main_Page. Are there any abusefilter sort of thing available for stopping these spambots that you can use? (Or just protect the main page for now.)

> P.S.: Shoichi, I am seeing some heavy spamming on http://ids-testing.wmflabs.org/wiki/Main_Page. Are there any abusefilter sort of thing available for stopping these spambots that you can use? (Or just protect the main page for now.)

Yes, I noticed it too. I have fixed the main page. Damn it. I will look for an extension to handle it.

awight removed a subscriber: awight. Mar 21 2019, 4:00 PM