Add language support for Korean
Closed, ResolvedPublic

Description

Event Timeline

revi created this task. Mar 17 2017, 3:06 PM
Restricted Application added a subscriber: Aklapper. Mar 17 2017, 3:06 PM
revi renamed this task from Add language support for ... to Add language support for Korean. Mar 17 2017, 3:06 PM
Halfak added a subscriber: Halfak. Mar 19 2017, 4:01 PM

@revi put together this list based on some other sources: P5073

revi added a comment. Mar 20 2017, 3:19 PM

(Just FYI:) P5072 was added a few minutes before Halfak made P5073, and P5072 is the authoritative list.

Halfak updated the task description. Mar 23 2017, 2:22 PM

@revi, I almost pulled this to our main workboard, but I realized that we still need a list of "informals". @Ladsgroup said that he's updated https://meta.wikimedia.org/wiki/Research:Revision_scoring_as_a_service/Word_lists/ko with a new run of BWDS. Could you have a look at it to see if it is any more useful?

Alternatively, you could help us build a list of informals from your own knowledge. See the English informals for a large set of examples of the kind of thing we're looking for. https://github.com/wiki-ai/revscoring/blob/master/revscoring/languages/tests/test_english.py#L87

revi added a comment. Mar 23 2017, 2:48 PM

Unfortunately, I have to say that the updated BWDS run is still meaningless except for one entry.

Also, the informals list is what I was going to work on tomorrow.

Gotcha. Sounds good. Sorry for the BWDS issues for Korean. I've been working on that a lot in the last week.

Halfak triaged this task as High priority. Mar 23 2017, 2:53 PM
Halfak moved this task from Untriaged to New development on the Scoring-platform-team board.
revi added a comment. Mar 24 2017, 8:24 AM

I know the list is broad, but paragraphs ending with the following words are very likely to be informal and not encyclopedic, so P5122 is the list. (The list is quite small, so I'll need to adjust it quite often.)
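
For anyone following along, here is a rough sketch of how a "paragraph ends with one of these words" check could be written. It is not the revscoring implementation; `build_ending_matcher`, `informal_paragraphs`, and `EXAMPLE_ENDINGS` are hypothetical names, and the two example endings are placeholders chosen for illustration, not the contents of P5122.

```python
import re

# Placeholder endings for illustration only; NOT the contents of P5122.
EXAMPLE_ENDINGS = ["ㅋㅋ", "ㅎㅎ"]


def build_ending_matcher(endings):
    """Compile a regex that matches any of `endings` at the end of a paragraph."""
    if not endings:
        raise ValueError("at least one ending is required")
    # Try longer endings first so they win over their own suffixes.
    ordered = sorted(endings, key=len, reverse=True)
    alternatives = "|".join(re.escape(e) for e in ordered)
    # Allow trailing punctuation or whitespace after the ending itself.
    return re.compile(r"(?:" + alternatives + r")[\s.!?~]*$")


def informal_paragraphs(text, matcher):
    """Return the paragraphs of `text` whose tail matches `matcher`."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [p for p in paragraphs if matcher.search(p)]
```

Usage would look like `informal_paragraphs(wikitext, build_ending_matcher(EXAMPLE_ENDINGS))`. Keeping the endings as plain data should make it cheap to adjust the list as often as needed.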

https://gerrit.wikimedia.org/r/345016 -- "Adds hunspell-ko to ores:base [puppet]"

```
>>> import enchant
>>> ko = enchant.Dict('ko')
>>> ko.check("foo")
True
>>> ko.check("foo asndals")
False
>>> ko.check("fooasndals")
True
>>> ko.check("fooasndals;sfdfaslnflasndlas")
False
>>> ko.check("fooasndalssfdfaslnflasndlas")
True
```
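
The session above appears to show the `ko` dictionary accepting Latin-only tokens that are clearly not Korean words. If that is the behavior to guard against (an assumption, since the thread does not say so), one possible workaround is to consult the dictionary only for tokens that contain Hangul. The helpers `contains_hangul` and `is_korean_dict_word` below are hypothetical, and the dictionary lookup is passed in as a callable so the sketch does not depend on enchant being installed.

```python
def contains_hangul(token):
    """Return True if any character of `token` is a Hangul syllable or jamo."""
    return any(
        "\uac00" <= ch <= "\ud7a3"      # Hangul Syllables block
        or "\u1100" <= ch <= "\u11ff"   # Hangul Jamo block
        or "\u3130" <= ch <= "\u318f"   # Hangul Compatibility Jamo block
        for ch in token
    )


def is_korean_dict_word(token, check):
    """Ask `check` about `token` only when it contains Hangul.

    `check` is any callable taking a string and returning a bool,
    for example the `check` method of an enchant dictionary.
    """
    return contains_hangul(token) and bool(check(token))
```

With the dictionary from the session, a call would look like `is_korean_dict_word("foo", ko.check)`, which returns `False` without ever querying the dictionary.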

Halfak closed this task as Resolved. Apr 14 2017, 5:41 PM