Lists and sources
We currently provide wordlists in the following format:
# 我
# 你
# 您
# 他
# 我們
# 你們
Sources are richer
Our sources often carry richer data, most frequently English, since most dictionaries and wordlists are paired with English to ease learning:
I also have much richer sources, with meaningful and useful alternative columns:
This data can be quite handy if a user wants to rename the files in order to access or query them via another entry point: English, IPA, or traditional Chinese rather than the more common simplified Chinese.
As of now, I, as a wordlist provider, do extra work to strip out this otherwise valuable data, in order to give Lingua Libre a simpler, valid but poorer wordlist:
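To illustrate the renaming idea, here is a minimal Python sketch. It assumes, hypothetically, that recordings are named after the recorded item plus an ".ogg" extension and that the extra columns are available as dictionaries; the function name rename_plan and the sample entries are illustrative only. It builds a rename plan without touching any files.

```python
def rename_plan(entries, key, ext=".ogg"):
    """Map '<item><ext>' to '<value of key><ext>' for every entry.

    Assumption: each entry is a dict holding at least an 'item' field.
    Entries lacking the requested key are skipped.
    """
    plan = {}
    used = set()
    for entry in entries:
        target = entry.get(key)
        if not target:
            continue  # no value in this column, keep the original name
        new_name = target + ext
        if new_name in used:
            # Two items share the same value (e.g. the same English gloss).
            raise ValueError(f"name collision on {new_name!r}")
        used.add(new_name)
        plan[entry["item"] + ext] = new_name
    return plan


# Hypothetical sample entries.
entries = [
    {"item": "我", "pinyin": "wǒ", "eng": "I"},
    {"item": "你", "pinyin": "nǐ", "eng": "you"},
]
plan = rename_plan(entries, "pinyin")
```

The plan can then be applied with any file-renaming tool. Columns such as English can map several items to the same value, which is why the sketch refuses collisions rather than silently overwriting a file.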
One proposal would be to recognize a pseudo-TSV list such as:
# [item:我] [simplified:我] [pinyin:wǒ] [IPA:uɔ˨˩˦] [eng:I]
# [item:你] [simplified:你] [pinyin:nǐ] [IPA:ni˨˩˦] [eng:you]
# [item:您] [simplified:您] [pinyin:nín] [IPA:nin˧˥] [eng:you (polite)]
# [item:他] [simplified:他] [pinyin:tā] [IPA:tʰa˥˥] [eng:he]
# [item:我們] [simplified:我们] [pinyin:wǒmen] [IPA:uɔ˨˩mən] [eng:we]
# [item:你們] [simplified:你们] [pinyin:nǐmen] [IPA:ni˨˩mən] [eng:you]
The audio would then be recorded for the item column (or the 1st column), and the other data forwarded to the ogg file's metadata under the suggested property names.
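As a rough sketch of how such a line could be read, the Python snippet below parses one pseudo-TSV entry into the item to record and a dictionary of remaining properties. The function name parse_entry and the regular expression are my own assumptions, not existing Lingua Libre code, and writing the properties into the ogg file itself is left out.

```python
import re

# One "[name:value]" field; the name may not contain ':' or brackets.
FIELD = re.compile(r"\[([^:\[\]]+):([^\[\]]*)\]")


def parse_entry(line):
    """Return (item_to_record, metadata) for one wordlist line.

    Lines with no bracketed fields are treated as today's plain format,
    e.g. '# 我', and yield empty metadata.
    """
    fields = dict(FIELD.findall(line))
    if not fields:
        return line.lstrip("# ").strip(), {}
    # Record the 'item' column if present, otherwise the 1st column.
    key = "item" if "item" in fields else next(iter(fields))
    metadata = {k: v for k, v in fields.items() if k != key}
    return fields[key], metadata


item, metadata = parse_entry(
    "# [item:我們] [simplified:我们] [pinyin:wǒmen] [IPA:uɔ˨˩mən] [eng:we]"
)
```

Here item is the string to display and record, and metadata holds the other columns keyed by their suggested property names. Plain lines keep working unchanged, so existing wordlists would remain valid.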