Lists loader: integrate UNILEX data and license.md in code ?
Open, Low, Public

Assigned To

None

Authored By

	Yug
	Feb 23 2021, 5:19 PM

Description

I found out today that the UNILEX data, although copyrighted, is distributed under a license.md which is a variation of the GNU license.

They maintain 999 frequency lists.

curl 'https://raw.githubusercontent.com/lingua-libre/unilex/master/data/frequency/ig.txt' | tail -n +5 | sort -k 2,2 -n -r | cut -d$'\t' -f1 | sed -E 's/^/# /g' | head

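Comment: start output at line 5 (tail -n +5 drops the first 4 lines), sort by the numerical value of the 2nd column in descending order, cut to keep the first field, prefix each line with "# " to make a list, and print the first lines. This creates a Lingualibre-compatible wordlist. Note that head with no argument prints only the top 10 items; use head -n 20 to show the top 20.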
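If the Lists loader were to do this in code rather than in a shell, the same steps could look roughly like the Python sketch below. It is only an illustration: the function name unilex_to_wordlist, the header_lines and top parameters, and the sample data are hypothetical, and it assumes each data line is "word<TAB>integer count" after a 4-line header, as the pipeline above implies. Ties are kept in file order, so the output may differ slightly from sort -n -r on equal counts.

```python
def unilex_to_wordlist(text, header_lines=4, top=20):
    """Turn UNILEX-style frequency text into '# word' lines, most frequent first.

    Assumption: lines are 'word<TAB>count' after `header_lines` header lines.
    """
    rows = []
    for line in text.splitlines()[header_lines:]:
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        try:
            count = int(parts[1])
        except ValueError:
            continue  # skip lines whose 2nd column is not an integer
        rows.append((parts[0], count))
    # Sort by count, descending (equivalent of: sort -k 2,2 -n -r).
    rows.sort(key=lambda row: row[1], reverse=True)
    # Keep the first field, prefix with '# ', keep only the top items.
    return ["# " + word for word, _ in rows[:top]]


if __name__ == "__main__":
    # Made-up sample standing in for a downloaded frequency file.
    sample = "# h1\n# h2\n# h3\nForm\tFrequency\nna\t5\nm\t12\nndi\t7\n"
    print("\n".join(unilex_to_wordlist(sample, top=2)))
```

The text passed in would be the body of a file such as the ig.txt URL above; fetching it is left out so the sketch stays independent of the network.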

Links:

Event Timeline

Yug created this task. Feb 23 2021, 5:19 PM

Yug updated the task description.

Yug updated the task description. Feb 24 2021, 8:23 AM

Yug updated the task description. Feb 24 2021, 10:28 AM

Yug updated the task description.

Yug triaged this task as Low priority. Jul 6 2022, 10:40 AM

Yug renamed this task from "RecordWizard: Integrate UNILEX data and license.md in code ?" to "Lists loader: integrate UNILEX data and license.md in code ?". Jul 7 2022, 11:03 AM

Yug moved this task from "RecordWizard (MediaWiki Extension)" to "RecordWizard and items to record" on the Lingua-Libre-Legacy board. Jul 20 2022, 10:23 AM
