Scribunto does not understand Thai number as input in some aspects
Closed, Resolved · Public · BUG REPORT

Assigned To

Authored By

	Bebiezaza
	Apr 25 2022, 3:48 PM

Description

I am trying to write a Scribunto module that converts Thai numerals to Arabic numerals, but the conversion always fails, and the debug console does not recognize the Thai numerals at all.

Scribunto still passes Thai numerals from input straight through to output unchanged, so the problem may lie in matching each input character against the corresponding character in the lookup table.
Further edit: string.gsub() converts the numerals successfully, but I am keeping this task open because the original method should have worked too.
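
The string.gsub() approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from the module: the table name thai_to_arabic, the constant THAI_DIGIT, and the helper to_arabic are all hypothetical. It assumes the source is saved as UTF-8, where each Thai digit (U+0E50 to U+0E59) is the three bytes 224, 185, 144..153.

```lua
-- Hypothetical lookup table: Thai digit -> Arabic digit (not taken from the module).
local thai_to_arabic = {
  ["๐"] = "0", ["๑"] = "1", ["๒"] = "2", ["๓"] = "3", ["๔"] = "4",
  ["๕"] = "5", ["๖"] = "6", ["๗"] = "7", ["๘"] = "8", ["๙"] = "9",
}

-- Byte-level pattern for one whole Thai digit in UTF-8:
-- bytes 224, 185, then one byte in the range 144..153.
local THAI_DIGIT = "\224\185[\144-\153]"

-- Hypothetical helper name.
local function to_arabic(text)
  -- gsub looks each match up in the table; text that does not match is kept.
  -- The outer parentheses drop gsub's second return value (the match count).
  return (string.gsub(text, THAI_DIGIT, thai_to_arabic))
end

print(to_arabic("พ.ศ. ๒๕๖๕"))  -- prints: พ.ศ. 2565
```

Spelling the pattern out in bytes sidesteps the question of how Lua reads a multi-byte character inside a character class.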

See https://th.wikisource.org/wiki/Module_talk:ThaiToArabicNum/testcases test_2 and test_3 to see how it does not work, and https://th.wikisource.org/wiki/Module:ThaiToArabicNum/sandbox is the code.

List of steps to reproduce (step by step, including full links if applicable): go to https://th.wikisource.org/wiki/Module_talk:ThaiToArabicNum/testcases test_2 and test_3
What happens?: Thai-to-Arabic numeral conversion fails completely because Scribunto does not understand Thai numerals as input
What should have happened instead?: Thai-to-Arabic numeral conversion passes (as it does with the non-sandbox version of the script)
Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc.: Thai Wikisource

Related Objects

Mentioned In: T308403: Create new project tag "Thai-Sites"

Event Timeline

Bebiezaza created this task. · Apr 25 2022, 3:48 PM

Restricted Application added a subscriber: Aklapper. · Apr 25 2022, 3:48 PM

Reedy added a project: I18n. · Apr 25 2022, 4:20 PM

Bebiezaza updated the task description. · Apr 25 2022, 5:24 PM

Bebiezaza renamed this task from "Scribunto does not understand Thai number as input" to "Scribunto does not understand Thai number as input in some aspects". · Apr 25 2022, 5:39 PM

Bebiezaza updated the task description. · Apr 27 2022, 6:46 AM

PatsagornY mentioned this in T308403: Create new project tag "Thai-Sites". · May 16 2022, 5:30 AM

Bebiezaza added a project: Thai-Sites. · May 24 2022, 10:58 AM

This works as intended. Lua's Unicode support is limited compared with other programming languages. As noted at http://lua-users.org/wiki/LuaUnicode:

Lua's pattern matching facilities work byte by byte. In general, this will not work for Unicode pattern matching
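
A small sketch of what "byte by byte" means in practice, assuming the source is saved as UTF-8. The variable names are illustrative only. A character class written with Thai digits is read as a set of individual bytes, so it can match only one byte of a three-byte character:

```lua
local one = "๑"  -- THAI DIGIT ONE, U+0E51

-- The length operator counts bytes, not characters.
print(#one)  -- 3

-- The three UTF-8 bytes of the character: 224, 185, 145.
print(string.byte(one, 1, -1))

-- The class below is parsed byte by byte, so it is not "the range ๐ to ๙".
-- It matches a single byte and never the whole three-byte character.
local m = string.match(one, "[๐-๙]")
print(#m)  -- 1
```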

The page suggests a workaround: iterating over the UTF-8 characters of a string can be done with the following:

for uchar in string.gmatch("๑๒๓", "([%z\1-\127\194-\244][\128-\191]*)") do
  print(uchar)
end

which produces:

๑
๒
๓

Anyway, closing this as resolved, since working code already exists.
