
Lua string.match results differ between the Windows interpreter and the wiki when matching the character set [aáe]
Closed, Invalid (Public)

Description

With the Lua 5.1.4 interpreter from Lua.org:

yearspart="0-ás"
yearnums,bindVowel = yearspart:match("(%d*0)%-([aáe])s")
print(yearnums,bindVowel)

This prints 0 and á, as expected.

The same code in my Lua module https://hu.wikisource.org/wiki/Modul:Is_decade
returned: nil, nil

I worked around the problem by handling 'á' directly, but it seems to me that this might be a bug in the wiki's Lua interpreter.

Event Timeline

Anomie subscribed.

When you run the command line interpreter, it picks up the locale from the environment. In my environment, with LC_CTYPE set to en_US.UTF-8 and the rest set to C, your example returns nil in the command line interpreter. The result is the same if I set them all to en_US.UTF-8 or to C. I don't have Hungarian locales installed to test with.
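One way to see why the example returns nil when the text is UTF-8 is to spell out the bytes. The sketch below is illustrative only: it writes 'á' as explicit byte escapes so the result does not depend on how the file is saved, and the single-byte case assumes (the report does not say) that the Windows script was stored in a code page where 'á' is one byte, here 225 as in Latin-1.

```lua
-- "á" as its two UTF-8 bytes (0xC3 0xA1).
local a_utf8 = "\195\161"
local yearspart = "0-" .. a_utf8 .. "s"

-- The standard string library works on bytes, so this string has 5 bytes.
print(#yearspart)

-- The set [a..e] below contains four separate bytes and matches exactly one
-- of them. It consumes 0xC3, then the literal "s" meets 0xA1 and the match fails.
local yearnums, bindVowel = yearspart:match("(%d*0)%-([a" .. a_utf8 .. "e])s")
print(yearnums, bindVowel)

-- Assumption for illustration: in a single-byte encoding "á" is one byte
-- (225 in Latin-1), so the same set matches the whole character.
local a_single = "\225"
local nums1, vowel1 = ("0-" .. a_single .. "s"):match("(%d*0)%-([a" .. a_single .. "e])s")
print(nums1, vowel1 == a_single)
```

Under these assumptions the first match yields nil and the second succeeds, which is consistent with the two results described in the report.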

In MediaWiki we set the locale to a known default from $wgShellLocale (by default C.UTF-8). See also T107128: Scribunto string comparison works case insensitive while the standard Lua case sensitive.

For dealing with non-ASCII characters, you should use Scribunto's ustring library rather than the standard string library.
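A minimal sketch of that suggestion follows. The function name splitDecade is hypothetical and is not taken from the module. Inside Scribunto it calls mw.ustring.match, which treats 'á' as a single character. The fallback branch is an addition for illustration only, so that the snippet also runs in a plain Lua interpreter where the mw table does not exist: it captures one UTF-8 encoded character and checks it against a lookup table.

```lua
-- Accepted binding vowels; "á" is given as its UTF-8 bytes for the fallback path.
local VOWELS = { a = true, e = true, ["\195\161"] = true }

-- Hypothetical helper: splits strings such as "1990-es" or "0-ás".
local function splitDecade(yearspart)
    -- mw.ustring is only available inside Scribunto.
    if mw and mw.ustring then
        return mw.ustring.match(yearspart, "(%d*0)%-([aáe])s")
    end
    -- Fallback: one lead byte followed by any UTF-8 continuation bytes.
    local nums, vowel = yearspart:match("(%d*0)%-([\1-\127\194-\244][\128-\191]*)s")
    if nums and VOWELS[vowel] then
        return nums, vowel
    end
    return nil, nil
end

print(splitDecade("0-\195\161s"))
print(splitDecade("1990-es"))
print(splitDecade("0-is"))
```

In a real module only the mw.ustring branch would normally be needed, since Scribunto always provides it.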

Thanks, I changed my code to use ustring and it works as expected. Thank you for the hint.
I assumed that a set written with [] accepts non-ASCII letters such as 'á' (character code 160). That was obviously not the case.