Unlike string.gsub, mw.ustring.gsub doesn't add whole match as first capture when pattern has no captures
Closed, Resolved, Public

Description

A while ago I noticed a small difference in behavior between mw.ustring.gsub and string.gsub. When the pattern (argument 2) has no captures, string.gsub treats the whole match as capture 1, but mw.ustring.gsub does not. For instance, string.gsub('abc', '%a', '%1!') returns "a!b!c!", 3, but mw.ustring.gsub('abc', '%a', '%1!') throws an error: "invalid capture index %1 in replacement string".
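A minimal sketch of the difference, for anyone who wants to reproduce it. The string.gsub lines run in plain Lua; the mw.ustring call is guarded because that library only exists inside Scribunto, and the variable names are just for illustration.

```lua
-- Plain Lua: the pattern has no captures, so %1 falls back to the whole match.
local result, count = string.gsub('abc', '%a', '%1!')
print(result, count)  --> a!b!c!  3

-- With an explicit capture, %1 refers to that capture, which gives the same
-- output here and does not rely on the undocumented fallback.
local explicit, explicitCount = string.gsub('abc', '(%a)', '%1!')
print(explicit, explicitCount)  --> a!b!c!  3

-- mw.ustring is only available inside Scribunto, so guard the call.
-- pcall catches the error reported in this task instead of aborting.
if mw and mw.ustring then
  local ok, err = pcall(mw.ustring.gsub, 'abc', '%a', '%1!')
  print(ok, err)
end
```

Writing the capture explicitly, as in the second call, should behave the same in both libraries.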

This behavior of string.gsub isn't officially documented in the Lua 5.1 manual (it can only be seen in the implementation in the source code), but maybe it is best for the two functions to be compatible in this way. The officially documented way to get the whole match is by capture 0 (which also works when there are captures): string.gsub('abc', '%a', '%0!') and mw.ustring.gsub('abc', '%a', '%0!') both return "a!b!c!", 3.
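A short sketch of the documented %0 form, shown with plain string.gsub so it runs outside Scribunto. The second input string and the variable names are made up for illustration.

```lua
-- %0 always stands for the whole match, with or without captures.
local noCaptures, n1 = string.gsub('abc', '%a', '%0!')
print(noCaptures, n1)  --> a!b!c!  3

-- With captures present, %0 is still the whole match,
-- while %1 and %2 are the individual captures.
local withCaptures, n2 = string.gsub('hello world', '(%w+) (%w+)', '[%0] %2 %1')
print(withCaptures, n2)  --> [hello world] world hello  1
```

Since %0 never depends on how many captures the pattern has, it is the safer choice for code that has to work with both libraries.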

Event Timeline

Anomie subscribed.

"but maybe it is best for the two functions to be compatible in this way"

I agree. We've tried to do that generally.

Change 468996 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/extensions/Scribunto@master] ustring: Match undocumented string.gsub behavior

https://gerrit.wikimedia.org/r/468996

Change 468996 merged by jenkins-bot:
[mediawiki/extensions/Scribunto@master] ustring: Match undocumented string.gsub behavior

https://gerrit.wikimedia.org/r/468996

This should be deployed to Wikimedia wikis with 1.33.0-wmf.3; see https://www.mediawiki.org/wiki/MediaWiki_1.33/Roadmap for a schedule.