
Include a RegEx library for Lua
Closed, Declined (Public)

Description

Lua ships without RegEx support by default in order to keep the Lua runtime small. However, since we now run Lua on servers, size is not critical for us.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=47512

Details

Reference
bz50454

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 22 2014, 1:53 AM
bzimport set Reference to bz50454.
bzimport added a subscriber: Unknown Object (MLST).

On the other hand, it's very easy to create pathological regular expressions. This is much less likely with Lua patterns, and we were able to easily add checkpoints into Lua's pattern processing to allow Scribunto's CPU limiting to continue to work even if someone does manage this. Doing the same for a regex engine is likely to be more difficult.

I don't think this is going to happen.
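The checkpoint idea mentioned above can be sketched roughly as follows. This is only an illustration written in Python, not Scribunto's or LuaSandbox's actual code: the names `limited_match` and `StepLimitExceeded` are hypothetical, and the matcher is a toy backtracking engine that understands only literals, `.`, and `x*`. Every recursive call passes a checkpoint that counts one step, and the match is abandoned once a budget is exhausted.

```python
class StepLimitExceeded(Exception):
    """Raised when the matcher uses more steps than its budget allows."""


def limited_match(pattern, text, max_steps=10_000):
    """Toy backtracking search supporting literals, '.', and 'x*'.

    Returns True if the pattern matches anywhere in text, False if not,
    and raises StepLimitExceeded once max_steps checkpoints are passed.
    """
    steps = 0

    def match_here(p, t):
        nonlocal steps
        steps += 1  # checkpoint: one unit of work per call
        if steps > max_steps:
            raise StepLimitExceeded(f"gave up after {max_steps} steps")
        if p == len(pattern):
            return True
        if p + 1 < len(pattern) and pattern[p + 1] == "*":
            # Try zero, one, two, ... repetitions, backtracking on failure.
            while True:
                if match_here(p + 2, t):
                    return True
                if t < len(text) and pattern[p] in (".", text[t]):
                    t += 1
                else:
                    return False
        if t < len(text) and pattern[p] in (".", text[t]):
            return match_here(p + 1, t + 1)
        return False

    # Unanchored search: try every starting offset.
    return any(match_here(0, start) for start in range(len(text) + 1))
```

With this budget, an ordinary call such as `limited_match("a*b", "aaab")` finishes in a handful of steps, while a pattern with many adjacent stars, such as `"a*a*a*a*a*a*b"` against thirty `a` characters, raises `StepLimitExceeded` instead of running to completion. Retrofitting the same kind of counter into a full regex engine would mean finding every loop and recursion point inside it, which is the difficulty the comment above points to.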

I think web admins should have a way to dynamically link Lua libraries into both LuaSandbox and the Lua standalone engine.

@alex-mashin That's not really relevant to this bug. That's more what T63432 is asking for.

jayvdb added a subscriber: jayvdb. Dec 18 2015, 10:20 PM

As this Phabricator task doesn't show what happened: https://static-bugzilla.wikimedia.org/show_activity.cgi?id=50454 shows that @Anomie closed this as WONTFIX on 2013-12-10.

http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html looks nice, including lpeg.setmaxstack to cause crazy regex to fail.

alex-mashin added a comment. Edited Dec 19 2015, 7:42 AM

http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html looks nice, including lpeg.setmaxstack to cause crazy regex to fail.

Just like lrexlib, it won't work until you hack LuaSandbox (which I did).

He7d3r added a subscriber: He7d3r. Dec 19 2015, 7:19 PM
jeblad added a subscriber: jeblad. Edited Apr 3 2016, 2:12 AM

As Wikidata uses regex, it has become important to make the same work in Lua; that is, we must be able to reuse "format as a regular expression" (P1793), which is a PCRE pattern, and thus we need a Perl-type lib for Lua of some kind.

I think this is a blocker for use of property 1793 in Lua, but we could perhaps hack together some kind of PCRE-ish, sub-par lib in pure Lua to still be able to use the claims.
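To make the use case concrete, here is a small sketch of checking a value against a format pattern of the kind stored in P1793. It is written in Python purely for illustration. `FORMAT_PATTERN` and `conforms` are hypothetical names, the pattern is an illustrative example rather than one taken from an actual claim, and Python's `re` module is not PCRE, although it accepts this particular syntax.

```python
import re

# Hypothetical format constraint in the style of a P1793 value
# (illustrative only, not read from Wikidata).
FORMAT_PATTERN = r"\d{4}-\d{3}[\dX]"


def conforms(value, pattern=FORMAT_PATTERN):
    """Return True if the whole value matches the format pattern."""
    return re.fullmatch(pattern, value) is not None
```

Here `conforms("1234-567X")` is true and `conforms("12345678")` is false. The counted quantifier `{4}` has no direct equivalent in Lua patterns, so this pattern would have to be rewritten by hand (for example as `%d%d%d%d%-%d%d%d[%dX]`). Patterns that use alternation with `|` cannot be expressed as a single Lua pattern at all, which is why stored PCRE patterns cannot simply be passed to Lua's string functions.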

thus we need a Perl-type lib for Lua of some kind.

Consider rrthomas.github.io/lrexlib.

Or better, make it possible to enable any external libraries for Lua standalone and sandbox by MediaWiki settings.

Uanfala added a subscriber: Uanfala. Apr 4 2016, 7:04 PM

This should be fixed. Of course, wiki-breaking regexes exist, but we face the same problem as we did when we tried to deny template developers parser functions on the grounds that they were idiots and would break the whole wiki: namely, that someone will write Module:RegEx.

And here is some MIT-licensed glue to both POSIX and PCRE libraries: http://rrthomas.github.io/lrexlib/

This needs a solution!