While not used in production, CirrusSearch has a fallback implementation of insource regex search written in Groovy. This should be rewritten in the new 'Painless' language that Elasticsearch introduced, which is reasonably similar to Groovy but is sandboxed (and as such, enabled by default).
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | debt | T151324 [epic] System level upgrade for cirrus / elasticsearch
Resolved | | Deskana | T154501 [Epic, Q3 Goal] Upgrade search systems to Elasticsearch 5
Declined | | EBernhardson | T156192 Rewrite regex fallback groovy script in painless
Event Timeline
Not entirely sure this is worthwhile, or whether we should instead drop support for insource regex without the extra plugin. Painless, while enabled by default, does not enable regex by default. If we are going to require users to apply custom configuration to their Elasticsearch server, we might as well go all the way and require the extra plugin rather than setting some particular flag.
Regexes are disabled by default because they circumvent Painless’s protection against long running and memory hungry scripts. To make matters worse even innocuous looking regexes can have staggering performance and stack depth behavior. They remain an amazing powerful tool but are too scary to enable by default. To enable them yourself set script.painless.regex.enabled: true in elasticsearch.yml. We’d like very much to have a safe alternative implementation that can be enabled by default so check this space for later developments!
In addition to the above caveat:
Patterns can only be created via this mechanism. This ensures fast performance, regular expressions are always constants and compiled efficiently a single time.
Basically, regexes have to be constants in the script source; they cannot be provided in a variable. This means we would have to escape the user input so that it can be included directly in the script text, rather than passed as a variable to the script. This seems like too big an opportunity for a security hole, even if Painless is sandboxed such that users shouldn't be able to hurt anything. I'm going to decline this task and leave the Groovy implementation in place. Users who want a safe and performant regex implementation should install the extra plugin.
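To make the escaping concern concrete, here is a minimal Python sketch. It is not CirrusSearch code: `build_script_naively` and `params.text` are made-up names, and it only shows the string that would be sent as script source, assuming the user's regex is pasted between the `/` delimiters of a Painless regex literal.

```python
def build_script_naively(user_regex: str) -> str:
    """Hypothetical helper: paste a user-supplied regex straight into a
    Painless regex literal (/.../) with no escaping at all."""
    # params.text is an assumed script parameter holding the text to search.
    return "return params.text =~ /" + user_regex + "/;"


# A harmless pattern stays inside the literal.
benign = build_script_naively("foo.*bar")
print(benign)   # return params.text =~ /foo.*bar/;

# A pattern containing "/" closes the literal early, so everything after
# it is no longer regex text but part of the script source itself.
hostile = build_script_naively("x/ || true; //")
print(hostile)  # return params.text =~ /x/ || true; ///;
```

Any real implementation would need an escaping step that is correct for every possible input, which is exactly the part that looks risky.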
Based on this I think declining this task is the appropriate resolution.
Based on T156192#3077620 and there being no objections, I'm marking this as declined.