[Bug] Format constraint incorrectly interprets regexes containing unwrapped "|"
Closed, Resolved · Public

Description

To check the text ABCxyz against the regular expression [A-Z]{3}|\d{6}, we use the following SPARQL query:

SELECT (REGEX("ABCxyz", "^[A-Z]{3}|\\d{6}$") AS ?matches) {}

This incorrectly returns true (resulting in a “compliance” report), because | binds more loosely than the ^ and $ anchors: the full regex means “starts with [A-Z]{3}, or ends with \d{6}”. What we actually need is this:

SELECT (REGEX("ABCxyz", "^(?:[A-Z]{3}|\\d{6})$") AS ?matches) {}

to make sure that the ^ and $ anchors really apply to the whole regex. (Make sure to use a non-capturing group – some format constraints use capturing groups and backreferences, so we mustn’t change the number of capturing groups.)
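
The precedence problem is not specific to SPARQL, so it can be reproduced outside the query service. The following is a minimal sketch in Python, chosen only for illustration: SPARQL's REGEX uses XPath/XQuery regular expression syntax, which differs from Python's `re` in other details. The helper names `matches_naive` and `matches_wrapped` are hypothetical and do not exist in the extension.

```python
import re

# The format constraint's regex from the example above.
PATTERN = r"[A-Z]{3}|\d{6}"


def matches_naive(text, pattern):
    # Mirrors the buggy query: the anchors are simply concatenated,
    # so "^" only applies to the first alternative and "$" to the last.
    return re.search("^" + pattern + "$", text) is not None


def matches_wrapped(text, pattern):
    # Mirrors the fixed query: a non-capturing group makes the anchors
    # apply to the whole alternation.
    return re.search("^(?:" + pattern + ")$", text) is not None


print(matches_naive("ABCxyz", PATTERN))    # True (the false "compliance")
print(matches_wrapped("ABCxyz", PATTERN))  # False
print(matches_wrapped("ABC", PATTERN))     # True
print(matches_wrapped("123456", PATTERN))  # True
```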

Details

Related Gerrit Patches:
mediawiki/extensions/WikibaseQualityConstraints : master · Add non-capturing group around regular expressions

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 28 2018, 6:51 PM
thiemowmde triaged this task as Medium priority. · Mar 1 2018, 12:15 PM
thiemowmde added a project: good first task.
thiemowmde moved this task from incoming to consider for next sprint on the Wikidata board.
thiemowmde added a subscriber: thiemowmde.

Yea, simply add (?:…) around all regexes, no matter what they contain. This is always fine.

The code for this can be found in \WikibaseQuality\ConstraintReport\ConstraintCheck\Helper\SparqlHelper::matchesRegularExpressionWithSparql.
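
As a rough illustration of the wrapping step (not the extension's actual PHP code), the sketch below builds the query string in Python and checks that a backreference still points at the same group after wrapping. The function names `wrap_regex`, `sparql_string_literal` and `build_query` are hypothetical, and the string-literal escaping is an assumption that covers only backslashes and double quotes.

```python
import re


def wrap_regex(pattern):
    # Anchor the pattern to the whole text. The group is non-capturing,
    # so the numbering of existing capturing groups is unchanged.
    return "^(?:" + pattern + ")$"


def sparql_string_literal(value):
    # Assumption: escaping backslashes and double quotes is enough for
    # this illustration; a real implementation must handle more cases.
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_query(text, pattern):
    # Produces a query of the same shape as the one in the description.
    return "SELECT (REGEX({}, {}) AS ?matches) {{}}".format(
        sparql_string_literal(text),
        sparql_string_literal(wrap_regex(pattern)),
    )


print(build_query("ABCxyz", r"[A-Z]{3}|\d{6}"))
# SELECT (REGEX("ABCxyz", "^(?:[A-Z]{3}|\\d{6})$") AS ?matches) {}

# A pattern with a capturing group and a backreference: \1 still refers
# to ([a-z]) after wrapping, because no capturing group was added.
doubled = r"([a-z])\1|\d{2}"
print(re.compile(wrap_regex(doubled)).groups)             # 1
print(re.search(wrap_regex(doubled), "aa") is not None)   # True
print(re.search(wrap_regex(doubled), "aaX") is not None)  # False
```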

thiemowmde renamed this task from Format constraint incorrectly interprets regexes containing unwrapped "|" to [Bug] Format constraint incorrectly interprets regexes containing unwrapped "|". · Mar 1 2018, 12:15 PM
thiemowmde added a subscriber: Jonas.

Change 439583 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Add non-capturing group around regular expressions

https://gerrit.wikimedia.org/r/439583

Change 439583 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Add non-capturing group around regular expressions

https://gerrit.wikimedia.org/r/439583

Lucas_Werkmeister_WMDE closed this task as Resolved. · Jun 11 2018, 12:35 PM
Lucas_Werkmeister_WMDE claimed this task.