In JS regexes, \w and \b are ASCII-only: \w matches just [A-Za-z0-9_], so characters like é count as non-word characters. As a result, in the string foo café bar there is no word boundary between the é and the following space, because both are non-word characters. So \bfoo\b and \bbar\b match this string, but \bcafé\b doesn't. Since our current code uses '\b' + phrase + '\b' to find phrases, any phrase that begins or ends with such a "special" character can't be found when it stands alone as a word (or worse, it is found only inside a longer word: \bcafé\b matches cafés, because there is a boundary between é and s).
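A minimal sketch to reproduce the behaviour described above. The helper name findPhrase is hypothetical and only mirrors the '\b' + phrase + '\b' pattern; it is not our actual code, and it assumes the phrase contains no regex metacharacters.

```javascript
// Hypothetical helper mirroring the '\b' + phrase + '\b' pattern.
// Assumption: phrase contains no regex metacharacters (no escaping is done).
function findPhrase(text, phrase) {
  return new RegExp('\\b' + phrase + '\\b').test(text);
}

const text = 'foo café bar';

// Phrases made only of ASCII word characters are found as expected.
const fooFound = findPhrase(text, 'foo');
const barFound = findPhrase(text, 'bar');

// é is not matched by \w, so there is no boundary between é and the space.
const cafeFound = findPhrase(text, 'café');

// False positive: there is a boundary between é and s, so the phrase
// is found inside the longer word "cafés".
const cafeInsideWord = findPhrase('two cafés', 'café');

console.log({ fooFound, barFound, cafeFound, cafeInsideWord });
```

With the inputs above, the standalone word is missed while the occurrence inside a longer word is reported, which is the opposite of what a phrase search wants.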
We should probably deal with this by not using word boundaries at all, and instead using the other properties listed in T267329 to avoid false positives.