A few items came up in the review of T188321 that should be addressed, but they are separate from (and smaller than) the big fixes made there:
* refactor \b in regexes into a $wordBoundary variable, so that it is easy to do something smarter and more location-aware in the future (once we figure out what that is)
* add some new exceptions that came up in a last-minute review of Tatar transliteration examples, plus some more proper names
* ~~possibly figure out what to do about roman numerals. The last patch ignores roman numerals as long as they are not one letter long and followed by a period (that is, as long as it doesn't look like an initial). Possibilities include:~~
** ~~stop trying to be clever and ignore roman numerals entirely, letting editors explicitly -{mark them}- as not to be transliterated~~
** ~~only automatically block roman numerals that are two letters or longer, which really cuts down on false positives~~
** ~~stick with the current system~~
~~(I'm happy with any of the roman numeral options—we just have to decide which one is the one we want.)~~
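
To illustrate the $wordBoundary item above, here is a minimal sketch of the idea, written in Python rather than the converter's own code. The names `WORD_BOUNDARY`, `whole_word`, and `is_exception`, and the sample exception words, are made up for illustration and are not taken from the patch. The point is only that the boundary is defined in one place, so a smarter definition can be swapped in later without touching each regex.

```python
import re

# Hypothetical stand-in for the proposed $wordBoundary variable.
# Today it is just \b; a smarter definition could replace it here.
WORD_BOUNDARY = r"\b"


def whole_word(pattern: str) -> "re.Pattern[str]":
    """Compile `pattern` so that it only matches between word boundaries."""
    return re.compile(WORD_BOUNDARY + "(?:" + pattern + ")" + WORD_BOUNDARY)


# Made-up exception list, for illustration only.
SAMPLE_EXCEPTIONS = ["NATO", "XIV"]
EXCEPTION_RE = whole_word("|".join(re.escape(w) for w in SAMPLE_EXCEPTIONS))


def is_exception(text: str) -> bool:
    """Return True if `text` contains one of the sample exceptions as a whole word."""
    return EXCEPTION_RE.search(text) is not None
```

With this shape, changing how boundaries are detected means editing one constant instead of every pattern that currently hard-codes \b.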