We currently have an LTA (long-term abuser) who creates user accounts or pages containing insulting terms, often written in one of the stylised Unicode alphabets. The <antispoof> attribute is too harsh for this: many of the terms should stay allowed in normal characters, but there is definitely no need for them in Unicode lookalike characters.
What would be useful is an attribute that lets the plain (regular?) form of the text through but leverages AntiSpoof to block its Unicode equivalents; for the purposes of this ticket I am calling it <unicodeonly>. I was thinking of it acting as a restriction in the same way as the other <...only> attributes.
This approach could then map lookalike characters that sit outside the normal character range. As new Unicode sets are developed, the same entries would cover them once the code sets are upgraded, without new regex filter lines having to be created.
Examples
current style
.*(?:𝓱𝓾𝓰𝓮|𝓶𝓾𝓼𝓽|𝓭𝓲𝓮|𝓬𝓸𝓬𝓴).*
.*(?:𝕒𝕤𝕙𝕝𝕖𝕪|𝓪𝓼𝓵𝓮𝔂) <newaccountonly>
expected style
.*(?:huge|must|die|cock).* <unicodeonly>
.*asley.* <newaccountonly|unicodeonly>
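To illustrate the behaviour I have in mind, here is a rough Python sketch (not extension code). It assumes that NFKC normalisation is an acceptable stand-in for AntiSpoof's own equivalence sets, and that entries are matched against the whole title, case-insensitively. The names fold and blocked_by_unicodeonly are made up for this example.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Hypothetical stand-in for AntiSpoof normalisation: NFKC plus case folding."""
    return unicodedata.normalize("NFKC", text).casefold()


def blocked_by_unicodeonly(pattern: str, title: str) -> bool:
    """Return True only when the entry matches the folded title but not the raw one."""
    rx = re.compile(pattern, re.IGNORECASE)
    if rx.fullmatch(title):
        # The plain-character form is present as written, so it is let through.
        return False
    # The raw title did not match; block it if its folded form does.
    return rx.fullmatch(fold(title)) is not None


entry = r".*(?:huge|must|die|cock).*"
for title in ("huge fan", "𝓱𝓾𝓰𝓮 fan", "hello"):
    print(title, blocked_by_unicodeonly(entry, title))
```

One limitation of this sketch: a title that mixes a plain-character term with a Unicode-styled one is let through, because the raw title already matches. A real implementation would probably need to compare at the level of the matched text rather than the whole title.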
I would still expect that use of the local mediawiki:titlewhitelist would override the restriction of this new attribute.
I am guessing that work is needed in both the TitleBlacklist and AntiSpoof extensions to get the feature, and the character groupings, in place.