Currently we have an LTA who likes to create user accounts or pages with insulting terms, and does so in several varieties of Unicode text. Using the <antispoof> attribute is too harsh, as many of the terms are ones we wish to allow in normal characters, though we definitely don't need them in Unicode lookalike characters.
What would be useful is an attribute that lets the simpler (regular?) versions of the text through but can leverage AntiSpoof to block the Unicode equivalents; for the purposes of this ticket I am calling it <unicodeonly>. I was thinking of something that restricts in a similar way to the other <...only> attributes.
This approach could then map lookalike characters outside the normal character range, and as new Unicode sets are developed it could be applied to them as we upgrade code sets, without new regex filter lines having to be created.
For example, entries such as
.*(?:𝓱𝓾𝓰𝓮|𝓶𝓾𝓼𝓽|𝓭𝓲𝓮|𝓬𝓸𝓬𝓴).*
.*(?:𝕒𝕤𝕝𝕖𝕪|𝓪𝓼𝓵𝓮𝔂) <newaccountonly>
could instead be written as
.*(?:huge|must|die|cock).* <unicodeonly>
.*asley.* <newaccountonly|unicodeonly>
I would still expect the local MediaWiki:Titlewhitelist to override the restriction imposed by this new attribute.
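To illustrate the behaviour I have in mind, here is a rough Python sketch. It is not how TitleBlacklist or AntiSpoof work internally: the function names and pattern lists are made up for this ticket, NFKC normalization is only a stand-in for AntiSpoof's own character equivalence mapping (it folds the mathematical alphabets used above but not, say, Cyrillic lookalikes), and the whole-title, case-insensitive matching is an assumption.

```python
import re
import unicodedata

# Hypothetical stand-ins for blacklist entries carrying the proposed
# <unicodeonly> attribute, written in plain characters only.
UNICODE_ONLY_PATTERNS = [
    re.compile(r".*(?:huge|must|die|cock).*", re.IGNORECASE),
    re.compile(r".*asley.*", re.IGNORECASE),
]

# Hypothetical stand-in for entries on the local title whitelist.
WHITELIST_PATTERNS = []


def fold(title):
    """Fold lookalike characters to plain ones.

    NFKC plus case folding is an assumption used here in place of
    AntiSpoof's real equivalence sets.
    """
    return unicodedata.normalize("NFKC", title).casefold()


def is_blocked(title, whitelist=WHITELIST_PATTERNS):
    """Return True if a <unicodeonly> entry should block this title."""
    # The whitelist wins over the new attribute.
    if any(p.fullmatch(title) for p in whitelist):
        return False
    folded = fold(title)
    # If folding changed nothing, the title is already in plain
    # characters and <unicodeonly> entries do not apply to it.
    if folded == title.casefold():
        return False
    # Otherwise match the plain-character patterns against the folded form.
    return any(p.fullmatch(folded) for p in UNICODE_ONLY_PATTERNS)
```

With this sketch, is_blocked("𝓱𝓾𝓰𝓮") is true while is_blocked("huge") is false, and a whitelist entry matching the title lets it through regardless.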
I am guessing that work is needed in both the TitleBlacklist and AntiSpoof extensions to get the feature and the character groupings in place.