AntiSpoof's current design only allows replacing one character with another. What I am proposing here is to replace a character with nothing (i.e. delete it), so we need new functionality for that.
Example: The following texts look the same but are different: ABCD , ABCD
The first one is just the four characters A to D; the second has a ZWNJ (zero-width non-joiner, &zwnj;, Unicode code point U+200C) between each pair of adjacent letters. Replacing ZWNJ with another character is NOT the solution. Instead, we need a function that can replace ZWNJ with nothing.
This applies to at least the following characters, all of which are "invisible" (i.e. they don't have any width):
- Zero-width space (200B)
- Zero-width non-joiner (200C)
- Zero-width joiner (200D)
- Left-to-right mark (200E)
- Right-to-left mark (200F)
- Line separator (2028)
- Paragraph separator (2029)
- Left-to-right embedding (202A)
- Right-to-left embedding (202B)
- Left-to-right override (202D)
- Right-to-left override (202E)
And perhaps:
- Left-to-right isolate (2066)
- Right-to-left isolate (2067)
- First strong isolate (2068)
- Pop directional isolate (2069)
The top table on https://en.wikibooks.org/wiki/Unicode/Character_reference/2000-2FFF can be a good reference.
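To illustrate what "replace with nothing" could look like, here is a minimal sketch in Python. AntiSpoof itself is written in PHP and this is not its code; the names `INVISIBLE_CODE_POINTS`, `OPTIONAL_CODE_POINTS` and `strip_invisible` are made up for this example. The code points are exactly the ones listed above, with the "perhaps" group kept separate so it can be switched off.

```python
# Hypothetical sketch, not AntiSpoof code: delete invisible characters
# before any character-to-character normalization is applied.

# Code points from the "at least" list above.
INVISIBLE_CODE_POINTS = [
    0x200B,  # zero-width space
    0x200C,  # zero-width non-joiner
    0x200D,  # zero-width joiner
    0x200E,  # left-to-right mark
    0x200F,  # right-to-left mark
    0x2028,  # line separator
    0x2029,  # paragraph separator
    0x202A,  # left-to-right embedding
    0x202B,  # right-to-left embedding
    0x202D,  # left-to-right override
    0x202E,  # right-to-left override
]

# Code points from the "and perhaps" list above.
OPTIONAL_CODE_POINTS = [
    0x2066,  # left-to-right isolate
    0x2067,  # right-to-left isolate
    0x2068,  # first strong isolate
    0x2069,  # pop directional isolate
]


def strip_invisible(text: str, include_optional: bool = True) -> str:
    """Return text with the listed invisible characters removed."""
    code_points = list(INVISIBLE_CODE_POINTS)
    if include_optional:
        code_points += OPTIONAL_CODE_POINTS
    # str.translate deletes every character whose code point maps to None.
    return text.translate(dict.fromkeys(code_points, None))


plain = "ABCD"
spoofed = "A\u200cB\u200cC\u200cD"  # ZWNJ between each pair of letters

# The two strings render the same but compare as different...
assert plain != spoofed
# ...until the invisible characters are replaced with nothing.
assert strip_invisible(spoofed) == plain
```

The point of the sketch is only that deletion is a separate step from the existing one-to-one mapping: the mapping table has no way to express "map this character to the empty string", so the stripping has to happen in its own pass.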