
Add more character mappings to AntiSpoof
Closed, Resolved | Public | 1 Estimated Story Points

Description

Currently, AntiSpoof maps ك to ک.

It should also map the following:

ڪ
ك

This will allow ccnorm() to be used to capture all of these using AbuseFilter, instead of using hard-to-read regex patterns like [کكڪﻙﻚ].
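As a rough illustration of why a normalization step is easier to read than a character class, here is a minimal Python sketch. The `KAF_EQUIV` table and the `normalize_kaf` function are hypothetical names made up for this example; this is not AntiSpoof's equivset format or the actual ccnorm() implementation, only the general idea of folding look-alike characters onto one canonical letter.

```python
# Hypothetical equivalence table: each variant is folded onto U+06A9 (keheh).
# The code points are the ones that appear in the regex [کكڪﻙﻚ] above.
KAF_EQUIV = {
    "\u0643": "\u06A9",  # ARABIC LETTER KAF
    "\u06AA": "\u06A9",  # ARABIC LETTER SWASH KAF
    "\uFED9": "\u06A9",  # ARABIC LETTER KAF ISOLATED FORM
    "\uFEDA": "\u06A9",  # ARABIC LETTER KAF FINAL FORM
}


def normalize_kaf(text: str) -> str:
    """Replace every known kaf variant with the canonical letter."""
    return "".join(KAF_EQUIV.get(ch, ch) for ch in text)


# Two spellings that differ only in the kaf variant compare equal
# once both sides are normalized.
persian_spelling = "\u06A9\u062A\u0627\u0628"
arabic_spelling = "\u0643\u062A\u0627\u0628"
same_after_folding = normalize_kaf(persian_spelling) == normalize_kaf(arabic_spelling)
```

With a table like this, a filter only has to compare normalized strings and never needs to list every variant in a pattern.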

Also: the Cyrillic letter Д д should be added to the AntiSpoof equivset for A.

Event Timeline

Huji triaged this task as Low priority.
Huji updated the task description.
Huji added a subscriber: Yamaha5.

For Persian and Arabic:
All Arabic-family characters are listed here.
I checked the table and the numbers; there are some other similar-looking characters that have different Unicode code points:

ۀ = \u06C0
ۂ = \u06C2
هٔ = \u0647 + \u0654

إ = \u0625
ٳ = \u0673

ٲ = \u0672
أ = \u0623
ٵ = \u0675

، = \u060C
٬ = \u066C
٫ = \u066B

064E
0659

ڼ =\u06BC
ڹ = \u06B9

06EC
06E0
06F0
0660
06DF
06EB
06EA
. = (dot)

0674
0655
0654
065F
0621

٭ = \u066D

  • = *

Persian digits mostly have the same shapes as Arabic digits, but their Unicode code points are different!
Persian numbers = ۹ ۸ ۷ ۶ ۵ ۴ ۳ ۲ ۱ ۰
Arabic numbers = ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩
Persian numbers' Unicode = \u06F9 \u06F8 \u06F7 \u06F6 \u06F5 \u06F4 \u06F3 \u06F2 \u06F1 \u06F0
Arabic numbers' Unicode = \u0660 \u0661 \u0662 \u0663 \u0664 \u0665 \u0666 \u0667 \u0668 \u0669
You can check them here.
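To make the digit point concrete, here is a small Python sketch that folds the Persian digits (U+06F0 to U+06F9) onto the Arabic ones (U+0660 to U+0669). The names `PERSIAN_TO_ARABIC` and `fold_persian_digits` are invented for this example, and the choice of the Arabic digits as the target is only an assumption; it does not reflect what AntiSpoof actually maps them to.

```python
# Build the two digit ranges from the code points listed above.
ARABIC_DIGITS = [chr(0x0660 + i) for i in range(10)]
PERSIAN_DIGITS = [chr(0x06F0 + i) for i in range(10)]

# Hypothetical translation table: Persian digit -> Arabic digit of the same value.
PERSIAN_TO_ARABIC = {ord(p): a for p, a in zip(PERSIAN_DIGITS, ARABIC_DIGITS)}


def fold_persian_digits(text: str) -> str:
    """Replace Persian digits with their Arabic counterparts."""
    return text.translate(PERSIAN_TO_ARABIC)


# The two "5" characters look alike but are different code points,
# so a plain string comparison treats them as different.
looks_equal_but_differs = "\u06F5" != "\u0665"
equal_after_folding = fold_persian_digits("\u06F5") == "\u0665"
```

Without a mapping like this, two usernames that differ only in which digit block they use would be treated as unrelated strings.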

Change 373596 had a related patch set uploaded (by Huji; owner: Huji):
[mediawiki/extensions/AntiSpoof@master] Add more Persian characeter mappings to AntiSpoof

https://gerrit.wikimedia.org/r/373596

Thanks @Yamaha5 I will create one or more separate patches for those groups as well.

TBolliger renamed this task from Add more Persian character mappings to AntiSpoof to Add more character mappings to AntiSpoof. Sep 8 2017, 7:59 PM
TBolliger updated the task description.

Merging in a near-identical ticket. We'll tackle both at the same time.

Change 373596 merged by jenkins-bot:
[mediawiki/extensions/AntiSpoof@master] Add more Persian characeter mappings to AntiSpoof

https://gerrit.wikimedia.org/r/373596

dmaza moved this task from Code Review to Done on the Anti-Harassment (AHT Sprint 6) board.