
Equivset should normalize some diacriticals
Closed, Resolved, Public

Description

Abuse Filter ccnorm should normalize some diacriticals, such as:

Ċ -> C
Ď -> D
È -> E
Ê -> E
ǝ -> E
Ĥ -> H
Ñ -> N
Ň -> N
ᴬ -> A
ᴰ -> D
ᴱ -> E
ᴴ -> H
ᴸ -> L
ᴹ -> M
ᴿ -> R
ᵀ -> T
ᶜ -> C
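
For illustration only, here is a minimal Python sketch of the requested folding. It is not how Equivset works (Equivset is a PHP library driven by its own mapping data); it assumes Unicode NFKD decomposition is an acceptable stand-in, and the names `fold` and `EXTRA_MAP` are hypothetical. It shows that most characters in the list above decompose to a base letter plus combining marks or a superscript form, while ǝ has no decomposition and needs an explicit entry.

```python
import unicodedata

# Hypothetical explicit overrides for characters that Unicode does not
# decompose. U+01DD (ǝ, LATIN SMALL LETTER TURNED E) is one such case.
EXTRA_MAP = {"\u01dd": "E"}


def fold(text):
    """Fold each character to an uppercase base letter where possible."""
    out = []
    for ch in text:
        if ch in EXTRA_MAP:
            out.append(EXTRA_MAP[ch])
            continue
        # NFKD splits accented letters into base + combining marks and
        # maps modifier letters such as U+1D2C (ᴬ) to their plain form.
        decomposed = unicodedata.normalize("NFKD", ch)
        # Drop the combining marks (non-zero canonical combining class).
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        # Uppercase to match the targets used in the list above.
        out.append(base.upper())
    return "".join(out)
```

With this sketch, `fold("ĊĎÈÊĤÑŇ")` gives `"CDEEHNN"`, and the modifier letters from the list fold to `"ADEHLMRTC"`. A table-based approach like Equivset's can also cover look-alikes that decomposition never reaches, which this sketch only handles through `EXTRA_MAP`.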

Event Timeline

He7d3r renamed this task from "Abuse Filter ccnorm should should normalize some diacriticals" to "Abuse Filter ccnorm should normalize some diacriticals". Sep 10 2017, 5:10 PM
He7d3r updated the task description.
Huji renamed this task from "Abuse Filter ccnorm should normalize some diacriticals" to "Equivset should normalize some diacriticals". Apr 13 2018, 12:52 PM
Huji removed a project: AbuseFilter.
ǝ -> E

There is already another mapping for that character.
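
A rough sketch of why an existing mapping matters, assuming a simple one-character-to-one-target table. The helper `add_mapping` and the placeholder target `"X"` are hypothetical and do not reflect Equivset's actual data or code; the point is only that a character can fold to a single target, so a second, different target has to be rejected or reconciled.

```python
def add_mapping(table, char, target):
    """Register char -> target, refusing a conflicting second target."""
    existing = table.get(char)
    if existing is not None and existing != target:
        raise ValueError(
            "U+%04X already maps to %r, cannot also map to %r"
            % (ord(char), existing, target)
        )
    table[char] = target


# Hypothetical usage: "X" stands in for whatever the existing target is.
mappings = {}
add_mapping(mappings, "\u01dd", "X")
```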

Change 818287 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/libs/Equivset@master] Expand set for lower/upper case characters which are alone in the set

https://gerrit.wikimedia.org/r/818287

Change 818287 merged by jenkins-bot:

[mediawiki/libs/Equivset@master] Expand set for lower/upper case characters which are alone in the set

https://gerrit.wikimedia.org/r/818287

Ċ -> C
Ď -> D
È -> E
Ê -> E
ǝ -> E
Ĥ -> H
Ñ -> N
Ň -> N

are now part of Equivset; a new release is needed to get them working in AbuseFilter on WMF wikis.

Some letters are still to be done.

Change 904823 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/libs/Equivset@master] Add characters from the "Phonetic Extensions" Unicode Block (1D00-1DBF)

https://gerrit.wikimedia.org/r/904823

Change 904823 merged by jenkins-bot:

[mediawiki/libs/Equivset@master] Add characters from the "Phonetic Extensions" Unicode Block (1D00-1DBF)

https://gerrit.wikimedia.org/r/904823

All letters mentioned in this task are now part of Equivset; a new release is needed to get them working in AbuseFilter on WMF wikis.