
Add "I" as equivalent of "l"
Open, Needs Triage, Public

Description

I have a global TitleBlacklist entry for my username: .*marcoaurelio.* <newaccountonly|antispoof>. Despite that, someone managed to create a troll account today, MarcoAureIio (with an uppercase I in place of the lowercase l). Testing with https://meta.wikimedia.org/wiki/Special:AbuseFilter/tools gave the following results:

  • norm('MarcoAureIio') == MARCOAUREIO
  • ccnorm('MarcoAureIio') == MARCOAUREIIO

It looks like uppercase I should be added as an equivalent of lowercase l.
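To illustrate why the spoofed name slips through, here is a minimal Python sketch of table-based normalization. The table EQUIV and the helper normalize are hypothetical and heavily reduced; they are not the real Equivset data or the AbuseFilter implementation, and only mimic the ccnorm-style result quoted above.

```python
# Hypothetical, heavily reduced equivalence table (assumption for illustration).
# The real Equivset data contains thousands of entries.
EQUIV = {"l": "L", "i": "I"}


def normalize(name: str, table: dict[str, str]) -> str:
    """Map each character through the table, falling back to uppercasing."""
    return "".join(table.get(ch, ch.upper()) for ch in name)


blocked = normalize("marcoaurelio", EQUIV)
spoof = normalize("MarcoAureIio", EQUIV)
print(blocked, spoof, blocked == spoof)

# Assumption: one possible fix is to fold I (and therefore i) into L.
PATCHED = {**EQUIV, "I": "L", "i": "L"}
print(normalize("marcoaurelio", PATCHED) == normalize("MarcoAureIio", PATCHED))
```

Under the reduced table the two names normalize to different strings, so an antispoof comparison built on it would not flag the new account. With the assumed extra entries they collapse to the same string.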

Event Timeline

MarcoAurelio moved this task from unsorted/backlog to working on on the User-MarcoAurelio board.

Change 433006 had a related patch set uploaded (by MarcoAurelio; owner: MarcoAurelio):
[mediawiki/libs/Equivset@master] [WIP] equivset: equal vertical (|) bar to lowercase el (l)

https://gerrit.wikimedia.org/r/433006

Change 433006 abandoned by MarcoAurelio:
equivset: add some more confusable characters

https://gerrit.wikimedia.org/r/433006

I have abandoned the patch above because it gives unexpected results. If I make l equivalent to I:

equivset.in
diff --git a/data/equivset.in b/data/equivset.in
index 85e3260..cf4a7c3 100644
--- a/data/equivset.in
+++ b/data/equivset.in
@@ -49,6 +49,7 @@
 6A j => 4A J
 6B k => 4B K
 6C l => 4C L
+6C l => 49 I
 6D m => 4D M
 6E n => 4E N
 6F o => 4F O

the result is:

equivset.json (generated by bin/console generate-equivset)
diff --git a/dist/equivset.json b/dist/equivset.json
index bbe9915..47fb039 100644
--- a/dist/equivset.json
+++ b/dist/equivset.json
@@ -25,7 +25,7 @@
     "i": "I",
     "j": "J",
     "k": "K",
-    "l": "L",
+    "l": "I",
     "m": "M",
     "n": "N",
     "o": "O",
@@ -184,11 +184,11 @@
     "Ƒ": "F",
     "ƒ": "F",
     "Ɠ": "G",
-    "Ɩ": "L",
-    "Ɨ": "L",
+    "Ɩ": "I",
+    "Ɨ": "I",
     "Ƙ": "K",
     "ƙ": "K",
-    "ƚ": "L",
+    "ƚ": "I",
     "Ɲ": "N",
     "ơ": "Ơ",
     "Ƥ": "P",
@@ -932,7 +932,7 @@
     "₩": "W",
     "₷": "S",
     "₸": "T",
-    "ℓ": "L",
+    "ℓ": "I",
     "℧": "U",
     "ⅆ": "D",
     "ⅼ": "L",
@@ -5449,7 +5449,7 @@
     "i": "I",
     "j": "J",
     "k": "K",
-    "l": "L",
+    "l": "I",
     "m": "M",
     "n": "N",
     "o": "O",

which means that characters confusable with l would stop being normalized to L and would be normalized to I instead, and we don't want that.
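The override can be reproduced with a small Python sketch. It assumes a simplified model in which each source character has a single target, a later rule replaces an earlier one, and chains are followed to their end. The functions parse and resolve are hypothetical and are not the actual generate-equivset code. The two sample rules for Ɩ and ⅼ are assumptions chosen to match the diff above, not lines copied from equivset.in.

```python
def parse(lines: list[str]) -> dict[str, str]:
    """Parse 'HEX char => HEX char' rules into a source -> target map."""
    direct = {}
    for line in lines:
        left, right = line.split("=>")
        src = chr(int(left.split()[0], 16))
        dst = chr(int(right.split()[0], 16))
        direct[src] = dst  # a later rule for the same source replaces the earlier one
    return direct


def resolve(direct: dict[str, str]) -> dict[str, str]:
    """Follow each chain of rules until a character with no further rule is reached."""
    def final(ch: str) -> str:
        seen = set()
        while ch in direct and ch not in seen:
            seen.add(ch)
            ch = direct[ch]
        return ch
    return {src: final(src) for src in direct}


# Sample rules; the second and third are illustrative assumptions.
before = ["6C l => 4C L", "196 Ɩ => 6C l", "217C ⅼ => 4C L"]
after = before[:1] + ["6C l => 49 I"] + before[1:]

print(resolve(parse(before)))
print(resolve(parse(after)))
```

In this model, every character that reaches its target through l follows l to I once the new rule is added, while a character mapped directly to L is unaffected. That is the same pattern as in the generated equivset.json diff.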

MarcoAurelio raised the priority of this task from Medium to Needs Triage.

Further:

I -> lL
! -> iI

if you please, thanks.

N.B.: here lL means both l and L, and iI means both I and i.