Title of invitation list not displaying in Special:MyInvitationLists [1 day investigation]
Closed, Resolved · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

Create an invitation list on Hindi Wikipedia using the following list, and paste the same list into the invitation list's title field:

भारत
उत्तर प्रदेश
भीमराव आम्बेडकर
चन्द्रगुप्त मौर्य
ससुराल सिमर का
यादव
एक दिवसीय अंतरराष्ट्रीय क्रिकेट के कीर्तिमानों की सूची
बिहार
राजस्थान
राम
चमार
अहीर
भारत में स्थानीय वक्ताओं की संख्यानुसार भाषाओं की सूची
ब्राह्मण
पृथ्वीराज चौहान
महाभारत
प्रेमचंद
बिग बॉस 15: संकट में जंगल

What happens?:
On Special:MyInvitationLists, the title of the list does not appear. As a result, the list cannot be accessed, because there is no link to click.

What should have happened instead?:

  • User should see an error message when invitation list cannot be generated due to lengthy invitation list title. The error message can read: "Add a shorter invitation list name (maximum 255 bytes)."
  • Note that, as decided in T393810#10910190, when entering text in the field you will be able to enter up to 255 "characters", not bytes; if you enter more than 255 bytes, you will still see an error upon submission

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
1.44.0-wmf.28 (d4283b9)
22:40, 5 May 2025

Other information (browser name/version, screenshots, etc.):

GIF: Special:MyInvitationLists (with missing link to the lists)
Screen Recording 2025-05-09 at 1.28.14 PM.gif (3,456×1,700 px, 844 KB)
Screenshot 2025-05-09 at 1.29.35 PM.png (2,672×1,692 px, 291 KB)

This Hindi list can also be used as the title on test.wikipedia.org to reproduce the issue; in that case, use the following as the invitation list:

Main Page
Sandbox

It may be easier to reproduce the issue there, because you then don't need the organizer right on Hindi Wikipedia.

Event Timeline

ifried renamed this task from "Title of invitation list not displaying in Special:MyInvitationLists" to "Title of invitation list not displaying in Special:MyInvitationLists [1 day investigation]". May 12 2025, 4:31 PM
cmelo changed the task status from Open to In Progress. May 14 2025, 2:01 AM
cmelo claimed this task.

Hi @vaughnwalters, I think I understand what is happening. The limit for this field is 255 bytes, and the string below, used for testing, exceeds that limit:

भारत
उत्तर प्रदेश
भीमराव आम्बेडकर
चन्द्रगुप्त मौर्य
ससुराल सिमर का
यादव
एक दिवसीय अंतरराष्ट्रीय क्रिकेट के कीर्तिमानों की सूची
बिहार
राजस्थान
राम
चमार
अहीर
भारत में स्थानीय वक्ताओं की संख्यानुसार भाषाओं की सूची
ब्राह्मण
पृथ्वीराज चौहान
महाभारत
प्रेमचंद
बिग बॉस 15: संकट में जंगल

The validation currently fails to tell the user that the byte limit has been exceeded.

There's another factor: each Devanagari character in Hindi text takes 3 bytes in UTF-8 (spaces and ASCII punctuation take 1), unlike unaccented English letters, which take 1 byte each. So for Hindi, the maximum number of characters would be approximately 255 / 3 = 85.

I'm adding a patch to display an error message when the 255-byte limit is reached.
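
As a rough illustration of the byte-based check described here (a Python sketch, not the actual CampaignEvents patch): the function names are hypothetical, and the error text is the wording proposed in the task description.

```python
from typing import Optional

MAX_NAME_BYTES = 255  # byte limit quoted in this task


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def validate_list_name(name: str) -> Optional[str]:
    """Hypothetical validator: return an error message if the name is too long, else None."""
    if utf8_len(name) > MAX_NAME_BYTES:
        return "Add a shorter invitation list name (maximum 255 bytes)."
    return None


print(validate_list_name("a" * 255))  # 255 ASCII letters = 255 bytes, accepted
print(validate_list_name("क" * 85))   # 85 Devanagari letters = 255 bytes, accepted
print(validate_list_name("क" * 86))   # 86 Devanagari letters = 258 bytes, rejected
```

The same 255-byte budget therefore admits 255 ASCII letters but only 85 Devanagari letters, which is why a character-based limit cannot stand in for the byte limit.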

But here's the tricky part: what should the error message say?
We cannot say the limit is 255 characters, because that isn't accurate for all languages, as shown in the examples above.
On the other hand, saying "255 bytes" might be too technical; many users may not understand what that means.

What do you think would be a good message to handle this situation, @ifried, @JFernandez-WMF, @vaughnwalters?

Maybe something like:
The name you entered is too long. Please shorten it and try again.

| Character type | Typical bytes in UTF-8 |
| --- | --- |
| Basic Latin letters (A-Z, a-z) | 1 byte |
| Numbers (0-9) | 1 byte |
| ASCII symbols (!@#$%^&*...) | 1 byte |
| Latin-1 Supplement (é, ñ, ç, ü) | 2 bytes |
| Cyrillic (Russian, Ukrainian, etc.) | 2 bytes |
| Greek letters | 2 bytes |
| Arabic letters | 2 bytes |
| Hebrew letters | 2 bytes |
| Hindi, Sanskrit (Devanagari script) | 3 bytes |
| Chinese, Japanese, Korean (CJK) | 3 bytes |
| Emojis | 4 bytes |

Examples:

| Language | Character | Bytes in UTF-8 |
| --- | --- | --- |
| English | A | 1 byte |
| Portuguese | ç | 2 bytes |
| Russian | Д | 2 bytes |
| Arabic | ع | 2 bytes |
| Hindi | | 3 bytes |
| Chinese | | 3 bytes |
| Emoji | 😊 | 4 bytes |
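
The byte counts in these examples can be checked with a few lines of Python. This is only a sketch: the Hindi and Chinese sample characters below are illustrative picks, not taken from the table.

```python
# Sample characters; "भ" and "中" are illustrative choices (assumptions),
# the other characters are the ones listed in the examples.
samples = {
    "English": "A",
    "Portuguese": "ç",
    "Russian": "Д",
    "Arabic": "ع",
    "Hindi": "भ",
    "Chinese": "中",
    "Emoji": "😊",
}

# Map each language to the UTF-8 byte length of its sample character.
byte_lengths = {lang: len(ch.encode("utf-8")) for lang, ch in samples.items()}

for lang, ch in samples.items():
    print(f"{lang}: {ch!r} -> {byte_lengths[lang]} byte(s)")
```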

Change #1145382 had a related patch set uploaded (by Cmelo; author: Cmelo):

[mediawiki/extensions/CampaignEvents@master] Add the maxlengh validation for invitation list name

https://gerrit.wikimedia.org/r/1145382

> The limit for this field is 255 bytes, and the string below, used for testing, exceeds that limit: [...] each Hindi character typically counts as 3 bytes [...]

This is correct, but it doesn't explain why the (truncated) title isn't shown in the interface, so I dug in a bit. When the string from the task description is truncated to 255 bytes, the truncation happens in the middle of a Devanagari character, hence producing invalid UTF-8. With a replacement character added to highlight the truncation, the name would be: भारत उत्तर प्रदेश भीमराव आम्बेडकर चन्द्रगुप्त मौर्य ससुराल सिमर का यादव एक दिवसीय अंतरराष्ट्र�. On its own, this isn't enough to explain the bug, since one would still expect the truncated version to be displayed in the UI. The problem is that the invitation list name, being user-generated, is always escaped before display:

  • In SpecialInvitationList, where it's set as page title, setPageTitleMsg uses Message::escaped(), which uses htmlspecialchars( $string, ENT_QUOTES, 'UTF-8', false )
  • In SpecialMyInvitationLists, links are generated via LinkRenderer::makeKnownLink, which ultimately uses HtmlArmor::getHTML which calls htmlspecialchars( $input, ENT_QUOTES )

Quoting from the PHP manual:

If the input string contains an invalid code unit sequence within the given encoding an empty string will be returned, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set.

In both cases, we are passing explicit flags to htmlspecialchars (ENT_QUOTES), meaning neither of the two flags above is passed, and therefore the empty string is returned and "displayed". So this explains why it's always empty.
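
The chain of events can be imitated in Python. This is a sketch only: `escape_like_htmlspecialchars` is a hypothetical stand-in that mimics the behaviour quoted from the PHP manual, it is not PHP's `htmlspecialchars`, and the test name is synthetic rather than the title from the task.

```python
import html


def escape_like_htmlspecialchars(raw: bytes, substitute: bool = False) -> str:
    """Hypothetical imitation of the documented PHP behaviour (not PHP itself)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        if not substitute:
            # Mirrors "an empty string will be returned" for invalid input.
            return ""
        # Roughly what ENT_SUBSTITUTE does: invalid bytes become U+FFFD.
        text = raw.decode("utf-8", errors="replace")
    return html.escape(text, quote=True)


# Synthetic name: 1 ASCII byte + 85 Devanagari letters * 3 bytes = 256 bytes.
name = "a" + "क" * 85
# Naive byte truncation keeps only 2 of the last letter's 3 bytes.
stored = name.encode("utf-8")[:255]

empty_result = escape_like_htmlspecialchars(stored)
substituted_result = escape_like_htmlspecialchars(stored, substitute=True)

print(repr(empty_result))            # nothing left to display
print(repr(substituted_result[-1]))  # the replacement character
```

With strict handling the whole name vanishes, which matches the empty link text seen on Special:MyInvitationLists; with substitution only the damaged final character is affected.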

(Also: it's no longer relevant now that we're fully PHP 8.1+, but the default flags starting with PHP 8.1 include ENT_SUBSTITUTE, whereas they didn't in previous versions.)


> what should the error message say?

Note that by using the maxlength attribute, together with the jquery.lengthLimit module, we can prevent longer inputs from being submitted. Longer inputs would only get through for no-JS users, or in case of manual messing with the form. Those would have to be truncated anyway. With that being said, I do think it would be nice to still provide an error message.
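
For the inputs that would have to be truncated anyway, one way to avoid cutting a character in half is to drop the incomplete trailing bytes. A minimal Python sketch follows; `truncate_utf8` is a hypothetical helper, not the function the extension uses.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Truncate to at most max_bytes of UTF-8 without splitting a code point."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # The slice may end inside a multi-byte sequence; errors="ignore"
    # discards those incomplete trailing bytes instead of raising.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


name = "a" + "क" * 85              # 256 bytes in UTF-8
short = truncate_utf8(name, 255)   # the last letter is dropped whole
print(len(short.encode("utf-8")))  # 253
```

This only protects code points. A vowel sign or virama can still be separated from its base consonant, so the visible result may end on an incomplete syllable.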

> On the other hand, saying "255 bytes" might be too technical; many users may not understand what that means.

That which we call a byte, by any other word would taste as sweet. The limit is, and has to be, in bytes. I don't think there's any way around it. The alternative would be to set a character limit equal to at most column_length_in_bytes / 4, rounded down; this is similar to what is done for edit summaries etc. But I don't think we need to do that (also, it would only leave space for 63 characters, which isn't too short, but not very generous either).
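
The worst-case arithmetic behind that alternative can be checked quickly. This is a sketch that assumes the 255-byte column discussed in this task and 4 bytes as the largest UTF-8 encoding of a single character.

```python
COLUMN_BYTES = 255      # column length in bytes, as discussed in this task
MAX_BYTES_PER_CHAR = 4  # longest UTF-8 encoding of one code point

# Largest character count that fits even if every character needs 4 bytes.
safe_chars = COLUMN_BYTES // MAX_BYTES_PER_CHAR
print(safe_chars)

fits = len(("😊" * safe_chars).encode("utf-8"))            # bytes used by the worst case
overflows = len(("😊" * (safe_chars + 1)).encode("utf-8"))  # one character more
print(fits, overflows)
```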

I do realize it is technical, but I don't think we can do anything about it. At least, people with a technical background will know exactly what to do; and if someone non-technical is curious, they can google "what is a byte" and learn something in the process. Worst case scenario, they won't understand the byte thing, but they should at least infer that the name is too long and they have to shorten it (if the copy is sufficiently clear), which is exactly what the proposed "Add a shorter invitation list name" says. The difference is that the "simple" message does not provide detailed information even to those who would understand it, so everyone would have to shorten the string by trial and error until they find a length that works.

Notes from June 12 meeting:

  • The slightly preferred behavior would be to enforce the byte limit while the user is typing the title, so they know right away what they need to fix, rather than thinking it is okay and then trying to save. This is only a slight preference, though, and I do not think either behavior is a major imposition.
  • So we can keep current behavior (text that tells someone to change the number of bytes after saving) - no changes to current implementation needed

Change #1145382 merged by jenkins-bot:

[mediawiki/extensions/CampaignEvents@master] Add the maxlengh validation for invitation list name

https://gerrit.wikimedia.org/r/1145382

✅ User should see an error message when invitation list cannot be generated due to lengthy invitation list title. The error message can read: "Add a shorter invitation list name (maximum 255 bytes)."

Screenshot 2025-06-17 at 11.32.05 AM.png (1,974×1,116 px, 164 KB)

✅ Note that, as decided in T393810#10910190, when entering text in the field you will be able to enter up to 255 "characters", not bytes; if you enter more than 255 bytes, you will still see an error upon submission


Screenshot 2025-06-17 at 11.20.41 AM.png (1,986×1,064 px, 197 KB)

Screenshot 2025-06-17 at 11.29.36 AM.png (1,922×596 px, 103 KB)


Titles are showing correctly now in Special:MyInvitationLists. Also, titles that are longer than 255 bytes are not allowed, and correctly display an error if over 255 bytes. Marking this as done / resolved.