[M] Limit the number of characters a user can add when reporting
Closed, Resolved (Public)

Assigned To
Authored By
JKieserman
Jun 12 2023, 4:10 PM

Description

As a user, I should not be able to add more than 500 code points to the "Additional details" and "Something else" textboxes within the data collection component.

Acceptance criteria:

  • Show the remaining code points as a user types once fewer than 99 code points remain
  • Prevent the user from adding more text if they have reached the limit
  • Truncate the text for these textboxes in the API handler to prevent manual API calls sending too much text
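
As a rough illustration of the third criterion, here is a minimal JavaScript sketch of truncating on code point boundaries. The function and constant names are hypothetical, and the real API handler would be server-side code, so treat this only as a picture of the logic: cut at the limit without splitting a surrogate pair.

```javascript
// Hypothetical names; the limit of 500 comes from the task description.
const CODE_POINT_LIMIT = 500;

// Truncate on code point boundaries so that a character stored as a
// surrogate pair (for example an emoji) is never cut in half.
function truncateToCodePoints(text, limit = CODE_POINT_LIMIT) {
  // Array.from iterates a string by code point, not by UTF-16 code unit.
  const codePoints = Array.from(text);
  if (codePoints.length <= limit) {
    return text;
  }
  return codePoints.slice(0, limit).join('');
}
```

Slicing with `text.slice(0, limit)` instead would count UTF-16 code units and could leave half of an emoji at the end of the stored text.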

Mockups
The design has changed since these were made

Step 2 (Word count).png (812×375 px, 56 KB)

Step 2 (Word count exceeded).png (812×375 px, 57 KB)

Event Timeline

This is blocked on three additional pieces of information:

  1. Are we limiting by characters or words?
  2. What is the limit?
  3. What does the error state look like?

Assuming this task is about the Incident-Reporting-System code project, hence adding that project tag so other people who don't know or don't care about team tags can also find this task when searching per codebase. Please set appropriate project tags when possible. Thanks! :)

I made this ticket T344639 yesterday on this very subject and assigned it to @eigyan

To answer Julia's questions

  1. We're limiting words not characters so that we're more inclusive of long-text languages
  2. 200 words is the limit, as recommended by T*S
  3. Error state mockup is in the ticket I linked above

cc @kostajh

The main problem with limiting by word count is that we don't have off-the-shelf tooling in the MediaWiki ecosystem to support doing this. My understanding is that building an accurate word counter across the 300+ languages we support is not straightforward. (It's not easy for English either, where things get weird if e.g. someone adds an emoji or puts spaces around a punctuation mark.)
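
To make the difficulty concrete, here is a small sketch of a naive whitespace-based word counter. This is not anything that exists in MediaWiki; `naiveWordCount` is a made-up name, and the sample strings are only illustrations.

```javascript
// Naive word counter: splits on runs of whitespace. Illustrative only.
function naiveWordCount(text) {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

naiveWordCount('This is a short report'); // 5
naiveWordCount('这是一个简短的报告');        // 1, because the text has no spaces between words
naiveWordCount('Stop it!');               // 2
naiveWordCount('Stop it !');              // 3, the detached "!" is counted as a word
```

A 200-word limit enforced this way would be effectively unlimited for languages written without spaces, which is the kind of inconsistency described above.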

My suggestions are:

  1. Come back to a word count implementation as a follow-up, after we reach the T337566: [EPIC]: Incident Reporting System - Minimal Testable Product (MTP) milestone, and most likely as a cross-team engineering effort if we think that a JavaScript word counter is going to be useful in other contexts and features
  2. Use a character limit, or a byte limit, as @Mooeypoo mentioned here which is what VisualEditor is doing.
  3. Make the character/byte limit configurable by wiki, so that languages with longer words can have a higher character/byte limit, if needed
    A. ... but set a relatively high character/byte limit for all wikis, to simplify deployment, and if we see that people are sending in long essays, we can lower the limit on those wikis.
    B. Since a byte limit is not exactly the same as a character count: to avoid confusing users, we could avoid showing the byte limit indicator until the user reaches something like 90% of the threshold. That way, we'd avoid having users think about why the limit maybe doesn't match their expectations based on what combination of language characters, emojis and punctuation they are using, while also providing feedback to users who are getting close to the limit.

Re "3. Error state mockup is in the ticket I linked above": I'll link it in this task's description, so it's easy to reference.
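
The per-wiki limit and the "show the indicator at roughly 90%" idea suggested above could be sketched like this. Everything here is an assumption made for illustration: the wiki IDs, the numeric limits, and the function names are invented, and a real implementation would read the limit from wiki configuration rather than from a hard-coded map.

```javascript
// Hypothetical values: a generous default plus an optional per-wiki override.
const DEFAULT_BYTE_LIMIT = 2000;
const LIMIT_OVERRIDES = new Map([
  ['examplewiki', 3000] // made-up wiki ID and limit
]);

function byteLimitFor(wikiId) {
  return LIMIT_OVERRIDES.has(wikiId) ? LIMIT_OVERRIDES.get(wikiId) : DEFAULT_BYTE_LIMIT;
}

// Report how much of the limit is used, and whether the indicator should
// be visible yet. The indicator stays hidden until the ratio is reached.
function indicatorState(text, wikiId, showAtRatio = 0.9) {
  const limit = byteLimitFor(wikiId);
  const used = new TextEncoder().encode(text).length; // UTF-8 byte length
  return { used, limit, visible: used >= limit * showAtRatio };
}
```

With these made-up numbers, a 1700-byte report on a wiki without an override would not show the indicator, while a 1900-byte one would.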

I don't fully understand what a user will see if we do a byte limit. Otherwise, Kosta's suggestions sound good to me.

Yeah, you're not alone; byte limit is probably the most consistent but also the most confusing to users, because no one outside technical folks actually understands what a byte is and understands why it might show weirdly-jumping numbers when certain characters are inserted. It's not super user-friendly terminology.... but it's probably the most consistent counting, which might make it the least non-user-friendly of all the options.. (if that makes sense)

(Also, as some people pointed out in my thread, JS isn't quite using bytes, it's using code points, which is different, but is also mostly semantics)
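
A quick sketch of the different ways of measuring text that this discussion touches on. The `measure` helper is made up for illustration; the underlying calls are standard JavaScript, where `String.prototype.length` counts UTF-16 code units, which is a third measure distinct from both bytes and code points.

```javascript
// Measure the same text three ways.
function measure(text) {
  return {
    utf16Units: text.length,                        // what String.prototype.length reports
    codePoints: Array.from(text).length,            // string iteration is by code point
    utf8Bytes: new TextEncoder().encode(text).length
  };
}

measure('a');         // { utf16Units: 1, codePoints: 1, utf8Bytes: 1 }
measure('\u00e9');    // precomposed "é": { utf16Units: 1, codePoints: 1, utf8Bytes: 2 }
measure('e\u0301');   // "e" + combining accent: { utf16Units: 2, codePoints: 2, utf8Bytes: 3 }
measure('\u{1F600}'); // emoji: { utf16Units: 2, codePoints: 1, utf8Bytes: 4 }
```

The two spellings of "é" look identical on screen but count differently, which is one reason a visible counter can appear to jump unexpectedly.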

There are pros and cons to using each one of the strategies, and none of these is consistent online. In fact, a lot of counters are inconsistent between platforms (Example: Mastodon web counts differently than the official Mastodon Android App. Ha. Helpful...)

VisualEditor does exactly what @kostajh is saying in 3.B; it only shows the remaining count once you're within 99 of the limit. You basically skip the problem for most of your users, and those who DO encounter it at least have a bit more understanding that this indicates some sort of limit they're approaching.

There's no perfect solution here because we're talking about ~400 languages. I think your best chance is to give this a try (your plan sounds perfect, and is mostly what we do in other products) and then... check! see if you have specific feedback from users in certain languages that may inform what you want to adjust.

It all really depends on the expectations of use cases, and this is one of those unknown-unknowns that the only way to deal with is to start somewhere, and then revalidate.

Changing the algorithm of how you count/limit is not terribly difficult, so whatever you choose is probably low cost / low risk anyways.
Just make sure that your storage mechanism allows for some flexibility with the limits.

@JSengupta-WMF @Madalina:

This is what VisualEditor's max limit looks like when adding a lot of text into the edit summary:

max limit.gif (439×551 px, 274 KB)

Note that the counter doesn't appear in the dialog until we're approaching the limit.

The limit itself is controlled by wgCommentCodePointLimit, which in turn is based on CommentStoreBase::COMMENT_CHARACTER_LIMIT and that is set to 500:

/**
 * Maximum length of a comment in UTF-8 characters. Longer comments will be truncated.
 * @note This must be at least 255 and not greater than floor( MAX_DATA_LENGTH / 4 ).
 */
public const COMMENT_CHARACTER_LIMIT = 500;

Aside from VisualEditor edit summary, the wgCommentCodePointLimit is used in a couple of other places in MediaWiki:

So, my proposal for the MTP deliverable of the incident reporting system dialog is to follow the existing MediaWiki convention: use wgCommentCodePointLimit as the limit, measure the input with mediawiki.String.js's codePointLength() method, and display a counter if there are fewer than 99 characters remaining.
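
A minimal sketch of the counter logic in this proposal, written so that it runs outside MediaWiki. The local `codePointLength` below is a stand-in for the mediawiki.String method of the same name, the constants are hard-coded here rather than read from wgCommentCodePointLimit, and `counterState` is a hypothetical helper name.

```javascript
const CODE_POINT_LIMIT = 500;  // value of COMMENT_CHARACTER_LIMIT quoted above
const SHOW_COUNTER_BELOW = 99; // threshold from the proposal

// Stand-in for codePointLength() from mediawiki.String.
function codePointLength(str) {
  return Array.from(str).length;
}

// Compute how many code points remain and whether to show the counter.
function counterState(text) {
  const remaining = CODE_POINT_LIMIT - codePointLength(text);
  return { remaining, showCounter: remaining < SHOW_COUNTER_BELOW };
}

counterState('a'.repeat(401)); // { remaining: 99, showCounter: false }
counterState('a'.repeat(402)); // { remaining: 98, showCounter: true }
```

Inside MediaWiki, the text field's input handler would call something like this on each change and update the counter element accordingly.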

Thanks for flagging this @kostajh. We are in the process of reviewing some of the Codex components and text input is one of those. I will bring it up in the review. Ideally the small handle at the bottom right should allow users to resize the input field but on mobile it gets particularly tricky.

So there are multiple ways to solve this as I was told by DS team. Please see this Codex demo for text area.

I like the mediawiki example you have posted here @kostajh. Let's go with it.

Should this character limit also apply to the "Something else" textbox? Should the limit for this textbox be smaller than the additional details textbox?

@Dreamy_Jazz where does the something else textbox appear?

Shown in this screenshot underneath the "What type of behavior?" options. It allows the user to specify what the something else is.

image.png (1×890 px, 82 KB)

A video for how I got to this textbox:

Got it. Yes, let's keep the character limit the same for that field too. Unless @Madalina thinks otherwise.

Dreamy_Jazz renamed this task from Limit the number of characters a user can add when reporting to [M] Limit the number of characters a user can add when reporting. Oct 30 2023, 4:38 PM

Change 970390 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/ReportIncident@master] [WIP] Enforce a character count limit on textarea fields

https://gerrit.wikimedia.org/r/970390

Dreamy_Jazz updated the task description.
Dreamy_Jazz updated the task description.

Change 970390 merged by jenkins-bot:

[mediawiki/extensions/ReportIncident@master] Enforce a character count limit on textarea fields

https://gerrit.wikimedia.org/r/970390

Test wiki created on Patch demo by DJacksonA using patch(es) linked to this task:
https://patchdemo.wmflabs.org/wikis/d2cf105f3d/w

Test wiki on Patch demo by DJacksonA using patch(es) linked to this task was deleted:

https://patchdemo.wmflabs.org/wikis/d2cf105f3d/w/