
Ratelimit : improve handling of upload ratelimits via local JS hack
Open, High, Public, Feature

Description

Diagnosis

  • Identifying the issue: a Commons API ratelimit rule applies to new users, who are limited to 380 uploads per 72 minutes.
  • Documenting the issue: see LinguaLibre:Ratelimit = LinguaLibre:User_rights
  • Documenting the workaround: same page.

Human solution (not sustainable)

  • [ongoing] Mentor emerging communities better. Warn them about the 380-upload limit and guide them to request higher rights on Commons.

Local solution

  • Develop a ribbon warning, displayed if the user's upload ratelimit is <= 380.
  • [optional] Develop error message handling.

Input

  • Commons user rights and edit count

See also

  • Described here. Here is the error message that Luilui6666 got during the upload step:
[RequestQueue] Reject ratelimited
Object:
 *: "See https://commons.wikimedia.org/w/api.php for API usage. Subscrib..."
 code:"ratelimited"
 info:"You've exceeded your rate limit. Please wait some time and try again...

Twin ticket / long term solution

  • T276992 - Explore possibility for our upload flow OAUTH to have higher ratelimits on Commons.

Event Timeline

Yug claimed this task.

It is not solved because even if we understand the origin of the problem, we still need to "think about a workaround or at least a clearer error message".


What has to be done now is

  • Display a clearer error message when the rate limit is reached

This message should explain what the problem is and what to do (wait until the user has uploaded 500 files, or until 1 month after account creation).
The current message

Object:
*: "See https://commons.wikimedia.org/w/api.php for API usage. Subscrib..."
code:"ratelimited"
info:"You've exceeded your rate limit. Please wait some time and try again...

is not really clear.
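
A minimal sketch of how the raw API error could be turned into a clearer message. The function name describeUploadError and the wording of the message are hypothetical; only the "ratelimited" code comes from the error above, and the default values (380 uploads per 4320 seconds, i.e. 72 minutes) come from the task description.

```javascript
// Hypothetical helper: turns a raw API error object into a readable message.
// Only the "ratelimited" code is taken from the error shown above; the
// wording and the default limit values are illustrative assumptions.
function describeUploadError(error, limit) {
  const hits = (limit && limit.hits) || 380;
  const seconds = (limit && limit.seconds) || 4320;
  if (error && error.code === 'ratelimited') {
    const minutes = Math.round(seconds / 60);
    return 'Wikimedia Commons limits your account to ' + hits +
      ' uploads per ' + minutes + ' minutes. ' +
      'Wait before retrying, or request higher rights on Commons.';
  }
  // Fall back to whatever the API said, so nothing is hidden.
  return (error && error.info) || 'Unknown upload error.';
}
```

Passing the limit as a parameter rather than hardcoding it keeps the message correct if Commons ever changes the values.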

Yug renamed this task from "Reject ratelimited" to "Uploads' ratelimit failures to manage better". Feb 17 2021, 6:42 PM

I dug into the RecordWizard's code to see where a "better" message display could be handled. This is a somewhat in-depth code analysis, if you're only interested in what could be done to solve the issue, skip to the TL;DR at the bottom of the post.

Turns out:

  • Errors are handled in the rw.store.record.js file - in the requestError() function.
  • It's actually slightly more complicated than just adding an if-check.

The way errors are handled right now (setError()) is kind of unclear to me. The error message from a failed request is actually passed along in the code, and it should be displayed somewhere. Yet the only thing we have is this:

image.png (611×938 px, 38 KB)

However, this line sets the "error name" in a browser property:

Vue.set( this.data.errors, word, error ); // error is the "error text"
> mediaWiki.recordWizard.store.record.data.errors['comparaison']
"badtoken"

If you're not familiar with Vue.set() (I am learning about it too!), you can read this blog post.

This reveals that the errors are not actually displayed. Sure, the code shows the user which recordings have failed, but it does not disclose why. And that, in my opinion, might need some thinking as to what we should do. Some errors might not be "worth it" to display, while some (such as the rate limit one) definitely should be.


TL;DR

In its current form, the code catches the errors (ratelimit, upload errors, bad token...) and shows the impacted recording(s) as "errored". However, it does not disclose the cause of the errors.

In the context of this task, we must define how and which errors should be displayed. While some can remain "silent" (as in what's already happening with the code), some (such as the ratelimit) should be displayed to the user.

How errors should be displayed

The most obvious way would be to create a banner on the RecordWizard (please forgive my horrible UX/UI and graphical design skills):

image.png (451×791 px, 34 KB)

But other means should be explored as well.

Which errors should be displayed

  • Rate limited
  • Any others ?
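
One way to keep that decision in a single place is an allow-list of error codes. The sketch below is only illustrative: the names DISPLAYED_ERROR_CODES and errorsToDisplay are hypothetical, and it assumes the errors object maps each word to an error code, as the store.record.data.errors example above suggests.

```javascript
// Error codes that deserve a visible banner. Only "ratelimited" is agreed on
// in this thread; extend the set once other codes are decided.
const DISPLAYED_ERROR_CODES = new Set(['ratelimited']);

// `errors` mirrors the assumed shape of store.record.data.errors:
// { word: errorCode }. Returns the distinct codes that should be shown.
function errorsToDisplay(errors) {
  const codes = new Set();
  for (const code of Object.values(errors)) {
    if (DISPLAYED_ERROR_CODES.has(code)) {
      codes.add(code);
    }
  }
  return Array.from(codes);
}
```

Codes outside the set stay "silent", which matches the current behaviour for everything except the rate limit.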

As an additional thing to do (which might have to be considered as a separate task), the RecordWizard should abort uploading all pending recordings if it ever receives a "ratelimited" error. This would save both the user's browser and Commons from uselessly handling requests.
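
The abort behaviour could look roughly like the sketch below. It does not use the RecordWizard's actual request queue, whose API is not described in this task: uploadUntilRateLimited and uploadOne are hypothetical names, and uploadOne is assumed to be an async function that rejects with an object carrying a code property, like the API error quoted earlier.

```javascript
// Hypothetical sketch: uploads records one at a time and stops at the first
// "ratelimited" error. `uploadOne` is an assumed async function that rejects
// with an object carrying a `code` property.
async function uploadUntilRateLimited(records, uploadOne) {
  const result = { done: [], failed: [], skipped: [] };
  for (let i = 0; i < records.length; i++) {
    try {
      await uploadOne(records[i]);
      result.done.push(records[i]);
    } catch (error) {
      result.failed.push(records[i]);
      if (error && error.code === 'ratelimited') {
        // Every later request would be rejected too: leave them pending.
        result.skipped = records.slice(i + 1);
        break;
      }
    }
  }
  return result;
}
```

Other errors only mark the current record as failed and let the loop continue, so a single bad file does not block the rest of the batch.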

Poslovitch renamed this task from "Uploads' ratelimit failures to manage better" to "Improve handling of upload ratelimits". Feb 22 2021, 3:06 PM

I'm wondering: wouldn't there be a way to check the user's roles on Commons and show them a warning upfront (even before they start recording), stating that they will be limited to 380 words?

If such a possibility exists, it should go further. For example, the Record Wizard should cap the number of recordings so that the user cannot record more words than the limit allows.

Commons user-rights assessment: that is exactly what I have been doing by hand so far. See:

Activity assessment: the Commons "editcount" value can be misleading when judging whether a user is active on LL. To assess Lili's experience, query the API for Lili after changing the base URL to the relevant wiki:

API: Commons' MediaWiki API can give you nearly any information you want from a wiki, including user rights.
Format: You can replace format=json with one of the following values: json, jsonfm, none, php, phpfm, rawfm, xml, xmlfm.

Implementation: JS within record wizard ? Other ?

See also :

Good news! It's possible. We can query the userrights and ratelimits applying to the user that made the request.

I crafted a quick request. We should expand upon it, but it shows which results we can use:
https://commons.wikimedia.org/w/api.php?action=query&format=json&meta=userinfo&uiprop=groups%7Cratelimits

In my case, this is an excerpt of what the query returns:

{
  "batchcomplete": "",
  "query": {
    "userinfo": {
      "id": 5623759,
      "name": "Poslovitch",
      "groups": [
        "autopatrolled",
        "*",
        "user",
        "autoconfirmed"
      ],
      "ratelimits": {
        "edit": {
          "user": {
            "hits": 900,
            "seconds": 180
          },
          "autopatrolled": {
            "hits": 10500,
            "seconds": 180
          }
        },
        "upload": {
          "user": {
            "hits": 380,
            "seconds": 4320
          },
          "autopatrolled": {
            "hits": 999,
            "seconds": 1
          }
        },
        "wikibase-idgenerator": {
          "user": {
            "hits": 900,
            "seconds": 180
          },
          "autopatrolled": {
            "hits": 10500,
            "seconds": 180
          }
        }
      }
    }
  }
}

Since the user's browser makes the query, it should return the info about the currently logged-in user, which is what we want.
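
A minimal sketch of the two pure pieces around that request: building the URL and reading the fields we care about from the parsed response. The function names buildUserinfoUrl and readUserinfo are hypothetical, and the network call itself (and how the browser authenticates it) is deliberately left out.

```javascript
// Builds the same request as the URL above. URLSearchParams encodes the
// "|" separator as %7C.
function buildUserinfoUrl(apiBase) {
  const params = new URLSearchParams({
    action: 'query',
    format: 'json',
    meta: 'userinfo',
    uiprop: 'groups|ratelimits'
  });
  return apiBase + '?' + params.toString();
}

// Pulls out the fields the wizard needs from a parsed response.
// Missing fields become empty values instead of throwing.
function readUserinfo(response) {
  const info = (response && response.query && response.query.userinfo) || {};
  return {
    name: info.name || null,
    groups: info.groups || [],
    uploadLimits: (info.ratelimits && info.ratelimits.upload) || {}
  };
}
```

Returning empty defaults means a malformed or anonymous response simply yields no known limits, which the caller can treat as "unknown" rather than crash on.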

So, to put things in a nutshell, we should do the following at the "speaker" step:

  1. Query https://commons.wikimedia.org/w/api.php?action=query&format=json&meta=userinfo&uiprop=groups%7Cratelimits to get the user's groups and ratelimits.
  2. If the user is not autopatrolled, then get the values at query.userinfo.ratelimits.upload.user (hits and seconds): they may change in the future, and reading them from the API makes the code more robust in the long term.
  3. Query LL's Blazegraph to return the number of recordings made by the user since the timestamp (current date - seconds).
  4. If the limit has been reached, prevent the user from recording any words (i.e. do not let them go to the "details"/wordlist step).
  5. If the limit has not been reached, calculate how many recordings the user can still make, and prevent the user from adding words once they reach that limit.

And, IMHO, if the user is not autopatrolled, we should show a banner explaining the stuff about the ratelimit and the autopatrolled status.
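
The arithmetic behind steps 3 to 5 can be sketched as follows. Both function names are hypothetical, and recentUploads is assumed to be the count returned by the Blazegraph query of step 3, which is not shown here.

```javascript
// Step 3: start of the sliding window, i.e. "now" minus the limit's duration.
function windowStart(now, seconds) {
  return new Date(now.getTime() - seconds * 1000);
}

// Steps 4 and 5: how many more recordings fit in the current window.
// `limit` has the { hits, seconds } shape returned by the API;
// `recentUploads` is assumed to come from the Blazegraph query of step 3.
function remainingUploads(limit, recentUploads) {
  return Math.max(0, limit.hits - recentUploads);
}
```

A result of 0 corresponds to step 4 (block the wordlist step), and any positive value is the cap to enforce in step 5.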

@Poslovitch, please check the comment above (I have no way of knowing whether you have seen it or missed it). Also, in rare cases, I have seen users with other rights, such as image-reviewer, which also grant the 999/sec limit without autopatrolled. See here:

		'upload' => [
			// 380 uploads per 72 minutes
			'user' => [ 380, 4320 ],
			// Effectively no upload rate limit for members of these groups
			'image-reviewer' => [ 999, 1 ],
			'patroller' => [ 999, 1 ],
			'autopatrolled' => [ 999, 1 ],
		],
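
Because several groups can lift the limit, one option is to avoid testing for a specific group name and instead pick the most permissive entry found under ratelimits.upload. This is a sketch under two assumptions: that the API only lists the entries applying to the current user (as in the excerpt above), and that the most permissive entry is the one that effectively applies. The function name effectiveUploadLimit is hypothetical.

```javascript
// Picks the most permissive entry of ratelimits.upload, whichever group
// grants it. "Most permissive" is taken here as the highest hits-per-second
// ratio, which is an assumption about how the limits combine.
function effectiveUploadLimit(uploadLimits) {
  let best = null;
  for (const [group, limit] of Object.entries(uploadLimits)) {
    const rate = limit.hits / limit.seconds;
    if (best === null || rate > best.hits / best.seconds) {
      best = { group: group, hits: limit.hits, seconds: limit.seconds };
    }
  }
  return best; // null when no upload limit is listed
}
```

With this approach a user who is image-reviewer or patroller but not autopatrolled is handled without any extra case in the code.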

Also, it depends on how the record wizard sends audio files (from the server and lingualibre.org, or from the client's IP address?), but I bumped into this: T60224. Not sure it's the way out, but maybe this project knows things.

Poslovitch changed the subtype of this task from "Task" to "Feature Request". Feb 22 2021, 7:01 PM

Also, it depends on how the record wizard sends audio files (from the server and lingualibre.org, or from the client's IP address?),

Afaik, it's LL that sends the files, but with the OAuth token of the user, thus identifying the "uploader" as the user - that's why rate limits are working on a per-user basis and they do not cause "Lingua Libre-wide bans".

By the way, @Yug, do you plan to dev that task (since you're assigned to it)? If that's not the case, please unassign yourself, so I (or maybe other devs) can see this task is "free" and start working on it at some point.

Yug removed Yug as the assignee of this task. Feb 22 2021, 7:37 PM
Yug renamed this task from "Improve handling of upload ratelimits" to "Ratelimit : improve handling of upload ratelimits via local JS hack". Mar 9 2021, 11:08 PM
Yug updated the task description.
Yug triaged this task as High priority. Jul 6 2022, 10:44 AM