Identify accounts with very high login rate
Open, Needs Triage, Public

Description

Per T256532, we should identify these accounts so that we can contact the owners/maintainers of the individual bots, or of the underlying software, and get the issue fixed.

In most cases, all these logins aren't needed; just follow the guidance at https://www.mediawiki.org/wiki/API:Login#Additional_notes:

If you are sending a request that should be made by a logged-in user, add the assert=user parameter to the request in order to check whether the user is logged in. If the user is not logged in, an assertuserfailed error code will be returned.
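
A minimal sketch of that check in Python, using only the standard library. The helper names are hypothetical, and the error shape (an "error" object with a "code" field) assumes the request is made with format=json.

```python
import json
from urllib.parse import urlencode


def with_login_assertion(params):
    """Return a copy of params that asks the API to verify the login."""
    merged = dict(params)
    merged["assert"] = "user"
    merged.setdefault("format", "json")
    return merged


def needs_relogin(response_body):
    """True if the API answered with the assertuserfailed error code."""
    data = json.loads(response_body)
    return data.get("error", {}).get("code") == "assertuserfailed"


# Query string for a request that should only succeed when logged in.
print(urlencode(with_login_assertion({"action": "query", "meta": "userinfo"})))
```

A bot would only need to log in again when needs_relogin() returns True, rather than before every request.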

Event Timeline

Reedy created this task. Sat, Jun 27, 1:21 PM
Reedy added a comment. Sat, Jun 27, 1:26 PM

Thanks!

I guess this might be a bit of whack-a-mole for a while, but it's a worthwhile investment of time, and it helps the bot creators/runners too.

Created this bucket:

It has the top twenty in the past two days. 114K times in two days by one user. Poetry.

Text version for easy copy paste:

JarBot                114,698
ListeriaBot           70,143
Mr.Ibrahembot         53,443
SchlurcherBot         36,728
Hexabot               34,736
Scidudebot            22,998
FaFlo                 16,436
WP 1.0 bot            13,605
EmausBot              7,084
Matthobot             6,846
CommonsDelinker       6,763
অভ্যর্থনা কমিটি বট    5,360
HitomiAkane           3,731
DeltaQuadBot          2,965
AlaaBot               2,667
Antigng-bot           2,632
FlickreviewR 2        2,616
EntretenimientatoBOT  2,492
Luke081515Bot         2,327
Perfect               1,918
Reedy renamed this task from Identify bots with very high login rate to Identify accounts with very high login rate. Sat, Jun 27, 2:29 PM
Reedy updated the task description.

IMHO, anything over 100 in a 48H period (that's just over 2 per hour) is probably doing something wrong

Thanks. I'll review AlaaBot; it really is weird for it to make 2,667 logins!

Reedy added a comment. Sat, Jun 27, 3:14 PM

Thanks!

2,667/48 is about 55 per hour... so almost one a minute. So something doesn't feel quite right :)

Antigng added a subscriber: Antigng. Edited, Sat, Jun 27, 4:18 PM

Thanks for the notification. The problem is that I'm not running a single task but ten individual tasks. To minimize common-mode failures, they were designed to run as individual processes, each able to do its own task by itself, from login to querying to editing, without relying on sessions/tokens from other processes. On finishing, they log out and exit. A batch file runs all of them every 10 minutes.

Reusing or sharing sessions between these tasks would require, at least, communication between processes, which is completely beyond my scope, or, at worst, code refactoring.

So at the moment, the only thing I can do is reduce the frequency at which each of them is run, and my bottom line is once every 30 minutes. Otherwise there'll be service degradation, which is not acceptable. This will lead to a login every 30/10 = 3 minutes, or 960 per 48 hours. That's it; I cannot go any further than that. Also, I don't think such a login rate could pose a threat to your server.
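
The schedule arithmetic above can be checked with a short Python sketch. It assumes each of the ten tasks logs in exactly once per run; the function name is made up for illustration.

```python
def logins_per_window(tasks, run_interval_min, window_hours=48):
    """Logins in the window if every task logs in once per run."""
    runs_per_task = window_hours * 60 // run_interval_min
    return tasks * runs_per_task


print(logins_per_window(10, 10))  # every 10 minutes: 2880
print(logins_per_window(10, 30))  # every 30 minutes: 960
```

The 10-minute figure is in the same range as the 2,632 logins counted for Antigng-bot in the list above.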

Reedy added a comment. Sat, Jun 27, 4:31 PM

Hundreds or thousands of logins a day is generally a sign of bigger issues.

You don't necessarily need to share sessions between different processes/tasks. You should be able to persist login sessions across different runs of the same tasks though, which may require some work.
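
One way to persist a login session across runs of the same task is to keep the session cookies on disk. The sketch below uses only Python's standard library; the function names and the cookie file path are assumptions, and it is not how any particular bot framework does it.

```python
import http.cookiejar
import os
import urllib.request


def open_persistent_session(cookie_path):
    """Build a URL opener whose cookies are restored from cookie_path, if present."""
    jar = http.cookiejar.LWPCookieJar(cookie_path)
    if os.path.exists(cookie_path):
        # ignore_discard=True also restores session cookies that have no expiry.
        jar.load(ignore_discard=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def save_session(jar):
    """Write the current cookies to disk so the next run can reuse them."""
    jar.save(ignore_discard=True)
```

Each task could use its own cookie file, so the processes stay independent of each other while still avoiding a fresh login on every run. The cookie file holds session credentials and should be readable only by the bot's user.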

If they're long-running processes, they don't need to log in repeatedly, just at the start, and can use the assert method (and error handling) to check they're still logged in.
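
A rough Python sketch of that pattern: log in once, send assert=user with every request, and log in again only when the API reports assertuserfailed. The class and the two callables passed to it are hypothetical placeholders for whatever HTTP code a bot already has.

```python
class BotSession:
    """Logs in once, then again only when the API says the session is gone."""

    def __init__(self, login, send):
        # login(): performs the login request.
        # send(params): performs one API request and returns the decoded JSON.
        # Both are supplied by the caller; they are not part of any real library.
        self._login = login
        self._send = send
        self._logged_in = False
        self.login_count = 0

    def _ensure_login(self):
        if not self._logged_in:
            self._login()
            self._logged_in = True
            self.login_count += 1

    def request(self, params):
        self._ensure_login()
        params = {**params, "assert": "user"}
        result = self._send(params)
        if result.get("error", {}).get("code") == "assertuserfailed":
            # Session expired: log in again and retry exactly once.
            self._logged_in = False
            self._ensure_login()
            result = self._send(params)
        return result
```

With this shape, the number of logins tracks the number of session expiries rather than the number of requests.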

You don't necessarily need to reduce the run rate; your count isn't particularly high. JarBot logging in 114K times in 48 hours is a bigger example: nearly 2.4K an hour, which is 40 per minute, basically one every 1.5 seconds... There is absolutely no reason a bot needs to log in that frequently.
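
For reference, the JarBot arithmetic works out as follows, using the 114,698 logins from the 48-hour list above. The function name is illustrative only.

```python
def login_rates(total, window_hours=48):
    """Return (logins per hour, logins per minute, seconds between logins)."""
    per_hour = total / window_hours
    per_minute = per_hour / 60
    seconds_between = 60 / per_minute
    return per_hour, per_minute, seconds_between


per_hour, per_minute, gap = login_rates(114_698)
# prints: 2390/hour, 39.8/minute, one every 1.51 s
print(f"{per_hour:.0f}/hour, {per_minute:.1f}/minute, one every {gap:.2f} s")
```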

It's not necessarily threats to the server, but more that it should be unnecessary to login that frequently. How often do you get logged out of your browser session, for example?

Jar added a subscriber: Jar. Sat, Jun 27, 4:33 PM

Hi, I will fix my bot, Mr.Ibrahembot.

"Per T256532"

so what's the actual problem now here? I cannot view this task.

For a few months now, every time I start a Pywikibot script, it starts with

WARNING: No user is logged in on site wikipedia:hu
Logging in to wikipedia:hu as <account>

I don’t appear on this list, as I only start bots manually, so I make no more than a few dozen logins a day, but scheduled bots can reach quite large numbers this way.

I thought I would mention that there is now an upstream request for mwclient:
https://github.com/mwclient/mwclient/issues/256

MBH added a subscriber: MBH. Sun, Jun 28, 11:35 AM
Jar added a comment. Sun, Jun 28, 2:43 PM

Can you check the last 12 hours again? I fixed some bugs and want to know whether they were related to this issue. Thank you.

I just checked and it got much better, but it still logs in 10 times a minute, and most of the logins are still on Arabic Wikipedia.

Jar added a comment. Mon, Jun 29, 1:06 AM

Good to hear that. I fixed more bugs. If you could post JarBot's new login count every 12 hours for the next couple of days, so we can see whether the issue persists, I would be thankful.

jijiki added a subscriber: jijiki. Mon, Jun 29, 12:04 PM

Last 12 hours: 1,206 logins (~1.67 per minute, almost 1/60th of the original number). Thanks!

Reedy moved this task from Incoming to Watching on the Security-Team board. Mon, Jun 29, 3:26 PM
FaFlo added a subscriber: FaFlo. Mon, Jun 29, 3:29 PM

I'm using this to log in with a bot password for the wikiwho API. Logins should not happen THAT often. Is data available on if/when the login rate for my account drastically increased? Will try to fix.

Does this help?

This covers the past seven days, and each bar is a three-hour timespan. I see a noticeable drop on 25 June, from 20,000 logins every three hours to 600 per three hours. It still logged in 10,000 times in the past 48 hours.

Jar added a comment. Wed, Jul 1, 2:22 PM

Hello Ladsgroup, does it get lower than that?

YES, last 12 hours only 15. THANK YOU.

Hi, SchlurcherBot should also be fixed. I switched approx. 2 days ago to OAuth verification for the majority of my tasks that need login. Schlurcher

Yes, it went from 200/hour to 10/hour. Thanks!

The average number of logins (everything, everywhere) over the past seven days went from 20K per hour to 7K per hour. The new list for the past 48 hours:

ListeriaBot               73,771
Mr.Ibrahembot             58,679
AGbot                     13,813
WP 1.0 bot                13,585
CommonsDelinker           9,232
EmausBot                  8,748
Tylernub                  4,218
AlaaBot                   4,096
Tylerfed                  3,787
TylerTok                  2,937
FlickreviewR 2            2,616
Luke081515Bot             2,313
Entretenimientato         2,244
EntretenimientoBorrarBOT  2,153
Perfect                   1,915
DeltaQuadBot              1,830
YouTubeReviewBot          1,805
Antigng-bot               1,791
AlbeROBOT                 1,651
Tylernok                  1,431

If we can get the top four below 5K per day, I think the first phase can be called done.
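
As a quick check of that threshold, here is a small Python sketch that converts the 48-hour counts from the list above into per-day figures and flags the accounts over 5K per day. Only the six largest entries are copied in, and the function name is made up.

```python
# Login counts over 48 hours, copied from the list above (top six only).
LOGINS_48H = {
    "ListeriaBot": 73771,
    "Mr.Ibrahembot": 58679,
    "AGbot": 13813,
    "WP 1.0 bot": 13585,
    "CommonsDelinker": 9232,
    "EmausBot": 8748,
}


def accounts_over(counts, window_hours, per_day_limit):
    """Return (account, logins per day) pairs above the limit, highest first."""
    days = window_hours / 24
    over = [(name, total / days) for name, total in counts.items()
            if total / days > per_day_limit]
    return sorted(over, key=lambda pair: pair[1], reverse=True)


for name, per_day in accounts_over(LOGINS_48H, 48, 5000):
    print(f"{name}: {per_day:,.1f} per day")
```

On these numbers, the four flagged accounts are ListeriaBot, Mr.Ibrahembot, AGbot and WP 1.0 bot, which matches "the top four".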

Reedy added a comment. Thu, Jul 2, 10:25 PM

Only AGbot in that list is now above 5K a day.

"Per T256532"

so what's the actual problem now here? I cannot view this task.

Can somebody give me an answer concerning this question?

Huji added a comment. Tue, Jul 7, 3:03 PM

That task is private for security reasons. Suffice to say that there is a potential for abuse that we are trying to prevent.