- ST47
- User since: Apr 30 2019, 3:34 AM (203 w, 4 d)
- LDAP User, MediaWiki User
Feb 22 2023
Jan 22 2023
@Zabe, were you able to check the log trace associated with the request quoted in the task description? I find it hard to believe that we've been unable to undelete pages with more than a thousand or so revisions since at least June 2018, and instead it seems likely that a recent change made the operation significantly more expensive.
Jan 20 2023
(For what it's worth, even loading https://en.wikipedia.org/wiki/Special:Undelete/Miss_World_2022 is quite slow, at 4-5 seconds. Originally reported at WP:AN.)
Oct 7 2022
I currently start this tool with webservice start. What would I need to change to use Kubernetes?
Jul 29 2022
Checkusers and clerks use the spihelper script, which adds tools to the Sockpuppet Investigations subpages to quickly block accounts, tag userpages, and update the SPI subpage itself. If there's a deadlock related to inserting autoblocks, my guess is that multiple accounts sharing the same last-used IP address are being blocked simultaneously, so separate threads may each try to insert an autoblock for the same target.
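If that guess is right, the failure mode can be sketched outside MediaWiki. This toy Python example (the table name and column are invented, not the real ipblocks schema) shows how a uniqueness constraint plus an idempotent insert keeps concurrent duplicate autoblock attempts from conflicting:

```python
import sqlite3
import threading

# Hypothetical one-column schema; the real autoblock row is far richer.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE autoblock (target TEXT, UNIQUE(target))")
lock = threading.Lock()

def insert_autoblock(target):
    # INSERT OR IGNORE makes a duplicate insert a no-op instead of an error,
    # so several threads blocking the same IP cannot race each other into a
    # constraint violation. (The lock serializes access to the shared handle.)
    with lock:
        db.execute("INSERT OR IGNORE INTO autoblock (target) VALUES (?)", (target,))
        db.commit()

threads = [threading.Thread(target=insert_autoblock, args=("192.0.2.1",))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

count = db.execute("SELECT COUNT(*) FROM autoblock").fetchone()[0]
print(count)  # 1: only one autoblock row despite four concurrent attempts
```

The real fix would live in the database layer MediaWiki uses, but the idempotent-insert idea is the same.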
Nov 21 2021
Yes. It uses OAuth to perform edits with a user's account.
Nov 20 2021
Why do you say they are logged with an *invalid* IP? I think that is the private network segment used inside WMF's network, and with IABot running on WM cloud services, it seems reasonable for this IP to be used.
Sep 14 2021
Dec 17 2020
It's not just that really old page; the same issue happens on this test revision:
Oct 24 2020
I don't believe the -r flag has any effect on other RIRs. My patch to ipwhois only uses the -r flag when querying RIPE's servers.
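For illustration, the per-RIR flag selection amounts to something like this (the function name is hypothetical, not ipwhois's actual API):

```python
# Sketch of the patch's idea: only pass the RIPE-specific "-r" flag
# (suppress contact information) when the query targets RIPE's server.
def whois_flags(rir: str) -> str:
    # Other RIRs may interpret "-r" differently or reject it outright,
    # so the flag is added only for RIPE.
    return "-r" if rir.lower() == "ripe" else ""

print(whois_flags("ripe"))  # -r
print(whois_flags("arin"))  # (empty string)
```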
The block was lifted, as mentioned a few comments above. However, the rate limit is still in place, so we may be blocked again if we exceed 1000 queries to RIPE in one day.
Oct 21 2020
Oh, I didn't know that, thanks!
Oct 20 2020
There are two ongoing actions here:
- nskaggs seems to be planning to contact RIPE on behalf of the cloud services team
- the whois tool needs to be updated
I heard back again from RIPE and the IP has been unblocked. Asking them to treat it as a proxy IP would still be desirable, in order to reduce the chance of this happening again.
Oct 19 2020
I have updated as-info and whois-referral. isprangefinder uses the whois tool's json output, it doesn't make whois queries directly.
I received the following from RIPE technical support:
Oct 17 2020
Sep 9 2020
While it might not completely fix the problem for global filters, would a hook "onRenameUserSQL" in the style of this patch to CheckUser solve most of this problem?
Sep 8 2020
According to #wikimedia-cloud:
Thank you Manuel! Apparently I just didn't know what to search for. What is "MCR"?
Sep 2 2020
Okay, but that isn't new.
Fixed in patch set 3
Aug 24 2020
Aug 20 2020
Oh, I have to specifically ASK for it by adding |hidden to aflprop! Yes, this works for me.
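For anyone hitting the same confusion, here is a minimal sketch of building such a query; the aflprop values besides "hidden" are illustrative:

```python
from urllib.parse import urlencode

# Build a list=abuselog API query that explicitly requests the "hidden"
# property via aflprop; without "|hidden", the API omits that field.
params = {
    "action": "query",
    "list": "abuselog",
    "aflprop": "ids|filter|hidden",
    "format": "json",
}
url = "https://en.wikipedia.org/w/api.php?" + urlencode(params)
print(url)
```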
Aug 15 2020
DatabaseBlock only does that lookup locally if you don't pass a user to by. (https://github.com/wikimedia/mediawiki/blob/master/includes/block/DatabaseBlock.php#L130-L136) Could you look up the local user object in CentralAuthUser::doLocalSuppression, and pass that to DatabaseBlock(['by' => $user])? Possibly using something similar to LocalRenameJob? (https://github.com/wikimedia/mediawiki-extensions-CentralAuth/blob/043514ea3b2c894d98ed84b7ef2c305f5833bb31/includes/LocalRenameJob/LocalRenameJob.php#L99-L110)
This ticket is a more general problem. The username of user ID 39831648, which IS both locally and globally hidden, is ALSO present in the toolforge replica.
Aug 14 2020
@Tchanders I guess I don't know. On the contributions page, there are links to "change block" or "unblock" (rather than "block"). However, there is no block log that I can see. The username is hidden for good reason, but the user ID (on enwiki, at least) is 39840215.
Aug 12 2020
Awesome, thanks @DannyS712. I agree that it isn't urgent, because the user in question is indef blocked and there is a workaround. I do not think a sysadmin should set $wgUserrightsInterwikiDelimiter to any nonsense.
Aug 11 2020
@Tchanders The first time, yes, but when I re-tested it on August 4th, I used the same account as when I originally reported the bug. So it would have been blocked for several days at that point.
Aug 4 2020
I can reliably reproduce it; it does not seem to be intermittent or a one-off.
Jul 30 2020
Jul 21 2020
Clearly they haven't really thought this new feature through, since they talk about benefits like requesting the client's CPU architecture only when it is needed to serve the correct version of a downloadable executable, while undermining that very use case by requiring a full request-response round trip before providing the necessary information. It is as if Google thinks that your average website is able to predict the user's next browsing action and set the Accept-CH header accordingly. Or they forgot to define a new 300-series status code, "Try Again".
Jul 15 2020
I would encourage you to store this in addition to the User-Agent, not instead of it. The User-Agent field may remain useful in many circumstances. I understand that it is extremely complex to make a schema change on the WMF cluster. I submit that it is worth doing so, for these reasons:
Jul 14 2020
I wouldn't recommend it, as we may want that structured data to remain structured for the purposes of filtering. Plus, that field is only 255 characters long.
@Huji, if the extra hints are "accepted" by the server via the "Accept-CH" header in the HTTP response, then they will be included in *subsequent* requests by the same client. No extra hints will be provided in the first request from a given client. If MediaWiki includes the "Accept-CH" header on every response, then at minimum every POST request will have the hints included, as it must have been preceded by a GET (to load the form, get a CSRF token, etc).
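The handshake described above can be modeled in a few lines. This is a toy simulation of the protocol flow, not MediaWiki code; the hint names follow the Sec-CH-UA-* convention but the classes are invented:

```python
# Hints the server opts in to on every response, via the Accept-CH header.
ACCEPT_CH = ["Sec-CH-UA-Arch", "Sec-CH-UA-Platform"]

def server_response():
    return {"Accept-CH": ", ".join(ACCEPT_CH)}

class Client:
    def __init__(self):
        # Hints the server has asked for, learned from previous responses.
        self.accepted = []

    def request(self, hints_available):
        # Only previously accepted hints are sent with this request.
        headers = {h: v for h, v in hints_available.items() if h in self.accepted}
        # Remember what the server asks for, for use on subsequent requests.
        self.accepted = server_response()["Accept-CH"].split(", ")
        return headers

client = Client()
first = client.request({"Sec-CH-UA-Arch": "x86"})
second = client.request({"Sec-CH-UA-Arch": "x86"})
print(first)   # {} -- no extra hints on the very first request
print(second)  # {'Sec-CH-UA-Arch': 'x86'} -- hints appear from the second request on
```

This is why a POST would carry the hints: it is necessarily preceded by at least one GET from the same client.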
Jul 11 2020
I don't know what you mean about checkuser-too-many changing. That is used when a range check for "get edits" or "get users" returns too many results, so the special page shows a list of individual IP addresses with edits. That message was never shown when running "get edits" on a user account or a single IP address; "checkuser-limited" was.
Thanks, not sure why I didn't notice that in my own testing. I have set the limit to 7 edits in my LocalSettings.php, and "get edits" for a user and "get edits" for an IP are both working for me now.
Jul 10 2020
@Huji, I don't know if I agree with your comment here. Prior to this change, if someone called doUserIPsDBRequest(.., .., 200), then the LIMIT would be 200. I don't know when that would ever happen - since it isn't possible to specify a limit in Special:CheckUser - but the current patch set behaves in the same way as the original code.
I didn't see that you had edited your comment to say that you were working on a new patch set until I had already uploaded a new version of https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CheckUser/+/611399 to fix the "is_null" lint error and the message documentation error.
@Huji I think I fixed the lint errors with this patch, but go ahead and make any other changes that are needed
I don't think the gerrit patch uploader takes Depends-On into account, as I am getting a merge conflict when trying to upload the rebased version of this patch.
@Huji No problem, I do have a patch for that one as well. I have some doubts about how to handle changing the system message, but I'll ask you over in that ticket.
@Huji, I think it's actually easier to do this one first, since it deletes a bunch of duplicate code. I will look at T257641 as well, though.
Jul 9 2020
Claiming, I will develop and test a patch to make get_edits get the 5000 most recent edits, and to format the results in the same way as in the normal (fewer than 5000 edits) case.
Jun 16 2020
Here's a difference. On enwiki, all users can access https://en.wikipedia.org/wiki/Special:AbuseFilter , even if they are logged out. On fawiki, "The action you have requested is limited to users in one of the groups: Administrators, Patrollers, Autopatrollers, Eliminators, Abuse filter editors."
Earlier, I checked several recent hits on private filters on enwiki and couldn't find any examples of missing CU data. This includes hits on a "no action" private filter (1007) and on a "disallow" private filter (1050), including edits as recent as the last 24 hours. So this isn't happening universally to all filter hits. There must be some commonality between the users, filters, or perhaps wiki configurations that is causing this inconsistent behavior.
Jun 15 2020
Are there many other log entries in the database on fawiki (or enwiki) with the exact same cuc_actiontext? It would be helpful to know whether this is a regression, and if so, when it started appearing.
Jun 12 2020
Jun 8 2020
If this bot will only edit in its userspace, requesting a bot flag at https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval will be quite painless, and will both hide its edits from recent changes, and grant it ipblock-exempt.
May 28 2020
May 18 2020
So, I've started trying to build the set of known sockpuppet groups based on userpage tagging and block summaries. So far I have 14,700 masters and a total of 174,000 accounts. If we cut that down to cases with 10 or more confirmed accounts, it's 131,000 accounts across 3,100 masters. Currently I'm formatting this as a JSON file with the following schema:
May 14 2020
May 13 2020
Apologies for wasting anyone's time on this. I didn't realize that the API defaults to only 14 days worth of data, instead of defaulting to 90 days like the special page.
May 12 2020
I tested this on a Vagrant dev environment and it worked fine:
I assumed that the API was intentionally filtering out log entries, but that isn't correct. Is there some other reason why entries like these appear on Special:CheckUser but not on the API?
May 8 2020
In T247540, Reedy indicated that the table has about 60,000,000 rows on wikidata and 10,000,000 on enwiki.
Apr 28 2020
Assigning to $renameUserSQL->tables['cu_log'] in two separate hooks doesn't change the fact that they're modifying the same variable; whichever hook runs last determines the entry in $renameUserSQL->tables that will actually be run.
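The last-write-wins behavior is easy to demonstrate in miniature. This Python sketch (hook and column names are made up for illustration) mirrors two hooks assigning to the same key of a shared mapping:

```python
# Shared state analogous to $renameUserSQL->tables.
tables = {}

def hook_a(tables):
    # First extension registers its cu_log columns (hypothetical names).
    tables["cu_log"] = ("cul_user", "cul_user_text")

def hook_b(tables):
    # Second extension silently overwrites the same key.
    tables["cu_log"] = ("cul_target_id", "cul_target_text")

# Hook execution order alone decides which assignment survives.
for hook in (hook_a, hook_b):
    hook(tables)

print(tables["cu_log"])  # ('cul_target_id', 'cul_target_text') -- hook_a's entry is lost
```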
Apr 25 2020
Apr 24 2020
I can't say I've ever had a reason to CheckUser a bot, but I *can* imagine situations where it would be useful to be able to do so. Sampling is a good idea, but does the cu_changes table really represent a significant storage cost?
Apr 21 2020
Mar 9 2020
@Prtksxna True, but we don't use the time limit for that purpose. The only reason we would use a shorter time limit in the current tool is if we run a check on an IP address or range, and get the error that there were more than 5000 edits within the 90 day window. (In fact, the current log doesn't even show what time limit we chose.)
Mar 4 2020
Thanks for the discussion. I unfortunately am not able to log in to Gerrit, due to an issue with my account. My understanding is that it isn't possible to fix. Please email me privately with any other questions about that.
Feb 28 2020
Ah, I see, thanks. New patch 0001 is the same as the above, but with the dist/ files updated. New patch 0002 also collapses multiple level equivalences; I don't know if that's required, but it might be clearer for reviewers.
Feb 27 2020
Good point. I have attached a patch which uses a Python script to collect the confusables.txt data and add it to the equivset database. The patch is against https://phabricator.wikimedia.org/source/Equivset/repository/master/. Several notes:
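For context, confusables.txt maps a source code point to a prototype via semicolon-separated hex fields, with `#` starting a comment. The following is an illustrative parser in that spirit, not the attached patch:

```python
def parse_confusables(text):
    """Parse confusables.txt-style lines into a {source_char: prototype} map."""
    pairs = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        # Fields: source code point ; target code point sequence ; status
        source, target, _status = [f.strip() for f in line.split(";")]
        src_char = chr(int(source, 16))
        tgt_chars = "".join(chr(int(cp, 16)) for cp in target.split())
        pairs[src_char] = tgt_chars
    return pairs

# Example line: Cyrillic small es maps to Latin small c.
sample = "0441 ;\t0063 ;\tMA\t# ( \u0441 \u2192 c ) CYRILLIC SMALL LETTER ES\n"
print(parse_confusables(sample))  # {'с': 'c'}
```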
Feb 21 2020
I don't know if you have any data on how often the 5000-result limit in the current tool is hit; my personal experience is that it's fairly common, particularly for mobile IP ranges. If that limit were brought even lower per IP range (e.g. if only the last 1000 edits from an IP range were considered), I think that would be a significant degradation.
Feb 17 2020
Feb 6 2020
Test comment, had notifications off...