
Internal error in ApiQueryAllUsers::execute: Saw more duplicate rows than expected
Closed, Resolved
Public

Description

https://ru.wikipedia.org/w/api.php?action=query&list=allusers&aulimit=1&auwitheditsonly&aufrom=%D0%9E%D1%82%D0%B4%D0%B5%D0%BB%20%D0%BA%D1%83%D0%BB%D1%8C%D1%82%D1%83%D1%80%D1%8B%20%D0%90%D0%9C%D0%9E%20%D0%9A%D1%83%D1%80%D0%BA%D0%B8%D0%BD%D1%81%D0%BA%D0%B8%D0%B9%20%D1%80%D0%B0%D0%B9%D0%BE%D0%BD&auprop=blockinfo|editcount|registration&rawcontinue&format=json&formatversion=1

This returns "Exception Caught: Internal error in ApiQueryAllUsers::execute: Saw more duplicate rows than expected" no matter what I try with the (blocked) user "Отдел культуры АМО Куркинский район".

It must be a dup of https://phabricator.wikimedia.org/T74560, but I'd love to see an immediate purge or something similar, because ru-wiki literally cannot get anything from list=allusers starting from "Ост*" up to somewhere around the letter "Р". With ArbCom elections under way, is it possible to purge/delete the affected records ASAP? Or at least make the API return the next user name instead of a plain error message, so that clients can jump over the damaged parts? Thanks in advance.
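For reference, the request in the description can be rebuilt programmatically. The following is a minimal sketch, not an official client: the helper name `build_allusers_url` is made up for illustration, and only the parameters visible in the URL above are used. Note that `urlencode` writes spaces as `+` and the pipe as `%7C`, which is an equivalent encoding of the same query.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_ENDPOINT = "https://ru.wikipedia.org/w/api.php"


def build_allusers_url(start_name, limit=1):
    """Build a list=allusers query URL (hypothetical helper, no network I/O)."""
    params = {
        "action": "query",
        "list": "allusers",
        "aulimit": str(limit),
        "auwitheditsonly": "",  # boolean flag: presence alone enables it
        "aufrom": start_name,
        "auprop": "blockinfo|editcount|registration",
        "rawcontinue": "",      # boolean flag, as in the original URL
        "format": "json",
        "formatversion": "1",
    }
    return API_ENDPOINT + "?" + urlencode(params)


url = build_allusers_url("Отдел культуры АМО Куркинский район")

# Round-trip check: decode the query string and read the start name back.
decoded = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(decoded["aufrom"][0])
```

The sketch only constructs the URL; sending it is left to whatever HTTP client is already in use.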

Event Timeline

Neolexx raised the priority of this task to High.
Neolexx updated the task description.
Neolexx added a project: MediaWiki-Action-API.
Neolexx subscribed.
Anomie claimed this task.
Anomie subscribed.

I ran the cleanup script from T74560 against ruwiki and it seems to have fixed the issue.

Thinking it over again, I do not understand how an ipblocks table (if that is the cause), even partly or completely corrupted, could prevent getting a simple list of users. If the API cannot provide some piece of info, it could just say "I cannot get it" with an empty string in the relevant query field rather than abort the whole query.

Some users don't have registration dates because of ancient server crashes. That is fine: the API returns "" (an empty string) for the registration field and I can sort it out on my end. The API should not care about it, I will. "Just give me what you can."

This is in case the data corruption takes a long time to fix or ends up as wontfix.

Sorry, I posted before I saw your response; going to check now.

indeed purged through
thank you!

Thinking it over again, I do not understand how an ipblocks table (if that is the cause), even partly or completely corrupted, could prevent getting a simple list of users.

If there are multiple ipblocks entries for the user, the database query returns two rows for the user (and if you happen to be outputting groups too, it doubles each group), and this doubling can break the continuation, so we detect it and throw the error instead.
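The row doubling can be reproduced on a toy database. This is only a sketch under stated assumptions: the tables `toy_user` and `toy_blocks` and their columns are invented for illustration and are far simpler than the real MediaWiki schema, and the query is not the one ApiQueryAllUsers actually runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, heavily simplified stand-ins for the real tables.
    CREATE TABLE toy_user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE toy_blocks (block_id INTEGER PRIMARY KEY, blocked_user INTEGER);
    INSERT INTO toy_user VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    -- Bob has two block rows: this models the corrupted state.
    INSERT INTO toy_blocks VALUES (10, 2), (11, 2);
""")


def fetch_users(conn, start, limit):
    """Return up to limit + 1 joined rows, ordered by user name."""
    return conn.execute(
        "SELECT name, block_id FROM toy_user"
        " LEFT JOIN toy_blocks ON blocked_user = user_id"
        " WHERE name >= ? ORDER BY name, block_id LIMIT ?",
        (start, limit + 1),
    ).fetchall()


rows = fetch_users(conn, "Alice", 10)
names = [name for name, _ in rows]
# Any name that appears more than once came from multiple block rows.
duplicated = sorted({n for n in names if names.count(n) > 1})

print(rows)        # [('Alice', None), ('Bob', 10), ('Bob', 11), ('Carol', None)]
print(duplicated)  # ['Bob']
```

Three users produce four result rows because the join emits one row per matching block entry, which is the doubling described above.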

so we detect it and throw the error instead.

I see. But if you detect it, then you know about it, so instead of dying it could just set a warning field like "with-duplicates: true" or the like. Then I would know to do extra filtering on my end, and to continue from the last non-duplicate in the list and ignore aufrom.

Just a rough idea. It is not really important, and overall a bad idea, if that "ipblocks-dup bug" happens once in many years or only once per wiki. But if it may recur fairly often, then a warning field, plus a client-side check of that field with some subroutine to call if needed, could be better.

And thank you again for your help!

so we detect it and throw the error instead.

I see. But if you detect it, then you know about it, so instead of dying it could just set a warning field like "with-duplicates: true" or the like.

You seem to have missed the part where I said that this situation breaks continuation. People tend to complain if making a query with aucontinue=foo returns aucontinue=foo again.
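A small model shows why duplicated rows break continuation. It assumes, as a simplification, that a page is built by fetching limit + 1 rows and using the extra row as the point to continue from; the function `page` and the sample names are hypothetical and this is not MediaWiki's actual code.

```python
def page(rows, start, limit):
    """Return (items, next_start) for one page of a name-sorted row list.

    Takes limit + 1 rows beginning at `start`; the extra row, if present,
    supplies the name to continue from.
    """
    window = [name for name in rows if name >= start][: limit + 1]
    items = window[:limit]
    next_start = window[limit] if len(window) > limit else None
    return items, next_start


clean = ["Alice", "Bob", "Carol"]
doubled = ["Alice", "Bob", "Bob", "Carol"]  # Bob has two joined rows

print(page(clean, "Bob", 1))    # (['Bob'], 'Carol')
print(page(doubled, "Bob", 1))  # (['Bob'], 'Bob')
```

With the doubled rows and a limit of 1, the continuation value equals the starting value, so a client following it would request the same page forever.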

and to continue from the last non-duplicate in the list and ignore aufrom.

Expecting clients to mangle the API query for correct continuation isn't something we do.

I ran the cleanup script from T74560 against ruwiki and it seems to have fixed the issue.

Is there still an underlying issue adding dupe rows to the db? Or was the script never run against ruwiki? (Should we run it against all wikis to be safe?)

Good question. I had assumed it wasn't run, but now that I check I see an SAL entry on October 27, 2014.

It looks like we had an issue with truncated usernames being inserted into ipblocks, which has since been fixed. Through a bit of luck I guessed the fix was in May 2015, which led me to T99941 as the probable cause: we expanded the database field on lots of wikis but only cleaned things up on meta (T102949).

There's one block on testwiki that isn't consistent with this theory; that one should have been fixed by the October 27, 2014 run, unless "all wikis" didn't include testwiki then.

I'll run the cleanup script on the other affected wikis, and see what happens to testwiki when I run it there again.

and see what happens to testwiki when I run it there again.

Got cleaned up. No idea why it didn't get taken care of in October 2014.