Investigate raising max edit count above 500,000
Closed, Resolved · Public · Feature

Description

Feature summary (what you would like to be able to do and where):

I have made too many edits for XTools to show anymore - when this happened previously somebody at WMF Labs tweaked something. Can the same happen again please? @GiantSnowman

Event Timeline

I sometimes wonder whether a graceful fallback might be possible for high-volume editors. Maybe everything in the last year, rather than everything? Or the last 100,000 edits, rather than everything?

That's basically what T182182 (Analyze most recent N edits instead of rejecting query in user-related tools) is asking for. The issue is the ordering of revisions, which effectively (or at least last I checked) means we have the same query plan no matter how many edits we process. Going by a date range is much more feasible and likely to speed things up in most cases. That feature is tracked at T202552 (only the Edit Counter is left). I can try to prioritize that, but what if there are a million edits in the past year? The query is then still very likely to time out.

I think what I would like to do is have basically two thresholds -- one is the actual limit that can never be exceeded, and that's something really high like 800K edits. The other is the lower provisional limit that only requires you to log in to proceed. The reason for this is that web crawlers and the like will pound away at XTools, and the only real way to stop them is with a login wall. Generally speaking, for analytical tools it's totally fine for a query to take a long time to run, so long as the person truly wants and cares about that data and is not just needlessly consuming resources (as is the case with bots).

I will add however that the user in question, GiantSnowman, is not actually subject to the current limit, which is 600,000 edits. When I tried to run them in the Edit Counter, most of the queries were automatically killed. I can increase the query timeout as well, but that's a risky thing to do with all the web crawler traffic we receive. Implementing the provisional limit as described above will likely alleviate that, so I'll start with that and go from there.

Old edits do not change often. Perhaps there could be an option to cache stats based on the older edits and add the more recent ones to the cached stats dynamically? (And potentially an option to invalidate the cache every now and then, perhaps just manually but with a throttle.)
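To make the caching idea above concrete, here is a minimal sketch in Python. It is not how XTools works: the class name, the `fetch_edits_since` callback and the one-hour throttle are all hypothetical, and it only tracks a plain edit count rather than the full set of Edit Counter stats. The cached total covers edits up to a cutoff timestamp, each request only fetches edits newer than the cutoff, and a manual invalidation is refused if one already happened within the throttle window.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical callback: returns the timestamps of a user's edits made
# strictly after `since`. A real tool would query the database here.
FetchEdits = Callable[[str, int], List[int]]


class IncrementalEditCounter:
    """Caches a per-user edit count and only fetches edits newer than the cache."""

    def __init__(
        self,
        fetch_edits_since: FetchEdits,
        min_invalidate_interval: float = 3600.0,  # assumed throttle: one hour
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch_edits_since
        self._min_interval = min_invalidate_interval
        self._clock = clock
        # user -> (cached count, timestamp of the newest edit already counted)
        self._cache: Dict[str, Tuple[int, int]] = {}
        self._last_invalidated: Dict[str, float] = {}

    def edit_count(self, user: str) -> int:
        count, cutoff = self._cache.get(user, (0, 0))
        recent = self._fetch(user, cutoff)  # only edits newer than the cutoff
        if recent:
            count += len(recent)
            cutoff = max(recent)
        self._cache[user] = (count, cutoff)
        return count

    def invalidate(self, user: str) -> bool:
        """Drop the cached stats for `user`; refused if throttled."""
        now = self._clock()
        last = self._last_invalidated.get(user)
        if last is not None and now - last < self._min_interval:
            return False
        self._cache.pop(user, None)
        self._last_invalidated[user] = now
        return True
```

The invalidation step matters because old edits can still change occasionally (deletions, for example), and a cache like this would not notice until it is rebuilt. The sketch also assumes no two edits share a timestamp; a real implementation would more likely key the cutoff on a revision ID.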

Is there any discussion here any more about raising the edit count limit so that productive editors can see their Edit Count information? I could see all of my data until it reached 600,000 edits, then I was given permission to reach 650,000 edits, and five months later I reached this ceiling. It is very useful to see the XTools information, which I checked several times a day, both my editor counts and admin counts.

Is there any possibility you could raise the upper limit to 750,000 or even 1,000,000 edits? There can't be that many of us who are affected by this, so I don't think the numbers are so high that it would "slow down" the system. XTools really provides invaluable information about one's editing, and having that page reduced to a simple edit number is deflating. I'd like to restart this discussion if possible. Thank you.

MusikAnimal moved this task from Maintenance / tech debt to Pending deployment on the XTools board.

I think what I would like to do is have basically two thresholds -- one is the actual limit that can never be exceeded, and that's something really high like 800K edits. The other is the lower provisional limit that only requires you to log in to proceed.

For now I've gone with this solution. Tentatively I think the edit count limit requiring login will be 250,000 edits, and the hard limit (the one this task is about) will be raised to 800,000. If this new "provisional" limit requiring login successfully keeps out the bots enough, we should see much better performance overall and we may even be able to get rid of the "hard" limit altogether.

There's also a chance a lot of people will complain about having to log in. So we'll just have to test the waters and adjust the limits accordingly.

I hope to have this deployed on Monday.

Actually, since a lot of people will see this new "login required" message, it might be wisest to wait until we get our first round of translations in. This should happen in a day or two, then I can proceed with deployment.

This has now been deployed. The current settings are >250K edits requires login, >700K edits is rejected entirely. I'm going to let it simmer for a while before closing this task. I will likely increase both thresholds depending on how well this shuts out the web crawlers.
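For readers following along, the two-threshold behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the actual XTools code; the constant and function names are hypothetical, and only the 250K and 700K values come from the settings quoted in this comment.

```python
from enum import Enum

# Values taken from the deployed settings quoted above; names are hypothetical.
LOGIN_REQUIRED_ABOVE = 250_000  # provisional limit: anonymous requests must log in
REJECTED_ABOVE = 700_000        # hard limit: rejected even when logged in


class Access(Enum):
    ALLOW = "allow"
    LOGIN_REQUIRED = "login_required"
    REJECT = "reject"


def check_access(edit_count: int, logged_in: bool) -> Access:
    """Decide whether a query for a user with `edit_count` edits may run."""
    if edit_count > REJECTED_ABOVE:
        return Access.REJECT
    if edit_count > LOGIN_REQUIRED_ABOVE and not logged_in:
        return Access.LOGIN_REQUIRED
    return Access.ALLOW
```

The hard limit is checked first so that logging in never bypasses it, which matches the description of a limit that "can never be exceeded".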

MusikAnimal moved this task from Pending deployment to Complete on the XTools board.

Declaring this resolved. The max edit count threshold is currently at 1,000,000 edits and I will likely increase it more.

It may be worth noting that, per https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_by_number_of_edits -- only 872 of us have made more than 100,000 edits, and only 14 have broken a million (of course, this excludes bots). However, those 872 and/or 14 are probably the most likely to be using XTools -- this suggests, to me at least, that caching their stats might be really easy.