Update Special:GlobalContributions to indicate that it does not support displaying edits made by bot users
Closed, Declined · Public

Description

Special:GlobalContributions does not currently display edits that have been made by an account with the bot right. This is due to an intentional decision to exclude bot actions from the CheckUser central index tables, which was made due to database concerns (T387923#10603518; https://gerrit.wikimedia.org/r/1067425).

However, the current behaviour may be confusing for end-users, who have no reason to expect Special:GlobalContributions to behave differently from individual wikis' Special:Contributions pages with regard to bot accounts. Although the behaviour is intentional, without an explanation end-users may reasonably perceive it as a bug in Special:GlobalContributions.

Special:GlobalContributions should therefore be updated to indicate that it does not currently support displaying edits made by accounts with the bot right, so that end-users are aware of this limitation and are less likely to be confused by bot edits not being displayed.


Original title: Special:GlobalContributions does not appear to display edits made by bots

Original description:

Steps to replicate the issue
Visit the Special:GlobalContributions page for a bot account.

What happens?
Edits made by bot accounts on local wikis do not appear to be shown. Examples:

For a reason I'm not sure about, https://meta.wikimedia.org/wiki/Special:GlobalContributions/Leaderbot also doesn't display any recent edits made from that account, except for edits made on testwiki in December 2024.

What should have happened instead?
The accounts' recent contributions from all wikis should be displayed.

Software version
1.44.0-wmf.18

Event Timeline

Restricted Application added a subscriber: Aklapper.
A_smart_kitten renamed this task from "Visiting Special:GlobalContributions for Leaderbot consistently only displays contributions which that account made on testwiki" to "Special:GlobalContributions does not appear to display edits made by bots". Mar 4 2025, 10:42 PM
A_smart_kitten updated the task description.

Looks like this seems to be an issue also affecting other bot accounts - I've edited/reframed the task accordingly :)

The reason this is the case is that DBA specified we could not store information on bot users, due to the concern of lots of writes being performed to the table for several high edit bot accounts. We could reconsider this decision now that there is a use case for looking at Special:GlobalContributions for bot accounts.

The check for a bot account is done by looking for the bot user group, which means that some bots will still have data in the tool for any wiki where they do not have the bot group.
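
To make the per-wiki nature of that check concrete, here is a minimal sketch. It is not the CheckUser code: the `CentralIndex` class, the `record_action` method and the account name `ExampleBot` are all hypothetical, and the sketch assumes the exclusion is decided only from the groups the account holds on the wiki where the action happened.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CentralIndex:
    """Hypothetical stand-in for the central index: user name -> wikis with recorded activity."""

    wikis_by_user: Dict[str, Set[str]] = field(default_factory=dict)

    def record_action(self, user: str, wiki: str, local_groups: Set[str]) -> bool:
        """Record activity unless the user holds the bot group on this wiki."""
        # Assumption: only the groups held on the wiki where the action happened are checked.
        if "bot" in local_groups:
            return False
        self.wikis_by_user.setdefault(user, set()).add(wiki)
        return True


index = CentralIndex()
# The same account is flagged as a bot on one wiki but not on another.
index.record_action("ExampleBot", "enwiki", {"bot"})
index.record_action("ExampleBot", "testwiki", set())
print(index.wikis_by_user)
```

Under these assumptions only the wiki where the account lacks the bot group ends up in the index, so a lookup for that account would return partial rather than empty results.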

Ah, thanks for the context :) Purely out of curiosity & MediaWiki-archaeological interest, is this documented in a public Phabricator task somewhere? At a search, I found T247540: Should CheckUser track bot edits? from 2020, but the result of that task seemed to be to keep bot edits recorded in CheckUser at that time.

@A_smart_kitten It was one of a few suggestions made in this discussion: T368151#9993273

kostajh subscribed.

Tagging with DBA to get confirmation on whether we should definitely exclude users in the bot user group from the central index tables.

If DBA still doesn't want us to do that, then we should add a warning on Special:GlobalContributions to explain why we're not showing any results.

"definitely" is a strong word :D The answer is that in this specific case, it's not black and white. i.e. This is a taxing feature on the infrastructure [1]. If users really really want this, then we can implement it with mitigations [2] but if it's not a big need, then please don't. What I mean is that let's make an informed decision.

[1] I dream that one day we could give an estimated cost for each feature people request, ask them whether they think the feature they want is worth it and, if so, ask them to find the budget for it. We are far away from that.

[2] The main mitigation would be to do probabilistic writes: if the last bump was within the last minute, don't write; if it was more than one minute but less than 10 minutes ago, bump the timestamp only on a coin flip (doing it 10% of the time). That way, writes don't end up locking one row over and over, which can cause outages.

Just to note, we already debounce the writes to the table for all actions that are not excluded (the exclusion being that the user has the bot group):

  • No update occurs if the stored timestamp is within the last minute
  • The write will only occur one out of ten times if the timestamp was within the last hour

We read the current timestamp from a replica DB. Including bots would mean reading the last-updated timestamp for every bot action, but if needed we could cache the result of the DB lookup. However, I'm not sure that is necessary.
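
The two debounce rules listed above can be sketched roughly as follows. This is an illustration, not the extension's implementation: the function name `should_write`, the injectable `rng` parameter, and the choice to always write when no row exists yet or when the stored timestamp is more than an hour old are assumptions.

```python
import random
from datetime import datetime, timedelta
from typing import Callable, Optional


def should_write(
    last_seen: Optional[datetime],
    now: datetime,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide whether to bump the stored timestamp for a user on a wiki."""
    if last_seen is None:
        # Assumption: no row exists yet, so the first action is always written.
        return True
    age = now - last_seen
    if age < timedelta(minutes=1):
        # Rule 1: no update if the stored timestamp is within the last minute.
        return False
    if age < timedelta(hours=1):
        # Rule 2: within the last hour, write only about one time in ten.
        return rng() < 0.1
    # Assumption: anything older than an hour is always refreshed.
    return True


now = datetime(2025, 3, 10, 12, 0, 0)
print(should_write(now - timedelta(seconds=30), now))  # suppressed by rule 1
print(should_write(now - timedelta(hours=2), now))     # stale, so written
```

Passing the random source as a parameter keeps the probabilistic branch testable; a high-edit account would then trigger at most a handful of writes per hour per wiki instead of one per action.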

Great, then I suggest rolling this out, but keep us in the loop so we can monitor things.

That shouldn't be an issue.

Thanks @Ladsgroup.

I would be inclined not to do this work for now (we have too much other stuff going on), and instead do something more straightforward: tell the user that bot users are not supported by the tool. cc @Tchanders

Tchanders added a subscriber: KColeman-WMF.

This makes sense to me too. We could change the text at the top of the page to say: "Showing results from the last 90 days, for all wikis where you have the right to view contributions. Limited to 20 results per wiki. Bot edits are not included." Tagging @KColeman-WMF for visibility, since you've been working on the messages.

Sounds good to me - hopefully this should alleviate end user confusion about bot edits not being displayed :)
Is it okay to create a feature-request task to track showing bot edits in Special:GlobalContributions, if this is something that might be enabled at a later date?

kostajh renamed this task from "Special:GlobalContributions does not appear to display edits made by bots" to "Update Special:GlobalContributions to indicate that it does not support displaying edits made by bot users". Mar 14 2025, 4:40 PM

Yes, a new task for this would be great, thanks in advance for filing it!

In T387923#10636210, @Tchanders wrote:
This makes sense to me too. We could change the text at the top of the page to say: "Showing results from the last 90 days, for all wikis where you have the right to view contributions. Limited to 20 results per wiki. Bot edits are not included." Tagging @KColeman-WMF for visibility, since you've been working on the messages.

Sounds sensible to me. Is there a page we can link to that explains bot edits (e.g. https://www.mediawiki.org/wiki/Help:Bots), for users who may be unfamiliar with bots?

A_smart_kitten changed the subtype of this task from "Bug Report" to "Task".

Filed as T389055: Special:GlobalContributions: Display edits made by bot accounts :)

Thanks @Ladsgroup.

I would be inclined to not do this work for now (we have too much other stuff going on), and do something more straightforward of telling the user that bot users are not supported by the tool. cc @Tchanders

I would suggest against this, given that I think the change for this ticket is bigger than the change to support bots in the tool (this patch enables support and is only a two-line change). We wouldn't get the data backdated, but doing it now rather than later is better, as the data will be fully searchable sooner (three months after deployment, the data will have fully refreshed). As such, perhaps it is better to support bots instead of doing this task?

It's helpful that the technical change is simple, but I think the main concern here was balancing the need for this feature against the cost of writing and storing the extra data. I also see costs associated with monitoring, and then with figuring out what to do if we realise that this is too costly after users are already used to having it. Even though T387923#10618806 does say we can potentially do this, it also mentions the idea of weighing up how critical this is compared to those costs.

I'm not stating the opinion that we definitely shouldn't do it right now, but can we make a case for this being worth the cost? The case that it's confusing to have different behaviour would be addressed by improving the messaging.

Sure. Some thoughts:

  1. Links to third-party GUC tools were being updated to point to Special:GlobalContributions to avoid two or more links. My impression is that users of the tool want to avoid having to click through to another tool where possible.
  2. We are expecting that clearer messaging would avoid users being confused. That does depend on where we place the messaging, however: if it is just in the subtitle, I would argue that this won't be enough for the long or medium term:
    1. For example, T390523: GlobalContributions: Edits are sometimes missing from the returned contributions was filed today saying that edits were missing. However, the cause was the maximum of 20 edits shown per wiki. If we only add this message to that text, I'm not sure that all users will see it, or at least not before they try to work out why the tool isn't displaying data.
    2. We would probably need an obvious and clear error message for the case where a user tries to do GUC on a bot. However, we would not be able to display this reliably, because the bot flag is applied per wiki and I'm not sure we want to query whether the user has the bot group on every wiki they have edited. Plus, a bot account may have GUC contributions from some of the wikis.
  3. We currently link to Special:GlobalContributions in the sidebar of bot account user pages, which implies that the tool works for these users. That would also need to be removed if we are updating the messaging, along with all other links generated by the site interface that could target a bot user.
  4. At least initially, I don't see much difference between the queries performed by GUC for a globally active user and for a bot account:
    1. For example, the query to cuci_user is more expensive for a user who has run the tool to create a local account on all wikis than for a bot (given that most bots generally edit on a subset of wikis).
    2. We limit the revisions to only 20 per wiki, so bot accounts making tens of thousands of edits would not be that different from a user who made 100 edits (given that the number of rows found before applying the limit does not substantially affect performance).
  5. "The case that it's confusing to have different behaviour would be addressed by improving the messaging." - I would slightly disagree with this, given my thoughts above about how clear we can make the messaging in the tool and that we would need updates elsewhere.
  6. As a general idea, I think it is better to reduce the difference in behaviour between Special:GlobalContributions and Special:Contributions where such a difference is not immediately clear. There are some places where a difference is necessary, but I don't think this is one of them.
    1. This general idea comes from wanting to avoid users having to remember edge cases about our tool. Not displaying bot account edits is an example of one of these edge cases, given that you can use Special:Contributions for bot accounts. Furthermore, the edge case does not appear consistently if a bot account has edited wikis where it does not yet have the bot flag.
  7. At least on the English Wikipedia, users can request that a bot account have its flag removed if the bot has been retired. Therefore, if we display the error message based on the existence of the bot group, we would need to look for previous groups to make this reliable.

However, on the other hand:

  1. We may have an additional complication if the bot account is editing from Wikimedia Cloud Services (WMCS) ranges, because we currently exclude those ranges from causing updates to cuci_user.
    1. This would generally affect bot accounts that are hosted on WMCS.
    2. We could remove this exclusion for the same reasons as removing the bot group check.

So in summary, I still think the technical lift and potential user confusion will be less if we support bots in GUC than if we update messaging and interface links. However, I don't have a strong opinion on this.
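
As a toy model of the per-wiki cap mentioned in the list above, the sketch below shows why a very active account and a moderately active one produce the same amount of visible output, and why older edits can look "missing". It only models the displayed result, not the SQL queries or their cost; the function name `global_contributions` and the use of integers as timestamps are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

PER_WIKI_LIMIT = 20  # the per-wiki cap described in the discussion


def global_contributions(
    edits: Iterable[Tuple[str, int]],
    per_wiki_limit: int = PER_WIKI_LIMIT,
) -> Dict[str, List[int]]:
    """Return the newest edit timestamps per wiki, capped at per_wiki_limit."""
    kept: Dict[str, List[int]] = defaultdict(list)
    # Walk the edits from newest to oldest and keep only the first few per wiki.
    for wiki, timestamp in sorted(edits, key=lambda edit: edit[1], reverse=True):
        if len(kept[wiki]) < per_wiki_limit:
            kept[wiki].append(timestamp)
    return dict(kept)


# Hypothetical data: 5000 edits on one wiki, 100 on another.
edits = [("enwiki", t) for t in range(5000)] + [("dewiki", t) for t in range(100)]
result = global_contributions(edits)
print({wiki: len(timestamps) for wiki, timestamps in result.items()})
```

Both wikis end up with the same number of displayed rows, and any edit older than the newest 20 on a wiki is simply not shown.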

Thanks for the discussion and points of view, all. After reading through everything, I'm now in support of trying out the patch from T389055: Special:GlobalContributions: Display edits made by bot accounts. If we find that it introduces problems, we could roll it back. Do we have metrics in place to understand if rolling out this patch would cause issues?

Beyond the metrics for Special:GlobalContributions, I don't think we have anything. Those metrics would presumably show increased page load times if bots end up being an issue.

We could add a metric to track the number of rows in cuci_user, however, I'm not sure this could be public given that it's also populated for private actions such as logging in. Therefore, we couldn't add that to a Grafana dashboard.

What about tracking the rate of insertions/removals?

We also have the option for a private Grafana dashboard, if there are privacy concerns. (It does seem like the number of rows would be acceptable to have public, if we don't separate it by wiki.)

Sure. In which case, I'll add some metrics as part of T389055 to keep an eye on cuci_user, merge that before making the change, and then make the change once we have a baseline on Grafana.

Thanks everyone. We can mark this as declined in favour of actually supporting bot edits.