To improve the results of some queries, we need to label bots as such. This is done in the Sorting Hat database, which stores identities, by setting the field profiles.is_bot to 1. At a minimum, we should label every bot we already know about.
After looking at the available information, the current list of identities labeled as "bot" is:
- l10n-bot, firstname.lastname@example.org
- jenkins-bot, NULL
- "Wikimedia Jenkins Bot", email@example.com
- jenkins-bot, firstname.lastname@example.org
- wmf-jenkins-bot, NULL
- mw-jenkinsbot, NULL
- jenkins-bot, jenkins-bot@gerrit.wikimedia.org
The second field is the email address declared by the bot, which is not always available. Closing the task for now. Please feel free to reopen it if you notice new bots or an error in this list.
Warning: right now, not all queries may take this field (is_bot) into account as they should. Most already do, and the rest should follow shortly.
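To illustrate what "considering is_bot" means for a query, here is a minimal sketch using an in-memory SQLite database. The commits table, the uuid and name columns, and the count_commits helper are assumptions made up for this example; only profiles.is_bot comes from the description above, and the real Sorting Hat schema and the actual queries may look different.

```python
import sqlite3

# Hypothetical, simplified stand-in for the identities database.
# Only profiles.is_bot is taken from the task; everything else is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profiles ("
    "uuid TEXT PRIMARY KEY, name TEXT, is_bot INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE commits (id INTEGER PRIMARY KEY, author_uuid TEXT)"
)
conn.executemany(
    "INSERT INTO profiles (uuid, name, is_bot) VALUES (?, ?, ?)",
    [("u1", "jenkins-bot", 1), ("u2", "Example Human", 0)],
)
conn.executemany(
    "INSERT INTO commits (author_uuid) VALUES (?)",
    [("u1",), ("u1",), ("u2",)],
)


def count_commits(conn, include_bots=False):
    """Count commits, leaving out bot authors unless include_bots is set."""
    sql = (
        "SELECT COUNT(*) FROM commits c "
        "JOIN profiles p ON p.uuid = c.author_uuid"
    )
    if not include_bots:
        # This is the filter a query needs in order to respect is_bot.
        sql += " WHERE p.is_bot = 0"
    return conn.execute(sql).fetchone()[0]


human_commits = count_commits(conn)
all_commits = count_commits(conn, include_bots=True)
```

A query that skips the is_bot condition would report the second number, which is why bot labels only help once every query applies the filter.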
I wonder whether the list of bots in korma could be updated automatically by pulling data from that very link that Lego provided.
Even if it cannot, some basic information on where korma stores that list of bots, and how (and how often) it is updated, would be welcome so we can put it on https://www.mediawiki.org/wiki/Community_metrics
The list of bots is currently maintained in the Sorting Hat (identities) database, in the profiles table. If the field "is_bot" is 1, the identity is considered a bot. Otherwise, that field is 0.
Unfortunately, except for changing the database, there is no other way (for now) of tagging an identity as a bot.
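Since tagging means changing the database directly, the update could look roughly like the sketch below. It runs against an in-memory SQLite table so it is self-contained; the uuid, name and email columns and the mark_as_bot helper are assumptions for illustration, not the actual Sorting Hat schema or tooling. Against the real database the same kind of UPDATE on profiles.is_bot would apply, but column names should be checked first.

```python
import sqlite3

# Hypothetical, simplified profiles table; only is_bot is taken from the task.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profiles ("
    "uuid TEXT PRIMARY KEY, name TEXT, email TEXT, "
    "is_bot INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO profiles (uuid, name, email) VALUES (?, ?, ?)",
    [
        ("u1", "jenkins-bot", None),
        ("u2", "wmf-jenkins-bot", None),
        ("u3", "Example Human", "human@example.org"),
    ],
)


def mark_as_bot(conn, names):
    """Set is_bot = 1 for profiles whose name is in names; return rows changed."""
    names = list(names)
    if not names:
        return 0
    # One placeholder per name keeps the statement parameterized.
    placeholders = ",".join("?" for _ in names)
    cur = conn.execute(
        f"UPDATE profiles SET is_bot = 1 WHERE name IN ({placeholders})",
        names,
    )
    conn.commit()
    return cur.rowcount


# "mw-jenkinsbot" has no row in this toy table, so it changes nothing.
updated = mark_as_bot(conn, ["jenkins-bot", "wmf-jenkins-bot", "mw-jenkinsbot"])
bots = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM profiles WHERE is_bot = 1 ORDER BY name"
    )
]
```

Matching on name alone is a simplification; several identities in the list above share the name jenkins-bot with different email addresses, so a real update might need to match on a unique identifier instead.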