Develop approximate index of existing bot accounts and gadgets
Closed, Resolved, Public

Description

My basic line of thinking is this...

For each wiki, we have a list of accounts with bot flags. Through SUL we know which ones operate on multiple wikis.

We also have listings of gadgets for each wiki, and by mapping names / comparing codebases we can see which ones are effectively "the same" (despite the lack of global gadgets, meaning that you have a lot of per-wiki deviations).
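A minimal sketch of the "mapping names / comparing codebases" idea, assuming the gadget sources have already been fetched into a dict keyed by (wiki, gadget name). The function names, the 0.8 similarity threshold, and the sample data are all illustrative assumptions, not part of any existing tool:

```python
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(source):
    """Strip whitespace and blank lines so trivial formatting differences are ignored."""
    return "\n".join(line.strip() for line in source.splitlines() if line.strip())


def group_gadgets(gadgets, threshold=0.8):
    """Group gadgets that share a name (case-insensitive) and have similar code.

    gadgets: dict mapping (wiki, name) -> source text.
    Returns a list of groups, each a list of (wiki, name) keys.
    """
    by_name = defaultdict(list)
    for (wiki, name), source in gadgets.items():
        by_name[name.lower()].append(((wiki, name), normalize(source)))

    groups = []
    for entries in by_name.values():
        clusters = []  # each cluster is a list of (key, source); the first entry is its representative
        for key, src in entries:
            for cluster in clusters:
                if SequenceMatcher(None, cluster[0][1], src).ratio() >= threshold:
                    cluster.append((key, src))
                    break
            else:
                clusters.append([(key, src)])
        groups.extend([key for key, _ in cluster] for cluster in clusters)
    return groups


# Made-up sample data, for illustration only.
gadgets = {
    ("enwiki", "HotCat"): "var a = 1;\nfunction run() { return a; }\n",
    ("dewiki", "HotCat"): "var a = 1;\n\n  function run() { return a; }\n",
    ("frwiki", "hotcat"): "completely different code here",
}
groups = group_gadgets(gadgets)
```

Comparing only against the first member of each cluster keeps the sketch simple; a real pass would probably want a sturdier clustering rule and a threshold tuned on actual gadget code.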

Between these sources of data we can create a very rough draft of all the different bots and gadgets out there. With some additional tagging / annotations we can build a more useful directory of them. But this initial task is just to aggregate the different data sources together.

Event Timeline

Harej updated the task description.

We have a rough breakdown of all the different gadgets that could run on each wiki. Additional pieces of information we may want:

  • List of editors per gadget (per wiki)
  • List of maintainers of the gadget definitions page (per wiki and also aggregate if there are people maintaining gadgets on multiple wikis)
  • Overall aggregation of "top editors" of gadgets in general
  • Mapping with gadget usage data
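The editor lists above could be derived from gadget page revision histories. A rough sketch, assuming the revisions have already been collected as (wiki, gadget, editor) tuples; the record shape, the function name, and the sample rows are hypothetical:

```python
from collections import Counter, defaultdict


def summarize_editors(revisions):
    """Aggregate gadget edits.

    revisions: iterable of (wiki, gadget, editor) tuples, one per edit.
    Returns (editors per (wiki, gadget), edit counts per editor across all wikis).
    """
    per_gadget = defaultdict(set)
    overall = Counter()
    for wiki, gadget, editor in revisions:
        per_gadget[(wiki, gadget)].add(editor)
        overall[editor] += 1
    return per_gadget, overall


# Made-up sample rows, for illustration only.
revisions = [
    ("enwiki", "HotCat", "Alice"),
    ("enwiki", "HotCat", "Bob"),
    ("dewiki", "HotCat", "Alice"),
    ("dewiki", "Twinkle", "Alice"),
]
per_gadget, overall = summarize_editors(revisions)
top_editors = overall.most_common(1)
```

Counting edits is only one possible definition of "top editors"; counting distinct gadgets or wikis touched would be an equally plausible choice.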

As for bots... I wonder if it would be interesting to generate a list of current bot accounts, and then the wikis on which they operate.
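A rough sketch of that bot list, assuming the per-wiki lists of bot-flagged accounts have already been gathered (for example via the MediaWiki API's `list=allusers` query with `augroup=bot`) and that account names are unified under SUL, so the same name on two wikis is the same account. The function names and sample data are hypothetical:

```python
from collections import defaultdict


def index_bots(bots_by_wiki):
    """Invert wiki -> bot accounts into bot account -> sorted list of wikis."""
    index = defaultdict(set)
    for wiki, accounts in bots_by_wiki.items():
        for account in accounts:
            index[account].add(wiki)
    return {account: sorted(wikis) for account, wikis in sorted(index.items())}


def multi_wiki_bots(index):
    """Keep only the accounts flagged as bots on more than one wiki."""
    return {account: wikis for account, wikis in index.items() if len(wikis) > 1}


# Made-up sample data, for illustration only.
bots_by_wiki = {
    "enwiki": ["ExampleBot", "ArchiveBot"],
    "dewiki": ["ExampleBot"],
    "frwiki": ["ExampleBot", "FrOnlyBot"],
}
bot_index = index_bots(bots_by_wiki)
cross_wiki = multi_wiki_bots(bot_index)
```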

In the long term the tasks performed by the bot accounts are more interesting than the accounts themselves, but starting with the account list might make it easier to contact the humans behind the accounts and start to figure that out.

Starting with data from actual bots will also allow us to construct an ontology based on actual data, rather than hypothetical "this is what I think bots do" stuff.