
Import full list of known instances from Wikibase.world
Open, Needs Triage · Public · Spike

Description

Context:

Wikibase.world is the community-run, cloud-hosted instance that serves as a directory of Wikibases. When I first manually collated our table of known instances, I ran its query "Only Wikibases that are currently online". I am not convinced this list is accurate, and I would like to check that there aren't any self-hosted (i.e. non-cloud) instances in Wikibase.world that are live but not returned by this query.

Goal:

Catch any instances missed in my first round of manual querying.

Acceptance Criteria

  • import all missing self-hosted Wikibases from Wikibase.world into our metrics db
  • mark which instances are no longer online
  • determine the API URLs of live instances and update the metrics db
  • filter out key exceptions such as Wikidata, Wikimedia Commons, etc. (either by marking them within the database or by never importing them, whichever is simpler to build around)
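
A minimal sketch of the exception filter and API URL step, not a settled design. The host list, the two script paths, and the function names are all assumptions for illustration: the ticket names only Wikidata and Wikimedia Commons, and a given instance may serve `api.php` from a different path.

```python
from urllib.parse import urlsplit

# Assumption: hosts treated as "key exceptions". The ticket names only
# Wikidata and Wikimedia Commons; extend this set as needed.
EXCLUDED_HOSTS = {"www.wikidata.org", "wikidata.org", "commons.wikimedia.org"}

# Assumption: two common MediaWiki script paths. An instance may use another.
API_PATHS = ("/w/api.php", "/api.php")


def host_of(url: str) -> str:
    """Return the lowercased host of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_excluded(url: str) -> bool:
    """True if the URL belongs to one of the known exceptions."""
    return host_of(url) in EXCLUDED_HOSTS


def candidate_api_urls(base_url: str) -> list[str]:
    """Build API URLs to try, in order, from any URL on the instance."""
    parts = urlsplit(base_url)
    root = f"{parts.scheme}://{parts.netloc}"
    return [root + path for path in API_PATHS]
```

Checking the host at import time corresponds to the "never importing them" option; the same predicate could instead set a flag on the database row if marking turns out to be simpler.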

Possible supporting mechanisms (optional):

Note: the following is not necessarily in the correct order of operations.

  • create a mechanism that uses the cloud API or raw data (found in the cloud metrics Google Sheet) to filter out any known cloud instances
  • create a way to automatically record whether an instance is considered 'live' (e.g. its API responds to a request); this doesn't need to be perfect for the MVP
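
Both mechanisms could be sketched roughly as follows. This is an illustration under assumptions, not the chosen approach: it identifies cloud instances by a `.wikibase.cloud` host suffix rather than by the cloud API or the metrics sheet, and it treats an instance as live when a MediaWiki `action=query&meta=siteinfo` request returns JSON containing a `query` key. The function names and User-Agent string are hypothetical.

```python
import http.client
import json
import urllib.request
from urllib.parse import urlsplit

# Assumption: cloud-hosted instances live under this domain. Instances on
# custom domains would need the cloud API or the metrics sheet instead.
CLOUD_SUFFIX = ".wikibase.cloud"


def is_cloud_instance(url: str) -> bool:
    """True if the URL's host ends with the assumed cloud suffix."""
    host = (urlsplit(url).hostname or "").lower()
    return host.endswith(CLOUD_SUFFIX)


def fetch(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL and return the response body as text."""
    # Hypothetical User-Agent; replace with one identifying the real tool.
    req = urllib.request.Request(
        url, headers={"User-Agent": "wikibase-metrics-liveness-check/0.1"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def is_live(api_url: str, fetcher=fetch) -> bool:
    """True if the API answers a siteinfo query with a JSON 'query' object."""
    query = api_url + "?action=query&meta=siteinfo&format=json"
    try:
        data = json.loads(fetcher(query))
    except (OSError, ValueError, http.client.HTTPException):
        # Network failure, timeout, HTTP error, or a non-JSON response.
        return False
    return isinstance(data, dict) and "query" in data
```

The `fetcher` parameter exists so the check can be exercised without network access, for example `is_live(url, fetcher=lambda u: '{"query": {}}')`. A single failed request is a weak signal, so an MVP might record the timestamp of the last successful check rather than a bare yes/no.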
