**Context**:
[[ https://wikibase.world/wiki/Project:Home | Wikibase world ]] is the community-run, cloud-hosted instance that serves as a directory of Wikibases. When I first manually collated our table of known instances, I ran their query [[ https://wikibase.world/query/#PREFIX%20wdt%3A%20%3Chttps%3A%2F%2Fwikibase.world%2Fprop%2Fdirect%2F%3E%0APREFIX%20wd%3A%20%3Chttps%3A%2F%2Fwikibase.world%2Fentity%2F%3E%0A%0ASELECT%20%3FitemLabel%20%3Furl%20%3Fitem%20WHERE%20%7B%0A%20%20%20%20%3Fitem%20wdt%3AP3%20wd%3AQ10%20.%0A%20%20%20%20%3Fitem%20wdt%3AP1%20%3Furl%20.%0A%20%20%20%20%3Fitem%20wdt%3AP13%20wd%3AQ54%20.%0A%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%7D | "Only Wikibases that are currently online" ]]. I am not convinced this list is accurate, and I would like to check that there aren't any self-hosted (i.e. non-cloud) instances in wikibase.world that are live but not returned by this query.
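One way to approach the check, as a rough sketch: run the linked query twice, once as-is and once without the `wdt:P13 wd:Q54` triple, and compare the two URL sets. This assumes that triple is the one restricting results to "currently online" (inferred from the query title, not verified). The SPARQL endpoint URL and all function names below are also assumptions.

```python
from __future__ import annotations

from urllib.parse import urlencode

# Assumed SPARQL endpoint for wikibase.world; not confirmed by this task.
SPARQL_ENDPOINT = "https://wikibase.world/query/sparql"

PREFIXES = """PREFIX wdt: <https://wikibase.world/prop/direct/>
PREFIX wd: <https://wikibase.world/entity/>
"""

# Triple copied from the linked query; assumed to be the "currently online" filter.
ONLINE_FILTER = "?item wdt:P13 wd:Q54 ."


def build_query(online_only: bool) -> str:
    """Return the linked query, optionally without the assumed online filter."""
    lines = [
        "SELECT ?itemLabel ?url ?item WHERE {",
        "    ?item wdt:P3 wd:Q10 .",
        "    ?item wdt:P1 ?url .",
    ]
    if online_only:
        lines.append("    " + ONLINE_FILTER)
    lines.append(
        '    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }'
    )
    lines.append("}")
    return PREFIXES + "\n" + "\n".join(lines)


def request_url(query: str) -> str:
    """Build a GET URL asking the (assumed) endpoint for JSON results."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def not_in_online_list(all_urls: set[str], online_urls: set[str]) -> set[str]:
    """URLs listed in wikibase.world but absent from the online-only result."""
    return all_urls - online_urls
```

The URLs returned by `not_in_online_list` would be the candidates to probe manually or with a liveness check.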
**Goal**:
Catch any instances that my first round of manual querying missed.
**Acceptance Criteria**:
[] import all missing self-hosted Wikibases from wikibase.world to our metrics db
[] mark which instances are no longer online
[] determine the API URLs of live instances and update the metrics db
[] filter out key exceptions such as Wikidata, Wikimedia Commons, etc. (either by marking them within the database or by never importing them, whichever is simpler to build around)
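As a minimal sketch of the last two criteria, the snippet below derives candidate API URLs from an instance URL and skips the key exceptions by hostname. The exclusion list and the script paths are assumptions: the paths are common MediaWiki defaults, but any given instance may be configured differently. All names are hypothetical.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Hypothetical exclusion list for the "key exceptions"; extend as needed.
EXCLUDED_HOSTS = {"www.wikidata.org", "commons.wikimedia.org"}

# Common MediaWiki script paths (assumption: an instance may use another path).
API_PATHS = ("/w/api.php", "/api.php", "/wiki/api.php")


def host_of(url: str) -> str:
    """Lower-cased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_excluded(url: str) -> bool:
    """True if the URL belongs to one of the excluded hosts."""
    return host_of(url) in EXCLUDED_HOSTS


def candidate_api_urls(url: str) -> list[str]:
    """Guess possible api.php locations from any page URL on the instance."""
    parts = urlsplit(url)
    root = f"{parts.scheme}://{parts.netloc}"
    return [root + path for path in API_PATHS]


def siteinfo_url(api_url: str) -> str:
    """Probe URL using the MediaWiki Action API's siteinfo query."""
    return api_url + "?action=query&meta=siteinfo&format=json"
```

Filtering by hostname at import time corresponds to the "never importing them" option. Marking them in the database instead would only need the result of `is_excluded` stored alongside each row.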
Possible supporting mechanisms (optional):
note: the following is not necessarily in the correct order of operations
[] create a mechanism that uses the cloud API or the raw data (found in the cloud metrics Google Sheet) to filter out any known cloud instances
[] create a way to record whether an instance is considered 'live' (e.g. its API responds to a ping)
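The two supporting mechanisms could look roughly like the sketch below. It assumes the set of known cloud hostnames has already been loaded from the cloud API or the sheet, and that instances without a custom domain live under `*.wikibase.cloud`. An instance counts as 'live' here if its `api.php` answers a siteinfo query with JSON containing a `query` key. The record shape and function names are hypothetical, not the metrics db schema.

```python
from __future__ import annotations

import http.client
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable
from urllib.parse import urlsplit
from urllib.request import urlopen


@dataclass
class LivenessRecord:
    """Hypothetical shape for one liveness observation."""
    api_url: str
    live: bool
    checked_at: datetime


def is_cloud_instance(url: str, cloud_hosts: set[str]) -> bool:
    """True for known cloud hosts or (assumed) *.wikibase.cloud subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return host in cloud_hosts or host.endswith(".wikibase.cloud")


def fetch_text(url: str) -> str:
    """Default fetcher: plain GET with a timeout."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def check_liveness(
    api_url: str, fetch: Callable[[str], str] = fetch_text
) -> LivenessRecord:
    """Ping api.php via siteinfo; any network or parse failure counts as not live."""
    probe = api_url + "?action=query&meta=siteinfo&format=json"
    try:
        data = json.loads(fetch(probe))
        live = isinstance(data, dict) and "query" in data
    except (OSError, ValueError, http.client.HTTPException):
        live = False
    return LivenessRecord(api_url, live, datetime.now(timezone.utc))
```

Passing the fetcher as a parameter keeps the check testable offline. Storing `checked_at` with each result would also let a later job distinguish "never checked" from "checked and down".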