Context
We created a spreadsheet of known instances ("Data in WBS Instances") and filled it with data manually. We are now automating the collection of much of this data, which means the spreadsheet is falling out of sync with our latest stats. Nonetheless, the spreadsheet remains the easiest place for most of the team to access this data.
Goal
Update the spreadsheet with the latest data to bring the spreadsheet and the database into parity, where feasible.
Acceptance Criteria:
- update existing columns with the latest data
- figure out which data are present in the database but missing from the spreadsheet
- add the data missing from the spreadsheet and fill it with the latest data from the database
- where needed, create new tabs to cover missing data
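For the "which data are missing" step, a minimal sketch of the comparison is below. It assumes the spreadsheet's column headers and the database's field names are both available as lists of strings; the function name `find_missing_fields` and the sample names are hypothetical, not taken from the actual sheet or database.

```python
def find_missing_fields(sheet_columns, database_fields):
    """Return database fields that have no matching spreadsheet column.

    Names are compared case-insensitively with whitespace collapsed,
    so "Total Items" and "total  items" count as the same field.
    """
    def normalize(name):
        return " ".join(name.lower().split())

    present = {normalize(column) for column in sheet_columns}
    return [field for field in database_fields if normalize(field) not in present]


# Hypothetical example data, for illustration only.
sheet_columns = ["Instance", "Total Items", "MediaWiki version"]
database_fields = ["total items", "total lexemes", "total properties", "mediawiki version"]

print(find_missing_fields(sheet_columns, database_fields))
# → ['total lexemes', 'total properties']
```

Exact-name matching will miss fields that are labelled differently in the two sources, so the result is best treated as a starting list to review by hand.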
Data to add:
- total items, lexemes, properties, triples
- MediaWiki major version, and whether it is an LTS release
- whether the SPARQL endpoint can be found (y/n)
- user counts for last month (human, bot, total log)
- connectivity stats (specifically: Total Connections, avg distance, Connectivity)
Also add the following, which will likely need their own tabs in the sheet:
- property popularity
- extension data
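
For the MediaWiki version columns listed above, one possible way to derive the major version and the LTS flag from a full version string is sketched below. The helper names are hypothetical, and the set of LTS release lines is an assumption that should be checked against the MediaWiki version lifecycle page before use.

```python
# Assumption: release lines treated as LTS. Verify against the official
# MediaWiki version lifecycle page and extend as new LTS releases appear.
ASSUMED_LTS_LINES = {"1.35", "1.39", "1.43"}


def major_version(version):
    """Reduce a full version string such as "1.39.7" to its release line, "1.39"."""
    parts = version.strip().split(".")
    return ".".join(parts[:2])


def is_lts(version, lts_lines=ASSUMED_LTS_LINES):
    """Return True if the version belongs to one of the assumed LTS release lines."""
    return major_version(version) in lts_lines


print(major_version("1.39.7"), is_lts("1.39.7"))
print(major_version("1.41.0"), is_lts("1.41.0"))
```

Keeping the LTS lines in a single constant means the flag can be recomputed for every row when the list changes, rather than being edited by hand per instance.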