MVP WBS Metrics tooling from a list of known instances
Closed, Resolved (Public)

Description

Current Situation:
We currently build a Google Sheet by manually looking up data on the Wikibase instances whose URLs we know.

Goal:
Collect and display key metrics to improve our understanding of how Wikibase Suite is being used in the community. This will support better decisions about how we invest our resources, and it will improve learning and accountability as we attempt to turn our investments into the impact we are aiming for.

Approach to Privacy

  • We will only collect data that is publicly available (e.g. through APIs/SPARQL)
  • We are collecting data from known Wikibase instances that are openly available (i.e. not made private) on the web
  • Our good-faith assumption is that this raises no privacy issues for the parties whose data is being analyzed
  • We will present the findings to the WBSG and the broader community once the scope of the engineering and analysis MVP has been met
  • We will remove any instance whose owners wish to be excluded from this analysis

Definition of done:
Create an MVP that allows us to automate the collection and analysis of publicly available data in a local sandbox.

Acceptance Criteria:

MVP scope

  • Local (the premise is that this is simpler to develop and lets us set aside any privacy concerns during the MVP phase)
  • Automation:
    • In scope: storing -> analysis -> display
    • Out of scope: Collection
  • Front-end
    • In scope: static display of analyzed data / key metrics
    • Out of scope: Explorable, in-tool analysis front-end
  • Metrics scope and approach (slices) and source
  • Have a mechanism to remove instances and delete collected data from any instances whose owners request not to be tracked (can be manual)
  • Out of scope: Storing historical data
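
The removal mechanism and the "no historical data" constraint above could be sketched against SQLite as follows. This is only a minimal illustration: the two-table schema and the names `instances`, `metrics`, `record_metric`, and `remove_instance` are hypothetical and not taken from the actual tooling.

```python
import sqlite3

# Hypothetical schema (assumption): one row per instance, one row per
# (instance, metric) pair so that no historical values are kept.
SCHEMA = """
CREATE TABLE IF NOT EXISTS instances (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS metrics (
    instance_id INTEGER NOT NULL REFERENCES instances(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (instance_id, name)
);
"""


def connect(path=":memory:"):
    """Open the database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys (and cascades) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def record_metric(conn, url, name, value):
    """Store the latest value of a metric, overwriting any previous value."""
    with conn:
        conn.execute("INSERT OR IGNORE INTO instances (url) VALUES (?)", (url,))
        (instance_id,) = conn.execute(
            "SELECT id FROM instances WHERE url = ?", (url,)
        ).fetchone()
        conn.execute(
            "INSERT OR REPLACE INTO metrics (instance_id, name, value) "
            "VALUES (?, ?, ?)",
            (instance_id, name, value),
        )


def remove_instance(conn, url):
    """Delete an instance and, via the cascade, all data collected for it."""
    with conn:
        cur = conn.execute("DELETE FROM instances WHERE url = ?", (url,))
    return cur.rowcount > 0
```

Running `remove_instance(conn, url)` by hand on request would satisfy the "can be manual" wording; it returns `False` when the URL was not being tracked.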

Specs (subject to change based on Eng discovery)

  • Data sources: individual wikibases via Action API and SPARQL endpoints
  • Database: SQLite
  • Backend: Python serving GraphQL
  • Analysis: tbd, likely Python and/or front-end
  • Front-end: tbd, consider Grafana and Google Sheets
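
As an illustration of the Action API data source, the sketch below reads basic site statistics using the standard MediaWiki `action=query&meta=siteinfo&siprop=statistics` request. The function names, the choice of fields, and the assumption that each instance's `api.php` URL is already known are illustrative only and do not describe the actual tooling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def statistics_url(api_endpoint):
    """Build the siteinfo statistics request for an instance's api.php URL."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "statistics",
        "format": "json",
    }
    return f"{api_endpoint}?{urlencode(params)}"


def parse_statistics(payload):
    """Extract a few counters from a decoded siteinfo response.

    The selection of fields is an assumption; missing fields are skipped.
    """
    stats = payload.get("query", {}).get("statistics", {})
    wanted = ("pages", "edits", "users", "activeusers")
    return {key: stats[key] for key in wanted if key in stats}


def fetch_statistics(api_endpoint, timeout=10):
    """Fetch and parse statistics from a publicly reachable instance."""
    with urlopen(statistics_url(api_endpoint), timeout=timeout) as response:
        return parse_statistics(json.load(response))
```

Keeping URL building and parsing separate from the network call lets the parsing be checked against a saved response without contacting any instance.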

Additional tasks

  • Any wishlist items not acquirable within scope are noted in this ticket
  • Note opportunities not covered in wishlist during API exploration in this ticket

Event Timeline

jon_amar-WMDE renamed this task from "MVP Action API-Based Wikibase Metrics" to "MVP WBS Metrics tooling from a list of known instances". (Jun 28 2024, 3:40 PM)
jon_amar-WMDE updated the task description.

We realized that "Front-end... static display of analyzed data / key metrics" (currently assumed to live in Google Sheets) requires a remote server to bring data into Sheets via an easily accessible API, so we are removing this spec from our scope. It will be revisited in a follow-on project.

RickiJay-WMDE changed the subtype of this task from "Spike" to "Task". (Aug 21 2024, 8:51 PM)