
Wikidata Schema
Closed, Resolved | Public | 3 Estimated Story Points

Description

As a WME Engineer, I want to understand the Product Requirements and create a final schema for the release of Wikidata on WME APIs.

Developer Notes:

As is normal in data pipelines, each step should have its own schema. Having multiple schemas decouples producers from consumers. It also lets us focus on the broader picture and future requirements rather than only the immediate need. If we decouple the EventStream listener schema for Wikidata from the final schema, we can introduce other consumers of that data in the future for better data collection and processing.

We therefore suggest having two schemas, one for the EventStream listener...that should focus on its responsibility and not be bound to the exact usage of Wikidata.

TODO

  • Gather product requirements, if not explicit in PRD and RFC.
  • Define EventStream Listener Schema
  • Define Wikidata Entity Schema
  • Define Final Wikidata Schema - Ehi to create follow-up tickets for schema; this is V1 (go with the SDK as it is now)
    • all comments in the wikidata article doc are resolved
  • Present and Discuss with team
  • Create the new Avro and Golang schemas

Acceptance Criteria

  • Team agreement on schemas (product schema and event listener schema)
  • Avro and Golang schemas merged to main in schema repository/submodule

Event Timeline

JArguello-WMF triaged this task as High priority.
JArguello-WMF added a subscriber: LDlulisa-WMF.

Hi @E.Enabulele! Can you please update the checklist with this week's progress? Thanks!

SGupta-WMF updated Other Assignee, added: E.Enabulele.
SGupta-WMF added a subscriber: E.Enabulele.
JArguello-WMF updated Other Assignee, removed: E.Enabulele.
JArguello-WMF added a subscriber: SGupta-WMF.
JArguello-WMF changed the point value for this task from 5 to 3. Sep 11 2025, 1:16 PM

Document in the decision log: the team decided to go with the current SDK.

JArguello-WMF renamed this task from Wikidata Schema to Wikidata Schema. Oct 2 2025, 12:41 PM
JArguello-WMF reassigned this task from E.Enabulele to REsquito-WMF.
JArguello-WMF moved this task from Next Up to MR on the Wikimedia Enterprise (Sprint 83) board.
JArguello-WMF moved this task from MR to In Progress on the Wikimedia Enterprise (Sprint 83) board.

@REsquito-WMF when you feel better, please reply to each comment in the schema thread of the Google Doc with either:

“Done in MR” → if the change is already included in the merge request, or

“Moved to V2” → if it’s being deferred to the next version.

Once all comments are updated that way, and the team has reviewed the schema MR, we'll be able to move this one to sign off. Thanks!

@JArguello-WMF appreciate the note...after reviewing the comments...such actions don't make sense.

Can we stop taking examples as a definite guide of what the schema will be?

This is an iterative process that requires back-and-forth...The final solution will account for all requirements.

Schema finalized after deciding the name and unit tests.