Plan Out Wikidata Ingestion
Open, Needs Triage, Public

Description

We need to plan how we want to ingest Wikidata so we can start scoping the work and breaking it into tickets.

Context:
We are exploring adding Wikidata to the streams, hourly diffs, and daily exports. Wikimedia Enterprise users use Wikidata alongside the other Wikimedia projects to baseline their knowledge graphs. The first step is to figure out how we can reliably ingest Wikidata, at its scale, into our infrastructure so that it can eventually be bundled into these data feeds.

More info: https://www.mediawiki.org/wiki/Wikimedia_Enterprise#Adding_Wikidata_to_Wikimedia_Enterprise

To Do:

  • update architecture diagram
  • create tickets in backlog
  • figure out schema

Related Objects

Status    Subtype    Assigned     Task
Open                 Cpetrillo
Open                 None

Event Timeline

AnnaMikla set the point value for this task to 8. (Jan 6 2022, 4:17 PM)
Lena.Milenko changed the task status from Open to In Progress. (Jan 20 2022, 3:16 AM)

Added more context to this description. If you have any questions, feel free to ping me.

I am also documenting much of this work on mediawiki.org.

Lena.Milenko changed the task status from In Progress to Open. (Feb 22 2022, 5:01 PM)
Protsack.stephan changed the point value for this task from 8 to 13. (May 4 2022, 2:34 PM)
prabhat subscribed.
Protsack.stephan raised the priority of this task from Medium to Needs Triage. (Oct 12 2022, 9:43 AM)