
Finalize Data platform "Publish data" landing page
Closed, ResolvedPublic

Description

Steps to publishing:

  • Complete draft
  • Share draft for DPE teams / stakeholder feedback (Deadline: April 29)
  • Feedback provided by DPE teams / stakeholders (Deadline: May 6)
  • Feedback integrated into draft
  • Draft published at https://wikitech.wikimedia.org/wiki/Data_Platform (this step will be delayed until all other landing pages are also ready to publish)

-------------------------------------------

Draft of this page:
https://wikitech.wikimedia.org/wiki/User:Triciaburmeister/Sandbox/Data_platform/Publish_data

Deadline to provide feedback: May 6
Specific feedback requested is listed on the Talk page:
https://wikitech.wikimedia.org/wiki/User_talk:Triciaburmeister/Sandbox/Data_platform/Publish_data

This page is part of a set of new landing pages for the Data Platform docs on Wikitech. See parent task T350914 for more context and links to the other landing page tracking tasks.

Event Timeline

TBurmeister triaged this task as Medium priority. Apr 1 2024, 5:33 PM
TBurmeister created this task.
TBurmeister moved this task from Backlog to Next on the Tech-Docs-Team board.

Status update: draft completed and shared with DPE teams for input and feedback.

TBurmeister moved this task from In progress to Waiting for feedback on the Tech-Docs-Team board.
TBurmeister edited subscribers, added: odimitrijevic; removed: TBurmeister.
TBurmeister edited subscribers, added: TBurmeister; removed: odimitrijevic.

Met with @mpopov to get feedback on this page; need to have another discussion and implement some content changes:

  • Turnilo is more of a tool for end-users; the primary audience of this doc (analysts and data eng team members) would be more likely to need documentation about how to get data into Druid so that end-users can interact with it via Turnilo (and Superset, but that's a separate topic)
  • The conceptual difference between "collecting data / generating new datasets" and "publishing data" is ambiguous. Need to figure out what aligns best with how the teams use the docs; maybe the "Publish data" landing page covers all types of dataset generation and curation that aren't instrumentation-specific? Maybe there should just be one landing page for creating new data sources and generating tables/reports/dashboards that are derived from existing data sources, including event data and instrumentation? Is there a useful distinction (for the docs navigation) between ad-hoc dataset and report creation, recurring/standard report or metrics generation, and "official" data pipelines that generate canonical datasets?

After a review session with DPE on 5/14, I have overhauled the page content at https://wikitech.wikimedia.org/wiki/User:Triciaburmeister/Sandbox/Data_platform/Publish_data. It now includes some of the content for "Collect data" tasks, which may end up eliminating the need for that fourth landing page altogether. I've shared the revised draft of the landing page for another round of review, with a summary of the changes. The next review session is scheduled for 5/21.

TBurmeister claimed this task.
TBurmeister updated the task description.
TBurmeister added a subscriber: odimitrijevic.

Page published at https://wikitech.wikimedia.org/wiki/Data_Platform/Transform_data. The team agreed it's okay to publish with some existing TODOs, because work on much of the data lifecycle and governance documentation is still ongoing.