Now that we are done with the experimentation phase of T330296: Dumps 2.0 Phase I: Proof of concept for MediaWiki XML content dump via Event Platform, Iceberg and Spark, we should start a proper decision record where we consult with key stakeholders and make lasting decisions.
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | xcollazo | T358877 Dumps 2.0 Phase II: Production intermediate table milestone |
| Resolved | | Ahoelzl | T358886 Decision records for Dumps 2.0 |
Event Timeline
From https://wikitech.wikimedia.org/wiki/Data_Platform_Engineering:
Where to put decision records?
This may vary by product or project. The Data Platform Engineering teams already have multiple places where these docs may be living. Some are on Wikitech at Metrics_Platform/Decision_Records, or in pages under /Evaluations. Some are on mediawiki.org at Data_Platform_Engineering/Data_Products/Decision_Records. For systems or products that already have a decision record somewhere, it may be best to continue that pattern and keep things consistent, but you should link to your decision record location from other locations where people might look for it. If you publish on Wikitech, add Category:Decision_log. Tip: the technical documentation toolkit has a Decision log template.
Well, that is quite confusing.
@Ottomata Considering Dumps 2.0 is a new(er?) project, where should we keep decision records?
It is confusing. It seems most folks are putting decision records associated with teams on mediawiki.org, rather than associated with projects on wikitech. I prefer the latter, but for some things it is hard to find the right place on wikitech. E.g. for Event Platform stuff, it's obvious that decision records about Event Platform belong there.
If we intend to make some nice comprehensive docs about Dumps 2 in wikitech, let's do it there. If not, then let's just keep it on mediawiki.org associated with the team for now.
My opinion: BIG decisions should be on a wiki. For lower level implementation decisions (that may change as a project progresses), phab descriptions are fine. We just have to be clear in phab what the options were and what was chosen. For an example of a documented small decision, see "Production cutover ideas" and the Note in T369845: [Refine Refactoring] Refine jobs should be scheduled by Airflow: deployment. Sometimes even larger decisions are okay on phab, e.g. T198256: RFC: Modern Event Platform - Choose Schema Tech.
For me, what is important is that they live somewhere they can be discovered years later (not in Google Docs) and linked to.