Data Platform Engineering Bug Report or Data Problem Form.
Please fill out the following
Please ensure you set priority
What kind of problem are you reporting?
- Access related problem
- Service related problem
- Data related problem
For a data related problem:
- Is this a data quality issue?
No, it's a data documentation/findability issue
- What datasets and/or dashboards are affected?
https://dumps.wikimedia.org/other/ and Wikipedia Clickstream
- What are the observed vs expected results? Please include information such as location of data, any initial assessments, SQL statements, screenshots.
Observed:
The Wikipedia clickstream datasets link on https://dumps.wikimedia.org/other/ currently links directly to the list of data dumps.
Problem: this makes it difficult if not impossible for a user to realize that there is documentation for that data at https://dumps.wikimedia.org/other/clickstream/readme.html.
Proposed fix:
The link on https://dumps.wikimedia.org/other/ should point to https://dumps.wikimedia.org/other/clickstream/readme.html. This is the link used on other dumps pages, like https://dumps.wikimedia.org/other/analytics/. (I don't know how to update this HTML and haven't been able to figure out what the process for doing so might be, though I did read https://wikitech.wikimedia.org/wiki/Dumps and related pages.)
For the DE Team to fill out
Which systems does this affect?
- Hive
- Druid
- Superset
- Turnilo
- WikiDumps
- Wikistats
- Airflow
- HDFS
- Gobblin
- Sqoop
- Dashiki
- DataHub
- Spark
- Jupyter
- Modern Event Platform
- Event Logging
- Other
Impact Assessment:
Does this problem qualify as an incident?
- Yes
- No
Does this violate an SLO?
- Yes
- No
Value Calculator | Rank |
---|---|
Will this improve the efficiency of a team's workflow? | 1-3 |
Does this have an effect on our Core Metrics? | 1-3 |
Does this align with our strategic goals? | 1-3 |
Is this a blocker for another team? | 1-3 |