Background
schema.wikimedia.org hosts JSON Schema definitions for Wikimedia Event Schemas. The schemas are presented as raw YAML files.
Motivation
In addition to their programmatic uses, these schemas contain documentation that needs to be read by humans, including descriptions, types, and requirements. For example, WMF product teams building instruments using the Metrics Platform need to read schema definitions to learn about the properties supported by the Metrics Platform schemas. However, the raw YAML files are not easy to read, especially for longer schemas. Attempts to make this information more accessible usually result in the information being manually duplicated in another location, creating documentation that is likely to become out of date as schemas change.
Proposal
Integrate a rendering process into the build step for schema.wikimedia.org that produces an HTML version of each schema definition, and serve those HTML pages to people browsing the site.
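To make the idea concrete, here is a minimal sketch of what such a rendering step could do for one schema. It is not a proposed implementation and does not use either of the tools listed below. It assumes the YAML file has already been parsed into a Python dict by a YAML parser. The function name render_schema_html and the example schema are hypothetical.

```python
from html import escape


def render_schema_html(schema: dict) -> str:
    """Render a parsed JSON Schema (a dict) as a small HTML fragment.

    Only top-level properties are listed; nested objects, $ref and
    other JSON Schema features are deliberately ignored in this sketch.
    """
    title = escape(str(schema.get("title", "Untitled schema")))
    description = escape(str(schema.get("description", "")))
    required = set(schema.get("required", []))

    rows = []
    for name, prop in sorted(schema.get("properties", {}).items()):
        rows.append(
            "<tr>"
            f"<td><code>{escape(name)}</code></td>"
            f"<td>{escape(str(prop.get('type', 'any')))}</td>"
            f"<td>{'yes' if name in required else 'no'}</td>"
            f"<td>{escape(str(prop.get('description', '')))}</td>"
            "</tr>"
        )

    return (
        f"<h1>{title}</h1>\n"
        f"<p>{description}</p>\n"
        "<table>\n"
        "<tr><th>Property</th><th>Type</th><th>Required</th><th>Description</th></tr>\n"
        + "\n".join(rows)
        + "\n</table>"
    )


# Hypothetical schema, used only for illustration.
example_schema = {
    "title": "example/event",
    "description": "A hypothetical schema used only for illustration.",
    "type": "object",
    "required": ["meta"],
    "properties": {
        "meta": {"type": "object", "description": "Event metadata."},
        "action": {"type": "string", "description": "Name of the <action> performed."},
    },
}

page = render_schema_html(example_schema)
print(page)
```

A dedicated tool would handle much more of JSON Schema than this (nested properties, references, enums, examples), which is the main reason to adopt one rather than maintain a custom renderer.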
Tooling options
json-schema-for-humans
- https://github.com/coveooss/json-schema-for-humans
- Python tool
- Generates HTML or Markdown
- 46 contributors, last commit 2 months ago as of October 9, 2024
jsonschema2md
- https://github.com/adobe/jsonschema2md
- JavaScript tool
- Generates Markdown
- 47 contributors, last commit 2 days ago as of October 9, 2024
[Feel free to add other tools]
Considerations
- If we choose a tool that generates Markdown instead of HTML, we'll need an additional step to render the Markdown into HTML.
- How should the URL paths be organized?
- How do we link from the HTML version to the raw version?
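As one possible answer to the last two questions, the sketch below places each HTML page next to its raw file, swaps the extension, and includes a link back to the raw version. This is only an illustration of one option, not a decision. The example paths, the function names html_path_for and raw_link_for, and the link text are all hypothetical.

```python
from html import escape

RAW_SUFFIX = ".yaml"
HTML_SUFFIX = ".html"


def html_path_for(raw_path: str) -> str:
    """Return the path of the HTML page that sits beside a raw schema file."""
    # Strip the raw suffix explicitly instead of using pathlib's with_suffix,
    # which would misread a version such as "1.0.0" as having suffix ".0".
    if raw_path.endswith(RAW_SUFFIX):
        stem = raw_path[: -len(RAW_SUFFIX)]
    else:
        stem = raw_path
    return stem + HTML_SUFFIX


def raw_link_for(raw_path: str) -> str:
    """Return an HTML anchor pointing from the rendered page to the raw file."""
    href = escape(raw_path, quote=True)
    return f'<a href="{href}">View raw YAML</a>'


# Hypothetical path, used only for illustration.
raw = "/example/schema/1.0.0.yaml"
print(html_path_for(raw))
print(raw_link_for(raw))
```

Keeping the rendered page beside the raw file means existing raw URLs keep working unchanged, and the mapping between the two versions stays purely mechanical.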
Alternatives
See T372680: Investigate schema visualization tools for schema.wikimedia.org for alternative approaches.