Ticket proliferation disambiguation!
- T185233: Modern Event Platform is the overall Modern Event Platform parent ticket.
- T201063: Modern Event Platform: Schema Repositories is the parent Event Schema Registry task. It describes the high-level requirements and user stories for this component.
- T201643: RFC: Modern Event Platform: Schema Registry is the RFC ticket. It will hopefully be closed once the RFC process finishes.
This ticket will be used to track and task implementation work for the Schema Registry.
Description
Since we are moving forward with git as the canonical storage of schemas, we can base the implementation work for Q2 2018-2019 on the existing event-schemas repository. This repository currently contains Draft 4 JSON schemas, with some minimal CI jobs to ensure schema consistency. Implementation work for this task will mostly be around git commit/merge hooks and CI improvements.
We also may want to build an HTTP service to serve schemas. If so, this service might be as simple as just an HTTP file server that exposes the git repository (or repositories) hierarchy and schemas.
In either case, schemas will always be addressable via URIs, whether those schemas are checked out on the local filesystem (file://) or via HTTP (http://).
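As a rough illustration of URI-addressable schemas, the sketch below resolves a relative schema URI against a base URI and loads it the same way for file:// and http://. The function names and the example paths are hypothetical, and it assumes the schema file is JSON (YAML would need a third-party parser).

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def schema_url(base_uri: str, schema_uri: str) -> str:
    """Join a base URI (file:// or http://) with a relative schema URI."""
    # urljoin replaces the last path segment unless the base ends with "/".
    base = base_uri if base_uri.endswith("/") else base_uri + "/"
    return urljoin(base, schema_uri.lstrip("/"))


def load_schema(base_uri: str, schema_uri: str) -> dict:
    """Fetch and parse a schema; urlopen handles both file:// and http://."""
    with urlopen(schema_url(base_uri, schema_uri)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With this shape, a client only needs to swap the base URI to move between a local checkout and a remote schema service.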
Technical Requirements
- Up-to-date JSON Schema support (Draft 7?)
- All schema versions maintained in HEAD commit (we won't be using git history to version schemas)
- CI for ensuring schema backwards compatibility
- CI for schema linting, e.g. no camelCase, only snake_case, etc.
- CI for schema field annotations (dimension vs measure, PII, etc.)
- 'latest' schema version is editable and changes to it are reviewable using usual git review tools - T206812
- Post commit or merge git hooks to create new versioned file copies of schemas - T206812
- Schemas can be in YAML or JSON format, but files should not have file extensions so relative schema_uris don't need to include (or append) a proper file extension - T206812
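The linting requirement above could be sketched roughly as follows. This is only an illustration of a snake_case check on property names, not the existing CI job; the function name and the exact naming regex are assumptions.

```python
import re

# Assumed rule: lowercase words and digits separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def lint_property_names(schema: dict, path: str = "") -> list:
    """Return dotted paths of property names that are not snake_case."""
    violations = []
    for name, subschema in schema.get("properties", {}).items():
        field_path = f"{path}.{name}" if path else name
        if not SNAKE_CASE.match(name):
            violations.append(field_path)
        if isinstance(subschema, dict):
            # Recurse into nested object schemas.
            violations.extend(lint_property_names(subschema, field_path))
    items = schema.get("items")
    if isinstance(items, dict):
        # Recurse into array item schemas, marking the path with "[]".
        violations.extend(lint_property_names(items, path + "[]"))
    return violations
```

A CI job could fail the build whenever this returns a non-empty list for any schema in the repository.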
Other ideas
On 2018-10-12, @Pchelolo and @Ottomata brainstormed implementation ideas. Much of the implementation work to be done is around CI and development workflows. Some of this is already done for mediawiki/event-schemas, but we need to do more. I'll try and collect some of the things we need to implement.
- editing of schemas should be done to the current schema version.
- JSON $ref pointers can be used only in the current schema version.
- $ref pointers to other schemas must be strongly versioned. E.g. if we factor out the meta schema, every event that uses it will point to a specific version of meta, e.g. meta/1.0.0, or meta/1.2.0.
- versioned $ref pointers in schemas must be manually upgraded by editing the schema and creating a new schema version.
- This will ensure that any changes to referenced schemas will not affect user schemas until they manually update the referenced version. (This is how dependencies normally work anyway.)
- git hooks will dereference current to generate standalone explicitly committed versioned schema files.
- schema version number is manually modified and set in current's $id field.
- if only a code comment or description field changes in the current schema, don't generate a new schema version.
- a backwards compatibility library ensures changes are backwards compatible, both in the git hook and in CI.
- Schemas are versioned with semver.
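A minimal sketch of the hook step that turns current into a versioned file, under several assumptions: the schema directory holds a JSON file named current, the version is the last path segment of $id (e.g. /test/event/1.0.0), and all function names are hypothetical. Dereferencing $ref pointers and the backwards compatibility check are left out.

```python
import json
from pathlib import Path


def parse_semver(version: str) -> tuple:
    """Parse MAJOR.MINOR.PATCH; raises ValueError on anything else."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def version_from_id(schema: dict) -> str:
    """Assumes $id ends with the version, e.g. /test/event/1.0.0."""
    return schema["$id"].rstrip("/").rsplit("/", 1)[-1]


def without_docs(node, in_properties=False):
    """Drop 'description' keywords so doc-only edits compare equal.

    Keys directly inside a 'properties' map are field names, so a field
    that happens to be called 'description' is kept.
    """
    if isinstance(node, dict):
        return {
            key: without_docs(value, key == "properties" and not in_properties)
            for key, value in node.items()
            if in_properties or key != "description"
        }
    if isinstance(node, list):
        return [without_docs(value) for value in node]
    return node


def materialize(schema_dir) -> str:
    """Write current to a file named after its version, if needed."""
    schema_dir = Path(schema_dir)
    current = json.loads((schema_dir / "current").read_text())
    version = version_from_id(current)
    parse_semver(version)  # validate only
    target = schema_dir / version  # no file extension, as required above
    if not target.exists():
        target.write_text(json.dumps(current, indent=2, sort_keys=True) + "\n")
        return "created"
    existing = json.loads(target.read_text())
    if without_docs(existing) == without_docs(current):
        return "unchanged"  # only descriptions differ (or nothing does)
    raise ValueError(f"{target} exists with different content; bump the version in $id")
```

Raising when the content changed but the version did not forces the author to bump $id by hand, which matches the manual versioning rule above.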