Create and use new schema repositories
Closed, Resolved, Public

Description

In T206789: Modern Event Platform: Schema Registry: Implementation, we decided to create two new schema repositories: schemas/event/primary and schemas/event/secondary. We will move schemas from the existing mediawiki/event-schemas repository, adopt any new conventions (like namespacing), and use these repositories in services (EventGate, Refinery, etc.), schema.wikimedia.org, and elsewhere.

Details

Repo | Branch | Lines +/-
operations/puppet | production | +14 -31
operations/puppet | production | +3 -4
operations/deployment-charts | master | +22 -13
operations/deployment-charts | master | +16 -9
operations/deployment-charts | master | +5 -5
operations/deployment-charts | master | +14 -4
operations/deployment-charts | master | +3 -3
schemas/event/secondary | master | +170 -6
eventgate-wikimedia | master | +1 -1
operations/deployment-charts | master | +1 -1
operations/deployment-charts | master | +0 -0
operations/deployment-charts | master | +134 -118
operations/deployment-charts | master | +1 -1
eventgate-wikimedia | master | +6 -3
schemas/event/primary | master | +9 K -10
schemas/event/secondary | master | +359 -0
schemas/event/secondary | master | +1 K -0
integration/config | master | +10 -0
schemas/event/primary | master | +371 -0

Event Timeline

Change 558651 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/primary@master] Add wikimedia/common and mediawiki/client/error schemas

https://gerrit.wikimedia.org/r/558651

In https://phabricator.wikimedia.org/T206789#5678442 and below we discussed adding some namespacing to schema conventions. This would help avoid conflicts in schema titles between the different schema repositories, but more importantly, it will add helpful context and clarify ownership of namespaces. A product analyst might feel freer to make changes to schemas in /wikimedia/analytics, or even /wikimedia/analytics/product (?), than they would in /mediawiki.

I think it will be really hard to come up with consistent namespacing rules for everything. In https://gerrit.wikimedia.org/r/c/schemas/event/primary/+/558651 I'm making a first attempt using the client/error schema from mediawiki/event-schemas.

I know that not all clients that use this error schema will be mediawiki. Perhaps /wikimedia/client/error is a better namespace for it? I'm not sure.

I'd possibly even put the mediawiki schemas under /wikimedia too, but they currently exist under just /mediawiki, and it will be nice not to have to migrate all the existing ones to new schema names.
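To make the namespacing idea concrete, here is a minimal Python sketch of how a namespaced schema title could be split into its namespace and name, and how title conflicts between two repositories could be detected. The function names are hypothetical and not part of any existing tooling; the titles are the ones mentioned in this task.

```python
def split_schema_title(title):
    """Split a namespaced schema title into (namespace, name).

    Leading and trailing slashes are ignored, so "/wikimedia/client/error"
    and "wikimedia/client/error" are treated the same way.
    """
    parts = title.strip("/").split("/")
    return "/".join(parts[:-1]), parts[-1]


def find_title_conflicts(primary_titles, secondary_titles):
    """Return the schema titles that appear in both repositories, sorted."""
    return sorted(set(primary_titles) & set(secondary_titles))


if __name__ == "__main__":
    print(split_schema_title("mediawiki/client/error"))
    print(find_title_conflicts(["error", "mediawiki/client/error"], ["sparql/query"]))
```

A title without any namespace, such as the existing `error` schema, yields an empty namespace, which is exactly the kind of title that is most likely to collide across repositories.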

Thoughts @jlinehan ?

fdans moved this task from Incoming to Event Platform on the Analytics board.

Change 558651 merged by Ottomata:
[schemas/event/primary@master] Add wikimedia/common and mediawiki/client/error schemas

https://gerrit.wikimedia.org/r/558651

Change 562340 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/primary@master] Copy schemas from mediawiki/event-schemas that belong in schemas/event/primary repo

https://gerrit.wikimedia.org/r/562340

Change 562341 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/secondary@master] New schemas/event/secondary jsonschema repository

https://gerrit.wikimedia.org/r/562341

Change 562343 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[integration/config@master] Add CI for new event schema repositories

https://gerrit.wikimedia.org/r/562343

Change 562343 merged by jenkins-bot:
[integration/config@master] Add CI for new event schema repositories

https://gerrit.wikimedia.org/r/562343

Change 562341 merged by Ottomata:
[schemas/event/secondary@master] New schemas/event/secondary jsonschema repository

https://gerrit.wikimedia.org/r/562341

Mentioned in SAL (#wikimedia-releng) [2020-01-06T19:45:24Z] <James_F> Zuul: Add CI for new event schema repos T240985

Change 562360 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/secondary@master] Add sparql/query and swift/upload schemas from mediawiki/event-schemas

https://gerrit.wikimedia.org/r/562360

I've just pushed up two changes to copy (and rematerialize) schemas out of mediawiki/event-schemas into the primary and secondary repositories.

schemas moved into schemas/event/primary:

  • change-prop/*
  • error
  • mediawiki/*
  • resource_change
  • test/event

schemas moved into schemas/event/secondary:

  • sparql/query
  • swift/upload

This should cover it! If we merge these, we should be able to rebuild the eventgate-wikimedia image with these repos and configure the instances to use them accordingly.
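The split listed above can be sketched as a small lookup. This is only an illustration of the mapping described in this comment, not the script that was used for the move; `target_repo` is a hypothetical helper name, and the patterns are copied from the two lists.

```python
from fnmatch import fnmatchcase

# Patterns taken from the lists in this comment.
PRIMARY_PATTERNS = ["change-prop/*", "error", "mediawiki/*", "resource_change", "test/event"]
SECONDARY_PATTERNS = ["sparql/query", "swift/upload"]


def target_repo(schema_path):
    """Return the repository a schema path from mediawiki/event-schemas moves to.

    Returns None if the path matches neither list.
    """
    if any(fnmatchcase(schema_path, pattern) for pattern in PRIMARY_PATTERNS):
        return "schemas/event/primary"
    if any(fnmatchcase(schema_path, pattern) for pattern in SECONDARY_PATTERNS):
        return "schemas/event/secondary"
    return None


if __name__ == "__main__":
    for path in ["mediawiki/revision/create", "sparql/query", "error"]:
        print(path, "->", target_repo(path))
```

Note that `*` in `fnmatchcase` also matches `/`, so `mediawiki/*` covers nested schemas such as `mediawiki/revision/create` (used here only as an example path).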

Change 562360 merged by Ottomata:
[schemas/event/secondary@master] Add sparql/query and swift/upload schemas from mediawiki/event-schemas

https://gerrit.wikimedia.org/r/562360

Change 562340 merged by Ottomata:
[schemas/event/primary@master] Copy schemas from mediawiki/event-schemas that belong in schemas/event/primary repo

https://gerrit.wikimedia.org/r/562340

Change 562370 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[eventgate-wikimedia@master] Use new primary and secondary schema repositories instead of mediawiki/event-schemas

https://gerrit.wikimedia.org/r/562370

Change 562370 merged by Ottomata:
[eventgate-wikimedia@master] Use new primary and secondary schema repositories instead of mediawiki/event-schemas

https://gerrit.wikimedia.org/r/562370

Change 562623 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] Use new schemas/event/{primary,secondary} in staging eventgate services

https://gerrit.wikimedia.org/r/562623

Change 562623 merged by Ottomata:
[operations/deployment-charts@master] Use new schemas/event/{primary,secondary} in staging eventgate services

https://gerrit.wikimedia.org/r/562623

Change 562906 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] staging/eventgate-logging-external - fix name of client error schema to precache

https://gerrit.wikimedia.org/r/562906

Change 562906 merged by Ottomata:
[operations/deployment-charts@master] staging/eventgate-logging-external - fix name of client error schema to precache

https://gerrit.wikimedia.org/r/562906

Change 562908 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate - use new primary schema repository by default

https://gerrit.wikimedia.org/r/562908

Change 562908 merged by Ottomata:
[operations/deployment-charts@master] eventgate - use new primary schema repository by default

https://gerrit.wikimedia.org/r/562908

Change 562909 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] Add missing eventgate-0.0.17.tgz

https://gerrit.wikimedia.org/r/562909

Change 562909 merged by Ottomata:
[operations/deployment-charts@master] Add missing eventgate-0.0.17.tgz

https://gerrit.wikimedia.org/r/562909

Change 562930 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-logging-external - use proper schema_title name for mediawiki.client.error stream

https://gerrit.wikimedia.org/r/562930

Change 562930 merged by Ottomata:
[operations/deployment-charts@master] eventgate-logging-external - use proper schema_title name for mediawiki.client.error stream

https://gerrit.wikimedia.org/r/562930

Change 562951 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[schemas/event/secondary@master] Use schema.wikimedia.org to resolve primary schemas and re-materialize

https://gerrit.wikimedia.org/r/562951

Change 562951 merged by Ottomata:
[schemas/event/secondary@master] Use schema.wikimedia.org to resolve primary schemas and re-materialize

https://gerrit.wikimedia.org/r/562951

Change 562954 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[eventgate-wikimedia@master] Update seceondary schema repo to a2edfc5

https://gerrit.wikimedia.org/r/562954

Change 562954 merged by Ottomata:
[eventgate-wikimedia@master] Update seceondary schema repo to a2edfc5

https://gerrit.wikimedia.org/r/562954

Change 562962 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate - Bump staging services image version

https://gerrit.wikimedia.org/r/562962

Change 562962 merged by Ottomata:
[operations/deployment-charts@master] eventgate - Bump staging services image version

https://gerrit.wikimedia.org/r/562962

Ok, new images and configs deployed to all staging instances. I tested POSTing an example of each event type that each instance accepts. Should be good to go to prod!
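For anyone wanting to repeat this kind of smoke test, here is a minimal Python sketch that builds (but does not send) a POST request carrying one example event. Everything specific in it is an assumption for illustration rather than something stated in this task: the `/v1/events` path, the `$schema` and `meta.stream` field names, the local URL, and the schema version.

```python
import json
import urllib.request


def build_event_request(base_url, schema_uri, stream, extra_fields=None):
    """Build a POST request with a single example event in a JSON array.

    The endpoint path and the event field names are assumptions; adjust
    them to match the instance being tested.
    """
    event = {"$schema": schema_uri, "meta": {"stream": stream}}
    event.update(extra_fields or {})
    body = json.dumps([event]).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/events",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical local instance and schema version.
    request = build_event_request(
        "http://localhost:8192", "/test/event/1.0.0", "test.event", {"test": "hello"}
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to loop over one example per schema and inspect the payloads before any of them hit a live instance.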

Change 563231 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] Use new primary schema repo for eventgate-logging-external

https://gerrit.wikimedia.org/r/563231

Change 563231 merged by Ottomata:
[operations/deployment-charts@master] Use new primary schema repo for eventgate-logging-external

https://gerrit.wikimedia.org/r/563231

Ok, I've fully deployed usage of the new schema repos to eventgate-logging-external, but only to staging for eventgate-main and eventgate-analytics. @akosiaris is working on live canary release functionality in our helm charts. I'd like to be able to deploy this change to a single pod in the active services to be more confident that nothing will break! Even though I tested all the event schemas I could, I'd still prefer to serve a fraction of real traffic before blowing away the old schema repo.
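As a rough way to reason about how much real traffic a single canary pod would see, here is a small Python sketch. It assumes requests are spread evenly across all pods behind the service, which may not hold in practice; the pod counts are made up for illustration.

```python
def canary_traffic_fraction(canary_pods, stable_pods):
    """Estimate the share of requests served by canary pods.

    Assumes every pod receives an equal share of the traffic.
    """
    total = canary_pods + stable_pods
    if canary_pods < 0 or stable_pods < 0 or total == 0:
        raise ValueError("pod counts must be non-negative and not both zero")
    return canary_pods / total


if __name__ == "__main__":
    # Hypothetical example: 1 canary pod next to 9 pods on the old config.
    print(canary_traffic_fraction(1, 9))
```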

I'll wait until the helm canary support is figured out before proceeding. In the meantime I'm developing eventgate-analytics-external over in T233629 and will use these new schema repos there.

I've gone ahead and added some basic canary support in the mathoid chart (since this is the one I feel most familiar with). https://gerrit.wikimedia.org/r/#/c/operations/deployment-charts/+/469662/ contains the changes, and https://gerrit.wikimedia.org/r/#/c/operations/deployment-charts/+/563409/ enables it for just codfw. A simple dashboard at https://grafana.wikimedia.org/d/JTnxOdEZk/xxx-mathoid-canary?orgId=1 shows that it is working as expected. I intend to leave it be for a couple of days to make sure no issues arise from this approach, then follow the path that @Joe carved using common_templates in charts and add support for it to all charts, including eventgate. So probably by the end of next week we will be able to deploy an eventgate (which one I'll leave to @Ottomata) with canaries enabled.

Change 563545 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate - bump image versions to use latest schema repo versions

https://gerrit.wikimedia.org/r/563545

Change 563545 merged by Ottomata:
[operations/deployment-charts@master] eventgate - bump image versions to use latest schema repo versions

https://gerrit.wikimedia.org/r/563545

Change 572900 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-analytics Use primary and secondary schema repos

https://gerrit.wikimedia.org/r/572900

Change 572900 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics Use primary and secondary schema repos

https://gerrit.wikimedia.org/r/572900

Change 586356 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] refine - look for schemas both primary and secondary schema repositories

https://gerrit.wikimedia.org/r/586356

Change 586356 merged by Ottomata:
[operations/puppet@production] refine - look for schemas both primary and secondary schema repositories

https://gerrit.wikimedia.org/r/586356
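The idea of looking for a schema in both repositories can be sketched as an ordered fallback lookup. This is not Refine's actual implementation, only a minimal Python illustration over local checkouts; `find_schema_file` and the directory paths are hypothetical.

```python
from pathlib import Path


def find_schema_file(relative_path, repo_roots):
    """Return the first existing schema file among the given repository roots.

    Roots are tried in order, so earlier entries (e.g. primary) take
    precedence over later ones (e.g. secondary). Returns None if no
    root contains the file.
    """
    for root in repo_roots:
        candidate = Path(root) / relative_path
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical checkout locations.
    roots = ["/srv/schemas/event/primary", "/srv/schemas/event/secondary"]
    print(find_schema_file("sparql/query/latest.yaml", roots))
```

Trying the roots in a fixed order keeps the result deterministic if a schema title ever exists in both repositories.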

Change 587255 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Remove now unused mediawiki/event-schemas repo

https://gerrit.wikimedia.org/r/587255

Change 587255 merged by Ottomata:
[operations/puppet@production] Remove now unused mediawiki/event-schemas repo

https://gerrit.wikimedia.org/r/587255