Page MenuHomePhabricator

Users should run explicit commands to materialize schema versions, rather than using magic git hooks
Closed, ResolvedPublic

Description

From @Krinkle

It took me quite a while to figure what file should (not) be modified,
what command to run to generate the other files, then to undo those
changes after realizing it had overwritten the previous versions, etc.

I was expecting that jsonschema-tools would find all current.yaml
files, and expand them all (at least any that are new or changed).
But, it seems this is not the case. The materialize command is for
one-offs only, and materialize-modified only works for changing files
that already existed (plus needs -G to suppress a security warning).
For new files, one first needs to stage them with 'git add' and then
run the materialize-modified with -s to expand it.

It seems with the current tooling, the quickest way involving the least
steps for new files is to pass the file to jsonschema-tools materialize
(instead of passing it to git-add and then run another command, and
then another git-add).


If I run this [the npm postinstall hook that installs the jsonschema-tools git hook] in fresh-node or one of the various container or VM dev envs we have, the hook will be rejected. If I run partly inside (to install) and then partly outside (for the commit), then the latter will either fail due to npm not being present or will essentially escape the sandbox and run lots of unaudited dev code from npm on the host machine with full disk, ssh, gpg, Git, and web browser access.

Having the script work standalone means that those without prod access or using secondary devices for this can continue to run it bare, and those with, can run "npm install" (once) and "npm run build" from inside their dev env, and then add/commit from their host as normal.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Change 714875 had a related patch set uploaded (by Ottomata; author: Krinkle):

[schemas/event/secondary@master] build: Document simpler alternative contribution flow

https://gerrit.wikimedia.org/r/714875

@Krinkle's patch does the following

  • npm run build - this works right away after cloning the repo and modifying an existing file. But, it does not see new files.
  • npm run build-new path/to/new/current.yaml - for new files, which works right after cloning the repo and creating the subdir+file, without needing to one of the files first.

If the general direction is liked, we could improve jsonschema-tools
to support this with one command. For example, the default mode
of jsonschema-tools materialize could be to look process all
current.yaml files in the configured schemaBasePath regardless of
Git state, and expand any that differ from the latest version on disk.


I've created this ticket to discuss with users of the schema repos. Should we remove the git hook magic? Is it better to use explicit build commands to materialize schemas?

odimitrijevic moved this task from Incoming to Event Platform on the Analytics board.

I've created this ticket to discuss with users of the schema repos. Should we remove the git hook magic? Is it better to use explicit build commands to materialize schemas?

I'm new to schema repos, but I think that POV might be helpful to decide, so here I go. For both cases where I needed to change something there, I always ran the command manually -- to be able to review it without making a commit. Sometimes there's an error in the syntax (with YAML and spacing it happens fairly quickly), and the feeling "I control what my terminal is doing" is also helpful for me.

So, +1 to removing git hook magic from me. If there are users interested in it, they can always install it manually anyway.

@Ottomata I would be in favor of a single npm run build command and removing the git hook magic to make schema materialization process more explicit.

Nice bump! :) Been thinking about how I haven't moved here. I want to do this but am worried that it will take more coordination than I have time for atm. The change isn't hard, but the documentation and publicizing of the change (as well as making sure people upgrade their locally installed jsonschema-tools) should be done well.

If anyone has time to help me with that, I'm all for it!

The change isn't hard, but the documentation and publicizing of the change (as well as making sure people upgrade their locally installed jsonschema-tools) should be done well.

My linked patch implements the needed changes and updates documentation. Afaik no immediate update of the jsonschema-tools package is needed within the scope of this change. The package is used as a local dev dependency currently, not a global/unversioned install on the author's host machine.

I'm happy to help with anything else needed.

Ah right! And, I guess this won't remove any previously added .git hooks in people's checkouts, so it won't actually affect their workflows unless they re-clone the secondary repo.

We should also do this in the primary repo then too.

Let's see, when the next PA sync is this Wednesday. I'll run this by those folks then.

Okay, PAs are okay with this. TODO

  • look in git history for schema committers and communicate this change
  • give 2 weeks for objections
  • update documentation
  • merge, and also apply same change to schemas/event/primary.

One question about the new proposed development workflow. Will npm build be required with every commit? Or will we be allowed to merge commits without it, and build only when we want to "release"? As I understand it, we "release" whenever we merge to the schema repositories. If so, does this new workflow create a potential for human error in merging a set of changes without a build?

Yes, but hopefully the CI will not allow you to merge if make changes without building.

If I remember right npm test fails if the updated current.yaml's have been materialized or if the versions don't match, so I think Andrew's right.

Maybe instead of letting CI do it, we can let the git hook still run and warn that an npm build is needed instead of just doing it?

Change 714875 merged by Ottomata:

[schemas/event/secondary@master] build: Document simpler alternative contribution flow

https://gerrit.wikimedia.org/r/714875

Change 772402 had a related patch set uploaded (by Ottomata; author: Ottomata):

[schemas/event/secondary@master] Use --no-git-add in npm run build-modified script

https://gerrit.wikimedia.org/r/772402

Change 772402 merged by jenkins-bot:

[schemas/event/secondary@master] Use --no-git-add in npm run build-modified script

https://gerrit.wikimedia.org/r/772402

Change 772418 had a related patch set uploaded (by Ottomata; author: Ottomata):

[schemas/event/primary@master] Change materialization method and update readme to match schemas/event/secondary

https://gerrit.wikimedia.org/r/772418

Change 772418 merged by jenkins-bot:

[schemas/event/primary@master] Change materialization method and update readme to match schemas/event/secondary

https://gerrit.wikimedia.org/r/772418

Alright, done! I've updated wikitech documentation as well.