
[jobs-api] Save business models in a DB
Open, High, Public

Description

This could be:

  • A new database (would need some kind of store, or be in trove)
  • As a custom resource in k8s (that would be using k8s/etcd as database)
    • This should be read-only for users, as we want to only modify it through the API (that way we don't need admission controllers or controllers at all)
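The read-only access for users mentioned above could be expressed as a namespaced RBAC Role that grants only read verbs. This is a minimal sketch, assuming the API group `jobs-api.toolforge.org` and the plural `toolforge-scheduled-jobs` from the CRD draft in this description; the role name and the helper `read_only_role` are hypothetical and not part of jobs-api.

```python
from typing import Any, Dict

# Assumed from the CRD draft in this task description.
API_GROUP = "jobs-api.toolforge.org"
PLURAL = "toolforge-scheduled-jobs"


def read_only_role(namespace: str) -> Dict[str, Any]:
    """Build a Role manifest that lets tool users read, but not modify,
    the custom resources in their own namespace."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {
            # Hypothetical name, chosen for illustration only.
            "name": "scheduled-jobs-read-only",
            "namespace": namespace,
        },
        "rules": [
            {
                "apiGroups": [API_GROUP],
                "resources": [PLURAL],
                # No create/update/patch/delete: writes go through the API only.
                "verbs": ["get", "list", "watch"],
            }
        ],
    }


role = read_only_role("tool-tf-test")
```

The API's own service account would need a separate role with write verbs; that part is not sketched here.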

More questions and likely answers (feel free to edit the below section if you have other opinions or ideas)

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # name must match the spec fields below, and be in the form: <plural>.<group>
  name: toolforge-scheduled-jobs.jobs-api.toolforge.org
spec:
  # group name to use for REST API: /apis/<group>/<version>
  group: jobs-api.toolforge.org
  # list of versions supported by this CustomResourceDefinition
  versions:
    - name: v1
      # Each version can be enabled/disabled by Served flag.
      served: true
      # One and only one version must be marked as the storage version.
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                cmd:
                  type: string
                cpu:
                  type: string
                ...

  # either Namespaced or Cluster
  scope: Namespaced
  names:
    # plural name to be used in the URL: /apis/<group>/<version>/<plural>
    plural: toolforge-scheduled-jobs
    # singular name to be used as an alias on the CLI and for display
    singular: toolforge-scheduled-job
    # kind is normally the CamelCased singular type. Your resource manifests use this.
    kind: ToolforgeScheduledJob
    # shortNames allow shorter string to match your resource on the CLI
    shortNames:
    - tsj
  • in what namespace should the database be?
    • Each tool's namespace

For example, inside tool-tf-test namespace there would be a bunch of ToolforgeScheduledJob resources defining each scheduled job.
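As a rough illustration of what one such resource might look like, the sketch below builds a manifest as a plain Python dict. It assumes the group, version and kind from the CRD draft, uses only the two spec fields that draft spells out (`cmd` and `cpu`), and assumes the namespace is the tool name prefixed with `tool-`. The helper `scheduled_job_manifest` and the sample values are hypothetical.

```python
from typing import Any, Dict


def scheduled_job_manifest(tool: str, job_name: str, cmd: str, cpu: str) -> Dict[str, Any]:
    """Build a ToolforgeScheduledJob manifest for one job of one tool."""
    return {
        # <group>/<version> as declared in the CRD draft.
        "apiVersion": "jobs-api.toolforge.org/v1",
        "kind": "ToolforgeScheduledJob",
        "metadata": {
            "name": job_name,
            # Assumption: each tool's namespace is "tool-<tool-name>".
            "namespace": f"tool-{tool}",
        },
        # Only the fields listed explicitly in the draft schema; the
        # remaining fields are elided there and are left out here too.
        "spec": {"cmd": cmd, "cpu": cpu},
    }


manifest = scheduled_job_manifest("tf-test", "daily-report", "./report.sh", "500m")
```

Because the command lives in `spec.cmd` as an ordinary string field, it is not subject to the character restrictions that apply to label values.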

  • what are we putting in this database?
    • All the information needed to rebuild the user's jobs (that is, what we currently keep in labels plus anything else needed to start the job; it does not include status, such as whether the job is running or stopped)
    • (component config) - dc: @Raymond_Ndibe what do you mean with this?
  • possible paths? -- dc: @Raymond_Ndibe what do you mean paths? The paths on the API side don't change.
    • /toolforge/<tool-name>/jobs-api/<job-name>/name {'unique-id' 'created-by' 'version' 'type' 'name' 'imagename' 'cmd' 'emails' 'retry' 'mount' 'continuous' 'filelog'} (or some variations of this)
    • (/toolforge/<tool-name>/components/config {<component config goes here>})
  • what do we do in case of path migration?
    • if we decide to change the path from /toolforge/<tool-name>/jobs-api/<job-name>/name to say /toolforge/<tool-name>/component/<component-name>/jobs-api/<job-name>/name, we'll need to manually write a script that will migrate the paths and data
  • how do we ensure that the database entries are in sync with the kubernetes objects they represent?
    • we don't. If someone manually changes the underlying kubernetes objects of a job, something will probably go wrong. For this reason the database should only be editable by the APIs and otherwise be read-only (haven't thought about how to enforce that yet). If I am not mistaken, this is also more or less the current situation
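The path migration mentioned above could look roughly like the following. This is only a sketch, assuming the store behaves like a flat key/value mapping and that every job of a tool moves under a single component; `migrate_key` and `migrate_store` are hypothetical helpers, not existing jobs-api code.

```python
import re
from typing import Any, Dict

# Old layout: /toolforge/<tool-name>/jobs-api/<job-name>/name
OLD_KEY = re.compile(r"/toolforge/(?P<tool>[^/]+)/jobs-api/(?P<job>[^/]+)/name")


def migrate_key(old_key: str, component: str) -> str:
    """Rewrite one old-style key to the component-scoped layout."""
    match = OLD_KEY.fullmatch(old_key)
    if match is None:
        raise ValueError(f"not an old-style job key: {old_key!r}")
    return "/toolforge/{tool}/component/{component}/jobs-api/{job}/name".format(
        component=component, **match.groupdict()
    )


def migrate_store(store: Dict[str, Any], component: str) -> Dict[str, Any]:
    """Return a copy of the store with old-style job keys rewritten.

    Keys that do not match the old layout are copied unchanged.
    """
    migrated: Dict[str, Any] = {}
    for key, value in store.items():
        if OLD_KEY.fullmatch(key):
            migrated[migrate_key(key, component)] = value
        else:
            migrated[key] = value
    return migrated


old = {"/toolforge/tf-test/jobs-api/myjob/name": {"cmd": "./run.sh"}}
new = migrate_store(old, "mycomponent")
```

Returning a new mapping instead of editing in place keeps the old data available until the migrated copy has been checked.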

Event Timeline

dcaro triaged this task as High priority. Mar 8 2024, 4:59 PM
dcaro created this task.

Beware, label values and similar have limitations on what characters they can store.
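To make the limitation concrete: a Kubernetes label value must be at most 63 characters, may be empty, and otherwise has to start and end with an alphanumeric character, with only dashes, underscores and dots allowed in between. A minimal sketch of a check follows; the helper name `is_valid_label_value` is hypothetical.

```python
import re

MAX_LABEL_VALUE_LENGTH = 63

# Empty, or alphanumeric at both ends with [-_.] and alphanumerics in between.
LABEL_VALUE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value: str) -> bool:
    """Return True if the string can be stored as a Kubernetes label value."""
    return (
        len(value) <= MAX_LABEL_VALUE_LENGTH
        and LABEL_VALUE.fullmatch(value) is not None
    )


# A typical user command contains spaces and slashes, so it cannot be
# stored verbatim in a label.
command_fits = is_valid_label_value("./myscript.sh --verbose")
```

Most real commands fail this check, so they would need encoding or a different storage location.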

In the past, I evaluated using something like https://www.crossplane.io/ to create a custom resource for the jobs. But in the end it felt like just moving the abstraction elsewhere. You need to have the translation logic somewhere.

I remember crossplane. We don't really need a whole new framework, just a database to store the actual abstractions we have (that's what the custom resource on k8s would become), instead of mixing them up with the k8s native ones.

dcaro renamed this task from [jobs-api] Store user specified command in a label or similar to [jobs-api] Save business models in a DB. Mar 11 2024, 11:43 AM
dcaro updated the task description.

I would be happy to talk about this re-architecture idea. I can share a bit more info about what I tested in the past, and what architecture I had in mind when I first created this, although the code is maybe self-explanatory already.

That'd be useful yes, I've added an entry in the toolforge meeting tomorrow to have a chat there, happy to chat somewhere else too though if you prefer

I made an attempt to define some things and answer some important questions in the task description, based on our discussion, @dcaro. Input and modifications are welcome.

thanks! I think there's still some confusion xd, feel free to send me an invite for a quick chat and we can clarify further