
[jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client
Open, High, Public

Description

We want to be able to expose our openapi definitions for all our toolforge APIs.

This task is to investigate ways to make that possible. Some preferences:

  • Doing it "on the fly", so we don't have to generate a new definition every time any of the sub-apis changes
  • Doing it in a single definition
    • if not possible, we might want a script to generate it?
  • Should still be valid openapi

It should be served by the api gateway (it could be proxied through it, but it should be exposed externally through the gateway).

We will want to be able to generate our unified client and probably libraries from it.

Details

Title | Reference | Author | Source Branch | Dest Branch
builds-api: bump to 0.0.130-20240221160838-03b2a120 | repos/cloud/toolforge/toolforge-deploy!205 | project_1317_bot_df3177307bed93c3f34e421e26c86e38 | bump_builds-api | main
envvars-api: bump to 0.0.39-20240221153320-e6de38fd | repos/cloud/toolforge/toolforge-deploy!204 | project_1317_bot_df3177307bed93c3f34e421e26c86e38 | bump_envvars-api | main
main.go: fix base url | repos/cloud/toolforge/envvars-api!24 | dcaro | fix_v1_prefix | main
envvars-api: bump to 0.0.38-20240221140654-e441981f | repos/cloud/toolforge/toolforge-deploy!202 | project_1317_bot_df3177307bed93c3f34e421e26c86e38 | bump_envvars-api | main
api: move the prefix to each endpoint | repos/cloud/toolforge/builds-api!79 | dcaro | move_v1_to_url | main

Event Timeline

dcaro triaged this task as High priority. (Feb 6 2024, 2:44 PM)

So far, the only thing I could find online is this: https://github.com/Trax-air/swagger-aggregator, which seems a bit old (last commit 6 years ago).

aborrero renamed this task from [toolforge API] Investigate ways to present our openapi definitions to users to [toolforge API] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client. (Feb 7 2024, 1:48 PM)

I have been exploring this with the 3 openapi definitions.

For a number of reasons, directly merging them will be very difficult and cumbersome:

  • conflicting paths. For example, /v1/healthz or /v1/quota are defined in envvars, builds and jobs.
  • conflicting component schemas. For example, Quota is defined in builds and jobs. Envvars uses EnvvarQuota though.
  • overlapping server URL specification. For example, builds and envvars use /v1, which is too generic, while jobs uses the more specific https://api.svc.tools.eqiad1.wikimedia.cloud:30003/jobs/api/v1.

For a merge to happen, we would first need to update all 3 spec files to overcome the conflicts and make them coherent, then write some script to do the merge.
Then, establish some kind of procedure or mechanism to ensure we don't diverge from this coherence in the future.

Maybe another, simpler approach would be to skip the unified openapi spec definition, and go directly to python lib packages:

  • for each spec file, generate a python library package.
  • in the repo for the consolidated client, use all the python library packages.
  • this can even be done in a single repository, i.e., toolforge-api-python-libs, where we have a script that fetches each individual openapi spec, generates the python libs, and publishes them to PyPI.

Going this route would leave open the question of how/where to publish the API documentation. Well, I would just publish each individual openapi spec as it is, via wikitech.

That would also defeat the main purpose of the API unification, that is, presenting a single abstraction to the users: a single entry point, a single dependency, a single documentation and a single library.

All the APIs are currently served under the same url, just prefixed with the API name (ex. /build/v1/healthz), so the endpoints are already there and coexist without problems; what's left is generating the definition for them.

> That would also defeat the main purpose of the API unification, that is, presenting a single abstraction to the users: a single entry point, a single dependency, a single documentation and a single library.

This is OK as a theoretical goal, but the reality is that today our architecture is a microservices one.

The challenge remains: how to bring coherence to the individual spec files (and maintain such coherence).

Maybe if we want a single library, the easiest would be to craft a monolithic API.

I think we are going round and round the same discussions here (which you were part of too), let's try to avoid that and instead be constructive.

We can aggregate each of the API definitions as they are now under an API subpath (jobs under /jobs/*, builds under /builds/*, etc.), which already solves the issue of the path collisions.

So the only thing left is data structure name collisions. I can try giving that a look in a bit and come up with something. If in the end we have to change the names of those, it's a backwards-compatible and simple change to do anyhow, so still good.

My concerns are also related to maintainability. In particular, if we solve the collisions by hand today, how do we prevent them from sprawling across different repos in the future?

And, in general, how do we deterministically ensure that the individual openapi specs are crafted in a way that will play nicely later in the aggregated one?

Even though exposing multiple services underneath a single API gateway is a common pattern, there doesn't seem to be much open-source tooling in this area. There are also API gateways with integrated support for OpenAPI (https://openapi.tools/#gateway); only fusio seems to be open source though.

Doing it "on the fly", so we don't have to generate a new definition any of the sub-apis changes

@dcaro, do you mean not merging the separate OAI specs into a single file first, and then exposing it? Like, if each API exposes its own spec on an endpoint, the Gateway grabs these and generates a unified spec dynamically?

Ideally, generating it on the fly would be best, but if that's not feasible, doing it offline with a script might be doable.

This silly script, which I just started when you wrote the comment, already "almost" generates a valid openapi from the envvars, builds and jobs API defs:

from __future__ import annotations
import yaml
from pathlib import Path
from typing import Any


OPENAPI_SOURCES = {
    "builds": Path("../../builds-api/openapi/v1.yaml"),
    "jobs": Path("../../jobs-api/openapi/openapi.yaml"),
    "envvars": Path("../../envvars-api/openapi/v1.yaml"),
}
BASE_API = Path("base.yaml")


def rebase_refs(api_def: dict[str, Any], old_path: str, new_path: str) -> None:
    for key, value in api_def.items():
        if key == "$ref":
            new_value = value.replace(old_path, new_path)
            api_def[key] = new_value

        else:
            if isinstance(value, dict):
                rebase_refs(api_def=value, old_path=old_path, new_path=new_path)


def rebase_api_paths(api_def: dict[str, Any], base_url: str) -> None:
    rebased_paths = {
        f"{base_url}{old_url}": data for old_url, data in api_def["paths"].items()
    }
    api_def["paths"] = rebased_paths


def main():
    merged_api = yaml.safe_load(BASE_API.open("r"))
    for api_name, source in OPENAPI_SOURCES.items():
        old_api = yaml.safe_load(source.open("r"))
        rebase_api_paths(api_def=old_api, base_url=f"/{api_name}")
        rebase_refs(
            api_def=old_api,
            old_path="/components/schemas/",
            new_path=f"/components/schemas/{api_name}",
        )
        merged_api["paths"].update(old_api["paths"])
        try:
            merged_api["components"]["schemas"].update(
                {
                    f"{api_name}{component_name}": component_value
                    for component_name, component_value in old_api["components"][
                        "schemas"
                    ].items()
                }
            )
        except Exception:
            import pdb

            pdb.set_trace()

    Path("merged_api.yaml").write_text(yaml.safe_dump(merged_api))


if __name__ == "__main__":
    main()

components:
  schemas:
    buildsBadRequest:
      properties:
        message:
          type: string
      type: object
    buildsBuild:
      allOf:
      - $ref: '#/components/schemas/BuildCondition'
      - properties:
          build_id:
            type: string
          destination_image:
            type: string
          end_time:
            type: string
          parameters:
            $ref: '#/components/schemas/BuildParameters'
          start_time:
            type: string
        type: object
    buildsBuildCondition:
      properties:
        message:
          type: string
        status:
          $ref: '#/components/schemas/buildsBuildStatus'
      type: object
    buildsBuildId:
      properties:
        id:
          type: string
      type: object
    buildsBuildLogs:
      properties:
        lines:
          items:
            type: string
          type: array
      type: object
    buildsBuildParameters:
      properties:
        ref:
          type: string
        source_url:
          type: string
      type: object
    buildsBuildStartParams:
      properties:
        ref:
          description: Source code reference to build (ex. a git branch name)
          type: string
        source_url:
          description: URL to the public git repository that contains the source code
            to build
          type: string
      required:
      - source_url
      type: object
    buildsBuildStatus:
      enum:
      - BUILD_RUNNING
      - BUILD_SUCCESS
      - BUILD_FAILURE
      - BUILD_CANCELLED
      - BUILD_TIMEOUT
      - BUILD_UNKNOWN
      type: string
    buildsHealthResponse:
      properties:
        message:
          type: string
        status:
          enum:
          - OK
          - ERROR
          type: string
      type: object
    buildsInternalError:
      properties:
        message:
          type: string
      type: object
    buildsNewBuild:
      properties:
        name:
          type: string
        parameters:
          $ref: '#/components/schemas/buildsNewBuild_parameters'
      type: object
    buildsNewBuild_parameters:
      properties:
        ref:
          type: string
        source_url:
          type: string
      type: object
    buildsNotFound:
      properties:
        message:
          type: string
      type: object
    buildsUnauthorized:
      properties:
        message:
          type: string
      type: object
    buildsprincipal:
      properties:
        user:
          type: string
      type: object
    envvarsEnvvar:
      properties:
        name:
          pattern: ^[A-Z_][A-Z_0-9]{3,}$
          type: string
        value:
          type: string
      type: object
    envvarsEnvvarsQuota:
      properties:
        quota:
          description: The quota available to the user
          type: integer
        used:
          description: The quota used by the user
          type: integer
      type: object
    envvarsHealthResponse:
      properties:
        message:
          type: string
        status:
          enum:
          - OK
          - ERROR
          type: string
      type: object
    envvarsInternalError:
      properties:
        message:
          type: string
      type: object
    envvarsNotFound:
      properties:
        message:
          type: string
      type: object
    envvarsenvvar_name_body:
      properties:
        value:
          description: Value of the envvar to store
          type: string
      required:
      - value
      type: object
    envvarsprincipal:
      type: string
    jobsDefinedJob:
      allOf:
      - $ref: '#/components/schemas/NewJob'
      - properties:
          image_state:
            description: Information about the image state, i.e, if it is stable,
              deprecated, or something else.
            example: stable
            type: string
          schedule_actual:
            description: If the job is a cronjob, actual execution schedule.
            example: 15 * * * *
            type: string
          status_long:
            description: Job status, extended text.
            example: Running since 30 seconds ago.
            type: string
          status_short:
            description: Job status, short text.
            example: Running.
            type: string
        type: object
    jobsError:
      properties:
        message:
          type: string
      type: object
    jobsHealth:
      properties:
        message:
          type: string
        status:
          enum:
          - OK
          - ERROR
          type: string
      type: object
    jobsImages:
      items:
        properties:
          image:
            type: string
          shortname:
            type: string
        type: object
      type: array
    jobsJobLogs:
      type: string
    jobsNewJob:
      properties:
        cmd:
          description: Command that this job is executing.
          example: ./some-command.sh --with-args
          type: string
        continuous:
          description: If a job should be always running.
          example: false
          type: boolean
        cpu:
          description: Job CPU resource limit.
          example: '1'
          type: string
        emails:
          description: Job emails setting.
          enum:
          - none
          - all
          - onfinish
          - onfailure
          example: all
          type: string
        filelog:
          description: Whether this job uses filelog or not.
          example: false
          type: boolean
        filelog_stderr:
          description: Path to the stderr file log.
          example: logs/my-job.err
          type: string
        filelog_stdout:
          description: Path to the stdout file log.
          example: logs/my-job.out
          type: string
        image:
          description: Container image the job uses.
          example: tool-my-tool/tool-my-tool:latest
          type: string
        mem:
          description: Job memory resource limit.
          example: 1G
          type: string
        mount:
          description: NFS mount configuration for the job.
          enum:
          - all
          - none
          example: none
          type: string
        name:
          description: Unique name that identifies the job.
          example: my-job
          maxLength: 52
          minLength: 1
          pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?([.][a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
          type: string
        retry:
          description: Job retry policy. Zero means don't retry at all (the default)
          example: 0
          maximum: 5
          minimum: 0
          type: integer
        schedule:
          description: If the job is a cronjob, execution schedule.
          example: '@hourly'
          type: string
      required:
      - name
      - cmd
      - image
      type: object
    jobsQuota:
      properties:
        categories:
          items:
            properties:
              items:
                items:
                  properties:
                    limit:
                      type: string
                    name:
                      type: string
                    used:
                      type: string
                  type: object
                type: array
              name:
                type: string
            type: object
          type: array
      type: object
info:
  title: Toolforge API
  version: 0.0.1
openapi: 3.1.0
paths:
  /builds/v1/build:
    get:
      operationId: list
      responses:
        '200':
          content:
            application/json:
              schema:
                items:
                  $ref: '#/components/schemas/buildsBuild'
                type: array
          description: Returns the list of builds
        '401':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsUnauthorized'
          description: Unauthorized
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsInternalError'
          description: An internal error happened
      security:
      - key: []
    post:
      operationId: start
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/buildsBuildStartParams'
        required: true
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsNewBuild'
          description: returns the newly created build information
        '400':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsBadRequest'
          description: Bad parameters passed
        '401':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsUnauthorized'
          description: Unauthorized
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsInternalError'
          description: An internal error happened
      security:
      - key: []
      x-codegen-request-body-name: new_build
  /builds/v1/build/{id}:
    delete:
      operationId: delete
      parameters:
      - description: Id of the build
        explode: false
        in: path
        name: id
        required: true
        schema:
          type: string
        style: simple
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsBuildId'
          description: The build was deleted
        '400':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsBadRequest'
          description: Bad parameters passed
        '401':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsUnauthorized'
          description: Unauthorized
        '404':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsNotFound'
          description: The build was not found
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsInternalError'
          description: An internal error happened
      security:
      - key: []
    get:
      operationId: get
      parameters:
      - description: Id of the build
        explode: false
        in: path
        name: id
        required: true
        schema:
          type: string
        style: simple
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsBuild'
          description: The build was found
        '400':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsBadRequest'
          description: Bad parameters passed
        '401':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsUnauthorized'
          description: Unauthorized
        '404':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsNotFound'
          description: The build was not found
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsInternalError'
          description: An internal error happened
      security:
      - key: []
  /builds/v1/build/{id}/logs:
    get:
      operationId: logs
      parameters:
      - description: Id of the build
        explode: false
        in: path
        name: id
        required: true
        schema:
          type: string
        style: simple
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsBuildLogs'
          description: returns the logs of a build
        '400':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsBadRequest'
          description: Bad parameters passed
        '401':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsUnauthorized'
          description: Unauthorized
        '404':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsNotFound'
          description: The build was not found
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsInternalError'
          description: An internal error happened
      security:
      - key: []
  /builds/v1/healthz:
    get:
      operationId: healthcheck
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsHealthResponse'
          description: Returns the health status of the service
        '503':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/buildsHealthResponse'
          description: Returned when the service is not ready
      security: []
  /envvars/v1/envvar:
    get:
      operationId: list
      responses:
        '200':
          content:
            application/json:
              schema:
                items:
                  $ref: '#/components/schemas/envvarsEnvvar'
                type: array
          description: all the user envvars
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsInternalError'
          description: Bad parameters passed
      security:
      - key: []
  /envvars/v1/envvar/{name}:
    delete:
      operationId: delete
      parameters:
      - description: Name of the envvar (same as the environment variable that will
          be exposed)
        explode: false
        in: path
        name: name
        required: true
        schema:
          pattern: ^[A-Z_][A-Z_0-9]{3,}$
          type: string
        style: simple
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsEnvvar'
          description: the deleted envvar
        '404':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsNotFound'
          description: The envvar was not found
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsInternalError'
          description: Bad parameters passed
      security:
      - key: []
    get:
      operationId: show
      parameters:
      - description: Name of the envvar (same as the environment variable that will
          be exposed)
        explode: false
        in: path
        name: name
        required: true
        schema:
          pattern: ^[A-Z_][A-Z_0-9]{3,}$
          type: string
        style: simple
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsEnvvar'
          description: returns a envvar
        '404':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsNotFound'
          description: The envvar was not found
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsInternalError'
          description: Bad parameters passed
      security:
      - key: []
    post:
      operationId: create
      parameters:
      - description: Name of the envvar (same as the environment variable that will
          be exposed)
        explode: false
        in: path
        name: name
        required: true
        schema:
          pattern: ^[A-Z_][A-Z_0-9]{3,}$
          type: string
        style: simple
      requestBody:
        content:
          '*/*':
            schema:
              $ref: '#/components/schemas/envvarsenvvar_name_body'
        required: true
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsEnvvar'
          description: the newly created or updated envvar
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsInternalError'
          description: Bad parameters passed
      security:
      - key: []
      x-codegen-request-body-name: value
  /envvars/v1/healthz:
    get:
      operationId: healthcheck
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsHealthResponse'
          description: returns the health status of the service
        '503':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsHealthResponse'
          description: happens when the service is not ready
      security: []
  /envvars/v1/metrics:
    get:
      operationId: metrics
      responses:
        '200':
          content:
            text/plain:
              schema:
                type: string
          description: Returns the metrics of the service
      security: []
  /envvars/v1/quota:
    get:
      operationId: quota
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsEnvvarsQuota'
          description: information about the quota available to the user for creating
            envvars
        '500':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/envvarsInternalError'
          description: Bad parameters passed
      security:
      - key: []
  /jobs/api/v1/images:
    get:
      description: Returns the list of available container images that you can use
        to create a new job.
      operationId: images
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsImages'
          description: The response contains a list of available container images.
        '401': &id001
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsjobsjobsjobsjobsjobsjobsjobsjobsError'
          description: Unauthorized error.
      security: &id002
      - toolforge_tool_client_certificate: []
      tags:
      - Context
  /jobs/api/v1/jobs:
    delete:
      description: Stop and delete all defined jobs from this Toolforge tool account.
      operationId: flush
      responses:
        '200':
          description: All jobs were flushed
        '401': *id001
      security: *id002
      tags:
      - Operations
    get:
      description: Returns the list of jobs defined for this Toolforge tool account.
      operationId: list
      responses:
        '200':
          content:
            application/json:
              schema:
                items:
                  $ref: '#/components/schemas/jobsDefinedJob'
                type: array
          description: The list of jobs is returned.
        '401': *id001
      security: *id002
      tags:
      - Operations
    post:
      description: Create a new job in this Toolforge tool account.
      operationId: create
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/jobsNewJob'
        required: true
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsDefinedJob'
          description: A new job was created.
        '400': &id003
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsjobsjobsjobsjobsError'
          description: Bad parameters error
        '401': *id001
      security: *id002
      tags:
      - Operations
  /jobs/api/v1/jobs/{id}:
    delete:
      description: Stop and delete the specified job from this Toolforge tool account.
      operationId: delete
      parameters: &id004
      - description: Job name.
        explode: false
        in: path
        name: id
        required: true
        schema:
          type: string
        style: simple
      responses:
        '200':
          description: Job was deleted.
        '400': *id003
        '401': *id001
        '404': &id005
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsjobsjobsjobsError'
          description: Object not found error.
      security: *id002
      tags:
      - Operations
    get:
      description: Show information about a given job for this Toolforge tool account.
      operationId: show
      parameters: *id004
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsDefinedJob'
          description: Job was found.
        '400': *id003
        '401': *id001
        '404': *id005
      security: *id002
      tags:
      - Operations
  /jobs/api/v1/jobs/{id}/logs:
    get:
      description: Get logs for a given job in this Toolforge tool account.
      operationId: get_logs
      parameters: *id004
      responses:
        '200':
          content:
            text/plain:
              schema:
                $ref: '#/components/schemas/jobsJobLogs'
          description: Job logs are served.
        '400': *id003
        '401': *id001
        '404': *id005
      security: *id002
      tags:
      - Operations
  /jobs/api/v1/jobs/{id}/restart:
    post:
      description: Restart a given job. If the job is a cronjob, execute it right
        now.
      operationId: restart
      parameters: *id004
      responses:
        '200':
          description: Job has been restarted.
        '400': *id003
        '401': *id001
        '404': *id005
      security: *id002
      tags:
      - Operations
  /jobs/api/v1/quota:
    get:
      description: Returns information on the tool account quotas in Toolforge.
      operationId: quota
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsQuota'
          description: The response contains the quota details.
        '401': *id001
      security: *id002
      tags:
      - Context
  /jobs/healthz:
    get:
      description: Endpoint that can be used to know the health status of the API
        itself.
      operationId: healthcheck
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/jobsHealth'
          description: The health status of the service is well known.
      security: []
      tags:
      - API system
    servers:
    - url: https://api.svc.tools.eqiad1.wikimedia.cloud:30003/jobs/
servers:
- url: api.svc.toolforge.org

Fixed, this generates a valid definition:

from __future__ import annotations
import yaml
from pathlib import Path
from typing import Any


OPENAPI_SOURCES = {
    "builds": Path("../../builds-api/openapi/v1.yaml"),
    "jobs": Path("../../jobs-api/openapi/openapi.yaml"),
    "envvars": Path("../../envvars-api/openapi/v1.yaml"),
}
BASE_API = Path("base.yaml")


def rebase_refs(api_def: dict[str, Any], old_path: str, new_path: str) -> None:
    for key, value in api_def.items():
        if key == "$ref":
            # rewrite the schema pointer to its namespaced location
            api_def[key] = value.replace(old_path, new_path)

        elif isinstance(value, list):
            # also recurse into lists (e.g. allOf entries); this is what the
            # first version of the script was missing
            for item in value:
                if isinstance(item, dict):
                    rebase_refs(api_def=item, old_path=old_path, new_path=new_path)

        elif isinstance(value, dict):
            rebase_refs(api_def=value, old_path=old_path, new_path=new_path)


def rebase_api_paths(api_def: dict[str, Any], base_url: str) -> None:
    rebased_paths = {
        f"{base_url}{old_url}": data for old_url, data in api_def["paths"].items()
    }
    api_def["paths"] = rebased_paths


def main():
    merged_api = yaml.safe_load(BASE_API.open("r"))
    for api_name, source in OPENAPI_SOURCES.items():
        old_api = yaml.safe_load(source.open("r"))
        rebase_api_paths(api_def=old_api, base_url=f"/{api_name}")
        rebase_refs(
            api_def=old_api,
            old_path="/components/schemas/",
            new_path=f"/components/schemas/{api_name}",
        )
        merged_api["paths"].update(old_api["paths"])
        merged_api["components"]["schemas"].update(
            {
                f"{api_name}{component_name}": component_value
                for component_name, component_value in old_api["components"][
                    "schemas"
                ].items()
            }
        )

    Path("merged_api.yaml").write_text(yaml.safe_dump(merged_api))


if __name__ == "__main__":
    main()

Nice! I was just experimenting with https://www.npmjs.com/package/openapi-merge-cli, merging just envvars and builds to begin with. It generates a valid spec out of the box, but I'm also visually checking it.

Sounds like a more featureful version of the above script xd. We might want to avoid the extra complications though, our merging operation should be dead simple.

Though if it's stable enough, it might be ok to use.

It has a whole lot of source code to finally do the same as your script xd. It also only seems able to prefix components in case of conflict, not by default.

The ability to configure it using json is nice though:

{
  "inputs": [
    {
      "inputFile": "./gateway.swagger.json"
    },
    {
      "inputFile": "./jira.swagger.json",
      "pathModification": {
        "stripStart": "/rest",
        "prepend": "/jira"
      },
      "operationSelection": {
        "includeTags": ["included"]
      },
      "description": {
        "append": true
      }
    },
    {
      "inputFile": "./confluence.swagger.yaml",
      "dispute": {
        "prefix": "Confluence"
      },
      "pathModification": {
        "prepend": "/confluence"
      },
      "operationSelection": {
        "excludeTags": ["excluded"]
      }
    }
  ], 
  "output": "./output.swagger.json"
}
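(If I'm reading its docs right, you run it with npx openapi-merge-cli and it picks up the openapi-merge.json config from the working directory.)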

Being js also means that if we serve it as an API, we would have to use a nodejs server.

I think that going with python/golang might be safer (python might be enough).

So I propose adding a small python http server to the api-gateway component that only handles the /openapi.json endpoint (nginx will still forward the other endpoints for now) and generates the merged API on the fly by requesting the other /*/openapi.json endpoints (maybe cached?)

We can also add serving a swagger UI if the requested content-type is not json, or something similar.

Feel free to load the config from a json/yaml or whatever, I just hardcoded it in the script.

In the future, we can tweak nginx to pass to the python code any endpoints that need special handling, or eventually have it handle all of them if needed.

As in:

  • Current status: user -> api-gateway:nginx -> backend-apis
  • Next: (/openapi.json) user -> api-gateway:nginx -> api-gateway:gateway-app | (not /openapi.json) user -> api-gateway:nginx -> backend-apis
  • Possible future ({some set of urls}) user -> api-gateway:nginx -> api-gateway:gateway-app | ({some possibly empty set of urls}) user -> api-gateway:nginx -> backend-api

WDYT?

Do you want to do it yourself? (I can if you want to do something else)

> Being js also means that if we serve it as an API, we would have to use a nodejs server.
>
> I think that going with python/golang might be safer (python might be enough).

Sounds good to me.

I'm not quite convinced about the "on the fly" option vs. doing merging & validation in CI each time one of the APIs' OAI specs is updated, then serving this single OAI spec from the gateway via one endpoint. The on-the-fly option seems more complex. If we do it in CI, we would catch any merge conflicts before trying to deploy a new version of a component. Also, schema caching adds more complexity, imo.

There are no merge conflicts possible; the only possibility is that the generated openapi schema is not valid.

The big advantage of doing it on the fly is that (as long as the generated schema is syntactically valid) the exposed schema always matches the currently exposed APIs. Otherwise, every time we modify any of the backend APIs, there will be a period where the schema does not represent the current APIs until we regenerate and deploy the gateway.

We can add some basic check to the live endpoint for openapi validity if you want, and even alert through prometheus/alertmanager if it gets borked; that would catch most of the issues (ex. https://pypi.org/project/openapi-schema-validator/).
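Something like this, as a minimal sketch (assuming the sibling openapi-spec-validator package, which validates whole spec documents rather than individual schemas):

# rough sketch: fail (and alert) if the merged spec is not valid OpenAPI;
# validate_spec() raises an exception on an invalid document
from pathlib import Path

import yaml
from openapi_spec_validator import validate_spec

validate_spec(yaml.safe_load(Path("merged_api.yaml").read_text()))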

I'm not sure that, at the scale/rate at which we modify the API schemas, this makes a substantial difference, but I'm open to trying your approach.

How often would you have the gateway polling the APIs? Or would we be doing on-demand fetching with caching?

We can do it literally on the fly: every request that comes in for the schema pulls the sub-schemas from the APIs and regenerates the merged one (I don't think we will have much traffic there, so validity trumps performance in my mind).

If we don't care about performance, and we're choosing this approach to avoid ever serving stale schemas, then we shouldn't do any caching either.

Sounds good to me, if we start seeing performance issues we can start playing with caches/async generation of the merged schema and such.
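For illustration, a rough sketch of what that gateway-app endpoint could look like (Flask, requests and the backend service URLs are all assumptions here; the merge logic just mirrors the script above):

# Sketch only: serve a merged /openapi.json, re-fetching the backend specs on
# every request so the result always matches what is currently deployed.
from __future__ import annotations

from typing import Any

import requests
from flask import Flask, jsonify

# hypothetical in-cluster addresses of the backend APIs
BACKEND_SPECS = {
    "builds": "http://builds-api:8000/openapi.json",
    "jobs": "http://jobs-api:8000/openapi.json",
    "envvars": "http://envvars-api:8000/openapi.json",
}

app = Flask(__name__)


def rebase_refs(api_def: dict[str, Any], old_path: str, new_path: str) -> None:
    # same ref-rewriting logic as the merge script above
    for key, value in api_def.items():
        if key == "$ref":
            api_def[key] = value.replace(old_path, new_path)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    rebase_refs(item, old_path, new_path)
        elif isinstance(value, dict):
            rebase_refs(value, old_path, new_path)


@app.route("/openapi.json")
def merged_openapi():
    # start from the same base info as base.yaml
    merged: dict[str, Any] = {
        "openapi": "3.1.0",
        "info": {"title": "Toolforge API", "version": "0.0.1"},
        "servers": [{"url": "api.svc.toolforge.org"}],
        "paths": {},
        "components": {"schemas": {}},
    }
    for api_name, url in BACKEND_SPECS.items():
        sub_api = requests.get(url, timeout=5).json()
        rebase_refs(sub_api, "/components/schemas/", f"/components/schemas/{api_name}")
        # namespace the paths and schemas with the API name, as in the script
        merged["paths"].update(
            {f"/{api_name}{path}": item for path, item in sub_api["paths"].items()}
        )
        merged["components"]["schemas"].update(
            {
                f"{api_name}{name}": schema
                for name, schema in sub_api["components"]["schemas"].items()
            }
        )
    return jsonify(merged)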

project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/202

envvars-api: bump to 0.0.38-20240221140654-e441981f

project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/204

envvars-api: bump to 0.0.39-20240221153320-e6de38fd

project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/205

builds-api: bump to 0.0.130-20240221160838-03b2a120

dcaro renamed this task from [toolforge API] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client to [jobs-api,buildservice-api,envvars-api] Investigate ways to present our multiple Openapi definitions to a future consolidated CLI client. (Mar 12 2024, 1:20 PM)

@dcaro what is the base.yaml in your script?

:facepalm: I forgot to paste it

dcaro@urcuchillay$ cat /home/dcaro/Work/wikimedia/api-gateway/api_gateway/base.yaml
openapi: 3.1.0
info:
  title: Toolforge API
  version: 0.0.1
servers:
  - url: "api.svc.toolforge.org"

paths: {}

components:
  schemas: {}

It has just the basic API info that only the aggregated api needs.

@dcaro thanks! Another question: each openapi spec has its own /openapi.json endpoint. We will want the gateway to expose the unified spec created by merging the files. Should we remove the individual /openapi.json endpoints from the merged spec?

Good question, I'd say so yes. Not sure what benefit they would have for users of the proxy (they can't really use the inner apis directly, and if they can use the inner apis directly, they can access the openapi.json for the inner api directly too xd).
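That could be a one-line tweak in the merge loop, assuming the sub-specs list the endpoint as a literal /openapi.json path:

# drop the per-API spec endpoint before rebasing/merging the paths
old_api["paths"].pop("/openapi.json", None)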

@dcaro how should we deal with the security top-level element? We are currently ignoring it. Maybe add it to the base.yaml?

We are also ignoring other top-level elements, I guess because we aren't currently using them. These are tags, webhooks, and externalDocs. Do we want to keep ignoring them for now?

> How should we deal with the security top-level element? We are currently ignoring it. Maybe add it to the base.yaml?

Yep, that will be whatever we end up doing for authentication; for now we can use the same as we have in the others, but we will want to put something that makes more sense later.

> We are also ignoring other top-level elements, I guess because we aren't currently using them. These are tags, webhooks, and externalDocs. Do we want to keep ignoring them for now?

Any top-level elements from the internal APIs should be ignored yes, and overwritten with the base.yaml ones if there are any, or just dropped if there are not.

We might not want to remove webhooks eventually though: we don't have any, nor plan on adding any yet, but we might want some in the future. For now we can just drop it.

operationIds need to be unique. At what level do we want to deal with this? Enforce a naming convention across the individual APIs such as {serviceName}_{resourceName}_{operation}? Prefixing when merging the specs?

Not really, that can be dropped if needed, it's only used for code generation I think.

If we need it in the future, we can then prepend the value with the internal API name, like Builds_<operationid> kind of thing.

Hmm, we do want to generate the client libraries that we will use in the consolidated CLI, though. Also, I think tools that generate documentation rely on the uniqueness of operationId to correctly link operations to descriptions, parameters, and responses.

Let's play with it then :) Can we generate the docs without operationId being present, as in, if we drop it?

If not, then let's namespace it like we do with paths and schemas.

My guess is that if operationId is not there, the client will generate the name, but I have not tried.
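If we do end up namespacing, it could be another small helper in the merge script, something along these lines (a sketch; the method filter is needed because path items can also hold keys like parameters or servers):

# HTTP methods whose operations can carry an operationId
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def prefix_operation_ids(api_def: dict[str, Any], prefix: str) -> None:
    # namespace operationIds the same way we namespace paths and schemas
    for path_item in api_def["paths"].values():
        for method, operation in path_item.items():
            if method in HTTP_METHODS and "operationId" in operation:
                operation["operationId"] = f"{prefix}_{operation['operationId']}"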

Do we want to expose the /healthz and /metrics endpoints in the unified spec?

The tags turned out to be useful as they make for nicer, better organized docs. Not tried it yet, but it's also possible to add a description and a link to each "endpoint family".

Screenshot 2024-03-14 at 15.55.57.png (780×925 px, 75 KB)

> Do we want to expose the /healthz and /metrics endpoints in the unified spec?

That'd be ok yes.

> The tags turned out to be useful as they make for nicer, better organized docs.

They should be added at the sub-api level though; we should also explore what tags make sense.

I'd make that a task for later though, as we are not yet exposing the API to users.