Page MenuHomePhabricator

cxserver's swagger spec fails to validate
Open, Stalled, NormalPublic

Description

Using https://editor.swagger.io/ I 've tried to parse the swagger specs of cxserver. I was greeted with 49 different errors during validating the spec.

Steps to reproduce

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 6 2019, 10:39 AM
akosiaris triaged this task as Normal priority.Mar 6 2019, 10:39 AM

Mostly it is missing responses section in the definitions. They are easy to fix.

But there are a few errors that is difficult to fix as version 2.0 of swagger spec does not allow optional path paramerts. WMF has this kind of definition(not limited to cxserver). See comment by @GWicke at https://github.com/OAI/OpenAPI-Specification/issues/93#issuecomment-74362417

Mostly it is missing responses section in the definitions. They are easy to fix.

Cool. Thanks!

But there are a few errors that is difficult to fix as version 2.0 of swagger spec does not allow optional path paramerts. WMF has this kind of definition(not limited to cxserver). See comment by @GWicke at https://github.com/OAI/OpenAPI-Specification/issues/93#issuecomment-74362417

Neither does 3.0.2 where the text about the required attribute is verbatim included from 2.0 and reads

required	boolean	Determines whether this parameter is mandatory. If the parameter location is "path", this property is REQUIRED and its value MUST be true. Otherwise, the property MAY be included and its default value is false.

Effectively saying that optional path parameters are against the spec. And it's also explicitly said in https://github.com/swagger-api/swagger-ui/pull/935#issuecomment-74329722.

To be honest, I am surprised that it was decided back then that we forego all that and violate the spec. Granted, it's an often asked feature per https://github.com/OAI/OpenAPI-Specification/issues/574, but still, we are in the precarious position of effectively using an in-house version of the specification that:

  • is largely compatible with the original, but in reality the violating part is used extensively in WMF (it's at least used at https://gerrit.wikimedia.org/r/494703)
  • admittedly noone else uses (that we know of?)
  • no tooling exists around it to validate the correctness of it
  • the standard tooling fails

The end result for me personally is that I can't use standard tooling to parse the spec and create tools to benchmark services. I can work around it (it's yaml after all), but it's clearly suboptimal.

Per https://github.com/Azure/autorest/issues/729#issuecomment-226263537 this was deferred, and to me, the chance of this being added to the spec in a later version appears minimal. It's ~3 years since that deferred status and in the meantime multiple releases for the 3.0x version have happened. Maybe it's about time we re-evaluate and perhaps conform to the spec?

@santhosh I 'd say we create a phab task and re-evaluate there our divergence from the spec. Would that work for you?

akosiaris changed the task status from Open to Stalled.Mar 8 2019, 9:03 AM

Stalling until we have some input on T217881

I prepared a valid spec as per OpenAPI 3.0. But this will require optional params treated as query params. That means it is a breaking change in API. Require a new version. I am not proposing it, just listing a possibility.

1openapi: 3.0.1
2info:
3 title: Content translation service
4 description: Content translation service for translating mediawiki pages between
5 languages.
6 termsOfService: https://wikimediafoundation.org/wiki/Terms_of_Use
7 contact:
8 name: the Wikimedia Language team
9 url: https://www.mediawiki.org/wiki/Wikimedia_Language_engineering
10 license:
11 name: GPL-2.0+
12 url: http://opensource.org/licenses/GPL-2.0
13 version: 0.2.2
14servers:
15- url: /
16paths:
17 /robots.txt:
18 get:
19 tags:
20 - Root
21 description: Gets robots.txt
22 responses:
23 200:
24 description: OK
25 content:
26 text/plain:
27 schema:
28 type: string
29 x-amples:
30 - title: robots.txt check
31 request: {}
32 response:
33 status: 200
34 /:
35 get:
36 tags:
37 - Root
38 description: The root service end-point
39 responses:
40 200:
41 description: OK
42 content: {}
43 404:
44 description: root with wrong query param
45 content: {}
46 x-amples:
47 - title: root with no query params
48 request: {}
49 response:
50 status: 404
51 - title: spec from root
52 request:
53 query:
54 spec: true
55 response:
56 status: 200
57 - title: root with wrong query param
58 request:
59 query:
60 fooo: true
61 response:
62 status: 404
63 /v1/page/{language}/{title}:
64 get:
65 tags:
66 - Page content
67 description: Fetches segmented mediawiki page
68 parameters:
69 - name: language
70 in: path
71 description: The language code or the domain of the wiki
72 required: true
73 schema:
74 type: string
75 - name: title
76 in: path
77 description: The page title
78 required: true
79 schema:
80 type: string
81 - name: revision
82 in: query
83 description: The page revision id
84 required: false
85 schema:
86 type: string
87 responses:
88 200:
89 description: Page fetched successfully
90 content:
91 application/json:
92 schema:
93 type: string
94 404:
95 description: Page not found
96 content: {}
97 x-amples:
98 - title: Fetch enwiki Oxygen page
99 request:
100 params:
101 language: en
102 title: Oxygen
103 revision: 702870951
104 response:
105 status: 200
106 headers:
107 content-type: application/json
108 /v1/dictionary/{word}/{from}/{to}:
109 get:
110 tags:
111 - Dictionary
112 description: Fetches the dictionary meaning of a word.
113 parameters:
114 - name: word
115 in: path
116 description: The word to lookup
117 required: true
118 schema:
119 type: string
120 - name: from
121 in: path
122 description: The source language code
123 required: true
124 schema:
125 type: string
126 - name: to
127 in: path
128 description: The target language code
129 required: true
130 schema:
131 type: string
132 - name: provider
133 in: query
134 description: The dictionary provider id
135 required: false
136 schema:
137 type: string
138 enum:
139 - JsonDict
140 - Dictd
141 responses:
142 200:
143 description: Dictionary meaning of a word fetched successfully
144 content:
145 application/json:
146 schema:
147 type: string
148 404:
149 description: Word not found in the dictionary
150 content: {}
151 x-amples:
152 - title: Fetch dictionary meaning with a given provider
153 request:
154 params:
155 word: water
156 from: en
157 to: es
158 provider: JsonDict
159 response:
160 status: 200
161 body:
162 source: water
163 translations:
164 - phrase: /.+/
165 sources:
166 - fd-eng-spa
167 headers:
168 content-type: application/json
169 - title: Fetch dictionary meaning without specifying a provider
170 request:
171 params:
172 word: water
173 from: en
174 to: es
175 response:
176 status: 200
177 body:
178 source: water
179 translations:
180 - phrase: /.+/
181 sources:
182 - fd-eng-spa
183 headers:
184 content-type: application/json
185 /v1/mt/{from}/{to}:
186 post:
187 tags:
188 - Machine translation
189 description: Fetches the machine translation. Some providers require an authorization
190 header and it is forbidden to use them outside the Content Translation tool.
191 parameters:
192 - name: from
193 in: path
194 description: The source language code
195 required: true
196 schema:
197 type: string
198 - name: to
199 in: path
200 description: The target language code
201 required: true
202 schema:
203 type: string
204 - name: provider
205 in: query
206 description: The machine translation provider id
207 required: false
208 schema:
209 type: string
210 enum:
211 - Apertium
212 - Matxin
213 responses:
214 200:
215 description: Machine translation fetched successfully
216 content:
217 application/json:
218 schema:
219 type: string
220 500:
221 description: Internal error
222 content: {}
223 requestBody:
224 content:
225 application/x-www-form-urlencoded:
226 schema:
227 required:
228 - html
229 properties:
230 html:
231 type: string
232 description: The HTML or plaintext content to translate
233 x-textarea: true
234 required: true
235 x-ample:
236 - title: Machine translate an HTML fragment using TestClient.
237 request:
238 params:
239 from: en
240 to: qqq
241 provider: TestClient
242 body:
243 html: <p><a href='Oxygen'>Oxygen</a> is a chemical element with symbol
244 O and <a href='Atomic number'>atomic number</a> 8.</p>
245 response:
246 status: 200
247 body:
248 contents: /.+/
249 headers:
250 content-type: application/json
251 /v1/list/pair/{from}/{to}:
252 get:
253 tags:
254 - Tools
255 description: Lists the tools for a given language pair
256 parameters:
257 - name: from
258 in: path
259 description: The source language code
260 required: true
261 schema:
262 type: string
263 - name: to
264 in: path
265 description: The target language code
266 required: true
267 schema:
268 type: string
269 responses:
270 200:
271 description: Machine translation fetched successfully
272 content:
273 application/json:
274 schema:
275 type: string
276 500:
277 description: Internal error
278 content: {}
279 x-amples:
280 - title: Get the tools between two language pairs
281 request:
282 params:
283 from: en
284 to: es
285 response:
286 status: 200
287 headers:
288 content-type: application/json
289 /v1/list/languagepairs:
290 get:
291 tags:
292 - Languages
293 - Service information
294 description: Lists the language pairs supported by the server
295 responses:
296 200:
297 description: List of all language pairs
298 content:
299 application/json:
300 schema:
301 type: string
302 x-amples:
303 - title: Get all the language pairs
304 response:
305 status: 200
306 headers:
307 content-type: application/json
308 /v1/list/{tool}:
309 get:
310 tags:
311 - Tools
312 - Service information
313 description: Lists all language pairs that tool supports. If language pairs
314 are given, list the providers for the tool in that language pair.
315 parameters:
316 - name: tool
317 in: path
318 description: The tool name
319 required: true
320 schema:
321 type: string
322 enum:
323 - mt
324 - dictionary
325 - name: from
326 in: query
327 description: The source language code
328 required: false
329 schema:
330 type: string
331 - name: to
332 in: query
333 description: The target language code
334 required: false
335 schema:
336 type: string
337 responses:
338 200:
339 description: List of all language pairs
340 content:
341 application/json:
342 schema:
343 type: string
344 x-amples:
345 - title: Get the MT tool between two language pairs
346 request:
347 params:
348 from: en
349 to: es
350 tool: mt
351 response:
352 status: 200
353 headers:
354 content-type: application/json
355 /v2/page/{sourcelanguage}/{targetlanguage}/{title}:
356 get:
357 tags:
358 - Page content
359 description: Fetches segmented mediawiki page
360 parameters:
361 - name: sourcelanguage
362 in: path
363 description: The language code or the domain of the source wiki
364 required: true
365 schema:
366 type: string
367 - name: targetlanguage
368 in: path
369 description: The language code or the domain of the target wiki
370 required: true
371 schema:
372 type: string
373 - name: title
374 in: path
375 description: The page title
376 required: true
377 schema:
378 type: string
379 - name: revision
380 in: query
381 description: The page revision id
382 required: true
383 schema:
384 type: string
385 responses:
386 200:
387 description: Successfully fetched the segmented mediawiki page
388 content:
389 application/json:
390 schema:
391 type: string
392 x-amples:
393 - title: Fetch enwiki Oxygen page
394 request:
395 params:
396 sourcelanguage: en
397 targetlanguage: es
398 title: Pickling
399 revision: 825871454
400 response:
401 status: 200
402 headers:
403 content-type: application/json
404 /v2/translate/{from}/{to}:
405 post:
406 tags:
407 - Machine translation
408 description: Translate the given content from source language to target language.
409 Also adapt the content for the target language wiki. Some machine translation
410 providers require an authorization header and it is forbidden to use them
411 outside the Content Translation tool.
412 parameters:
413 - name: from
414 in: path
415 description: The source language code
416 required: true
417 schema:
418 type: string
419 - name: to
420 in: path
421 description: The target language code
422 required: true
423 schema:
424 type: string
425 - name: provider
426 in: query
427 description: The machine translation provider id
428 required: false
429 schema:
430 type: string
431 enum:
432 - Apertium
433 - Matxin
434 requestBody:
435 content:
436 application/x-www-form-urlencoded:
437 schema:
438 required:
439 - html
440 properties:
441 html:
442 type: string
443 description: The HTML content to translate
444 x-textarea: true
445 required: true
446 responses:
447 200:
448 description: OK
449 content: {}
450 500:
451 description: Error
452 content: {}
453 x-amples:
454 - title: Machine translate an HTML fragment using TestClient, adapt the links
455 to target language wiki.
456 request:
457 params:
458 from: en
459 to: qqq
460 provider: TestClient
461 body:
462 html: <p><a rel="mw:WikiLink" href='Oxygen'>Oxygen</a> is a chemical element
463 with symbol O and <a rel="mw:WikiLink" href='Atomic number'>atomic number</a>
464 8.</p>
465 response:
466 status: 200
467 body:
468 contents: /.+/
469 headers:
470 content-type: application/json
471 /_info:
472 get:
473 tags:
474 - Service information
475 description: Gets information about the service
476 responses:
477 200:
478 description: OK
479 content: {}
480 x-amples:
481 - title: retrieve service info
482 request: {}
483 response:
484 status: 200
485 headers:
486 content-type: application/json
487 body:
488 name: /.+/
489 description: /.+/
490 version: /.+/
491 home: /.+/
492 /_info/name:
493 get:
494 tags:
495 - Service information
496 description: Gets the name of the service
497 responses:
498 200:
499 description: OK
500 content: {}
501 x-amples:
502 - title: retrieve service name
503 request: {}
504 response:
505 status: 200
506 headers:
507 content-type: application/json
508 body:
509 name: /.+/
510 /_info/version:
511 get:
512 tags:
513 - Service information
514 description: Gets the running version of the service
515 responses:
516 200:
517 description: OK
518 content: {}
519 x-amples:
520 - title: retrieve service version
521 request: {}
522 response:
523 status: 200
524 headers:
525 content-type: application/json
526 body:
527 version: /.+/
528 /_info/home:
529 get:
530 tags:
531 - Service information
532 description: Redirects to the home page
533 responses:
534 301:
535 description: OK
536 content: {}
537 x-amples:
538 - title: redirect to the home page
539 request: {}
540 response:
541 status: 301
542components: {}
543x-default-params:
544 domain: en.wikipedia.org

Not necessarily. What we did was to duplicate the entries with optional parameters and use YAML references to avoid copy-pasting the whole definition. See https://github.com/wikimedia/restbase/blob/master/v1/talk.yaml#L116 for an example.