Integrate MT healthcheck into Apertium service
Open, NormalPublic

Description

The MT healthcheck is the only way to tell whether the MT service (Apertium as of now) is working or not. See: https://cxserver.wikimedia.org/translation/

Integrate it with the Apertium service so that it can notify the Language Team and we can take immediate action, instead of checking manually or relying on users' feedback/bug reports.

Event Timeline

KartikMistry updated the task description.
KartikMistry raised the priority of this task from to Normal.
KartikMistry claimed this task.
Amire80 moved this task from Needs Triage to CX6 on the ContentTranslation board. Aug 13 2015, 4:49 PM
KartikMistry set Security to None.

Not sure what this task is about. I am assuming better monitoring of the apertium-apy service. We have icinga for alerting and it can perform HTTP queries, but I will need the exact HTTP request (I assume it is a GET) required.

On apertium.org we just have a cron job that tries a simple translation on all pairs and shoots off an email if anything doesn't translate the way it used to. I'm guessing this task is for something equivalent? (Maybe using https://cxserver.wikimedia.org/translation/ )
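
A minimal sketch of that kind of check, for illustration only. It assumes an APY-style `/translate` endpoint taking `langpair` and `q` and returning JSON with `responseData.translatedText`; the `EXPECTED` table and all function names are hypothetical, and mailing the failures is left to cron.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical expectations: (langpair, input text) -> expected output.
# Real entries would be derived from the installed language pairs.
EXPECTED = {
    ("eng|spa", "house"): "casa",
}


def translate_url(base, langpair, text):
    """Build a GET URL for an APY-style /translate endpoint (parameter names assumed)."""
    query = urlencode({"langpair": langpair, "q": text})
    return base.rstrip("/") + "/translate?" + query


def http_translate(base, langpair, text, timeout=10):
    """Fetch one translation; assumes the JSON shape responseData.translatedText."""
    with urlopen(translate_url(base, langpair, text), timeout=timeout) as resp:
        body = json.load(resp)
    return body["responseData"]["translatedText"]


def run_checks(translate, expected):
    """Return a list of readable failures; an empty list means every pair is healthy."""
    failures = []
    for (langpair, text), want in sorted(expected.items()):
        try:
            got = translate(langpair, text)
        except Exception as exc:  # network error, bad JSON, missing key, ...
            failures.append(f"{langpair}: {text!r} raised {exc!r}")
            continue
        if got.strip() != want:
            failures.append(f"{langpair}: {text!r} -> {got!r}, expected {want!r}")
    return failures
```

Passing the `translate` callable in keeps the comparison logic testable without a running service; in a cron job it would be something like `lambda lp, t: http_translate(base_url, lp, t)`, with a non-empty result triggering the email.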

Arrbee moved this task from CX6 to CX8 candidates on the ContentTranslation board. Oct 8 2015, 10:00 AM
Arrbee moved this task from CX8 candidates to CX7 on the ContentTranslation board.
Amire80 edited a custom field. Oct 24 2015, 12:48 PM
Amire80 moved this task from CX7 to CX8 on the ContentTranslation board. Jan 24 2016, 10:28 PM
Amire80 edited a custom field.
Amire80 added a subscriber: Amire80.

Is this related to service-runner in any way? Is it still relevant?

Service-runner is nodejs-specific and apertium-apy is python, so technically no. However, I am guessing you are also referring to the benefits that the service-runner/service::node combo provides, which are quite substantial as far as healthchecks go, as I am sure you've seen with the cxserver software. I am still hazy as to what this task refers to, though. I can offer some input on things I'd like to see happening on our infrastructure, like:

  • basic TCP checks
  • monitoring of HTTP endpoints
  • inter-language translation of a clearly defined set of texts, as @Unhammer noted above

All of the above happens via service-checker, which is a crucial part of service::node but is language-agnostic and can be applied generically to any service. Service-checker performs a wide range of healthchecks based on the swagger spec advertised by the service. I am unsure, however, how we can integrate this into apertium.
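
To make the idea of spec-driven checks concrete, here is a rough sketch and not the actual checker. It assumes each operation in the spec carries example requests and expected responses under an `x-amples` key; that key, the field names, the sample `SPEC` and the `fetch` callable are all assumptions made for illustration.

```python
def iter_examples(spec):
    """Yield (METHOD, path, example) for every example declared in the spec."""
    for path, operations in spec.get("paths", {}).items():
        for method, operation in operations.items():
            if not isinstance(operation, dict):
                continue  # skip path-level entries such as shared parameter lists
            for example in operation.get("x-amples", []):
                yield method.upper(), path, example


def matches(expected, actual):
    """True if every key/value in `expected` also appears in `actual`, recursively."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and matches(value, actual[key])
            for key, value in expected.items()
        )
    return expected == actual


def check_spec(spec, fetch):
    """Run each example through fetch(method, path, query) -> (status, body)."""
    failures = []
    for method, path, example in iter_examples(spec):
        request = example.get("request", {})
        want = example.get("response", {})
        status, body = fetch(method, path, request.get("query", {}))
        if status != want.get("status", 200):
            failures.append(f"{method} {path}: status {status}")
        elif not matches(want.get("body", {}), body):
            failures.append(f"{method} {path}: unexpected body")
    return failures


# Hypothetical spec fragment: one translation example for a /translate endpoint.
SPEC = {
    "paths": {
        "/translate": {
            "get": {
                "x-amples": [{
                    "request": {"query": {"langpair": "eng|spa", "q": "house"}},
                    "response": {
                        "status": 200,
                        "body": {"responseData": {"translatedText": "casa"}},
                    },
                }],
            },
        },
    },
}
```

Because the examples live next to the endpoint definitions, the checker itself never needs to know which language the service is written in; it only needs the spec and a way to issue HTTP requests.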

@KartikMistry, any input on this?

https://github.com/goavki/apertium-apy/blob/master/tools/sanity-test-apy.py is the script we use which just tries some very simple translations on our installed language pairs. Since it's basically doing a bunch of "curl" calls, the tests could just as well be written in node.js or what have you.

There's also http://apy.projectjj.com/stats in the newest git version of apy, but we don't actively check that; it's just something we thought might come in handy (and if there's anything you think would be useful in there, we'd be happy to add it if possible).

@Unhammer, we can definitely use the stats and easily write a plugin for diamond, allowing us to ship data to our graphite cluster and create nice graphs right next to the ones in https://grafana.wikimedia.org/. It is a great addition, thanks!
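
As a rough illustration of the data-shipping half (this is not a diamond collector), the sketch below flattens a nested stats document into Graphite plaintext lines of the form `path value timestamp`. It assumes the stats endpoint returns nested JSON with numeric leaves; the `apertium.apy` prefix and the function names are made up for the example.

```python
import time


def flatten(stats, prefix):
    """Yield (metric_path, value) for every numeric leaf in a nested dict."""
    for key, value in sorted(stats.items()):
        # Graphite treats dots as path separators and spaces as field
        # separators, so replace both inside individual keys.
        safe_key = str(key).replace(".", "_").replace(" ", "_")
        path = f"{prefix}.{safe_key}"
        if isinstance(value, dict):
            yield from flatten(value, path)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield path, value
        # Non-numeric leaves (strings, lists, booleans) are skipped.


def graphite_lines(stats, prefix="apertium.apy", timestamp=None):
    """Render stats as Graphite plaintext-protocol lines."""
    ts = int(time.time() if timestamp is None else timestamp)
    return [f"{path} {value} {ts}" for path, value in flatten(stats, prefix)]
```

A collector would fetch the stats JSON periodically and hand the numeric values to whatever publishes metrics; the flattening step is the only part that depends on the shape of the stats document.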

As far as sanity-test-apy.py goes, it looks pretty interesting. We have https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/service/files/checker.py for nodejs applications that have a swagger spec like the one in https://cxserver.wikimedia.org/v1/?spec (cxserver is powered partly by apertium, in case you were not aware, btw). Given that it follows a swagger spec, it's language-agnostic.

This allows us to couple the monitoring to the API endpoint advertisement and thus makes it more difficult for monitoring to deviate from what the application actually does. It has proven to be quite robust.

That being said, I am unsure how much sense it makes for apertium to ship a swagger spec describing the API apertium-apy provides. From my point of view, it probably does, but I have no estimate of how much work that is. What do you think?

Unhammer added a comment. Edited Feb 9 2016, 4:27 PM

If it can be done incrementally (e.g. we can get something useful already by just starting with the /translate endpoint and then add stuff as we have time), then it sounds like a good idea to me: https://github.com/goavki/apertium-apy/issues/12

KartikMistry closed this task as Invalid. Apr 20 2016, 9:28 AM
KartikMistry updated the task description.
KartikMistry reopened this task as Open.
Amire80 moved this task from CX8 to Bugs on the ContentTranslation board. Apr 20 2016, 1:11 PM
Unhammer added a comment. Edited May 30 2016, 8:01 AM

A student contributor just made a first version of a spec =D
https://github.com/goavki/apertium-apy/issues/12#issuecomment-222332011
If you're familiar with Swagger, could you take a look at the questions raised there?

@mobrovac @Pchelolo Any chance you can help with the questions @Unhammer raised?

For the concrete use case in WMF prod, since CXServer already exposes its monitoring spec and uses Apertium behind the scenes, it would be trivial to improve it and add translation checks that exercise Apertium, wouldn't it?

Maybe, but that would mean that:

  • we would only be exercising the subset of apertium functionality we currently use, with the obvious drawback that newly used apertium functionality could go unexercised if the spec falls out of sync. That risk is considerably lower with an apertium-provided spec.
  • we would make the monitoring of one service depend on the monitoring of another, creating a clearly unneeded dependency between the two and probably obfuscating error reporting from apertium, so that effort is wasted trying to find out what's wrong with cxserver while apertium is the one with problems. We've already seen that with citoid/zotero and it clearly is not the best way forward.

So I still think using the apertium-provided spec mentioned above is probably worth more.