Page MenuHomePhabricator

Apertium leaves a ton of stale processes, consumes all the available memory
Closed, ResolvedPublic1 Estimated Story Points

Description

Apertium opens a ton of processes that eat up a ton of RAM on sca1001.

root@sca1001:~# ps -u apertium -fa | wc -l
4383

most of these processes are even one week old, some are fresh. All the ones I straced were listening on a FIFO pipe open with the apertium process, which suggests me a subprocess call from apertium that is not handled correctly.

This is a serious problem as those processes are eating up a sizeable amount of ram.

graph.png (280×577 px, 20 KB)

Event Timeline

Joe raised the priority of this task from to Needs Triage.
Joe updated the task description. (Show Details)
Joe subscribed.
Joe triaged this task as High priority.Jul 29 2015, 10:42 AM
Joe set Security to None.
Joe raised the priority of this task from High to Unbreak Now!.Jul 29 2015, 10:52 AM

I raised the priority as sca1001 was swapping to OOM death since forever, probably.

I suggest we add some protection, like a memory cgroup for use by apertium, or some firejail limitation if possible.

Also, I suggest to move cxserver (and all its accessory services) to a dedicated node as an alternative.

The pretty amaizing effect of restarting apertium on sca1001

graph.png (280×577 px, 17 KB)

Thanks @Joe I'm in process of updating Apertium in Debian, so that we can use fresh packages from backport and can easily debug issue with upstream.

I had discussion with upstream and restarting apertium-apy is one solution when memory leaks occur. It isn't ideal solution until we update Apertium (both apertium and apertium-es-gl) though.

Stale processes left behind by apertium-apy is clearly a bug and needs to be resolved. We should work with upstream and solve this for sure. Restarting apertium regularly is a poor man's solution. It will work, but we actually need to fix the bug, whatever it is.

@KartikMistry, any insights as to how apertium-apy starts this processes and how it uses them ? I can make some assumptions (forks and python https://docs.python.org/2/library/subprocess.html calls, pipes to communicate with them and never terminating child processes but some clear input would be greatly). A pstree is particularly disconcerting

|-python3-+-python3-+-apertium-interc
|         |         |-apertium-postch
|         |         |-apertium-pretra
|         |         |-apertium-tagger
|         |         |-2*[apertium-transf]
|         |         `-3*[lt-proc]
|         |-3*[python3-+-apertium-interc]
|         |            |-apertium-postch]
|         |            |-2*[apertium-pretra]]
|         |            |-2*[apertium-tagger]]
|         |            |-3*[apertium-transf]]
|         |            `-6*[lt-proc]]
|         |-python3-+-7*[apertium-interc]
|         |         |-3*[apertium-postch]
|         |         |-4*[apertium-pretra]
|         |         |-4*[apertium-tagger]
|         |         |-7*[apertium-transf]
|         |         `-10*[lt-proc]
|         |-2*[python3-+-4*[apertium-interc]]
|         |            |-2*[apertium-postch]]
|         |            |-2*[apertium-pretra]]
|         |            |-2*[apertium-tagger]]
|         |            |-4*[apertium-transf]]
|         |            `-5*[lt-proc]]
|         |-python3-+-3*[apertium-interc]
|         |         |-apertium-postch
|         |         |-2*[apertium-pretra]
|         |         |-2*[apertium-tagger]
|         |         |-3*[apertium-transf]
|         |         `-5*[lt-proc]
|         |-3*[python3-+-5*[apertium-interc]]
|         |            |-3*[apertium-postch]]
|         |            |-4*[apertium-pretra]]
|         |            |-4*[apertium-tagger]]
|         |            |-7*[apertium-transf]]
|         |            `-11*[lt-proc]]
|         |-4*[python3-+-4*[apertium-interc]]
|         |            |-2*[apertium-postch]]
|         |            |-4*[apertium-pretra]]
|         |            |-4*[apertium-tagger]]
|         |            |-6*[apertium-transf]]
|         |            `-11*[lt-proc]]
|         |-python3-+-5*[apertium-interc]
|         |         |-3*[apertium-postch]
|         |         |-4*[apertium-pretra]
|         |         |-4*[apertium-tagger]
|         |         |-6*[apertium-transf]
|         |         |-cg-proc
|         |         `-11*[lt-proc]
|         |-python3-+-4*[apertium-interc]
|         |         |-2*[apertium-postch]
|         |         |-3*[apertium-pretra]
|         |         |-3*[apertium-tagger]
|         |         |-5*[apertium-transf]
|         |         `-8*[lt-proc]
|         |-python3-+-6*[apertium-interc]
|         |         |-4*[apertium-postch]
|         |         |-6*[apertium-pretra]
|         |         |-6*[apertium-tagger]
|         |         |-8*[apertium-transf]
|         |         |-cg-proc
|         |         `-17*[lt-proc]
|         |-2*[python3-+-7*[apertium-interc]]
|         |            |-5*[apertium-postch]]
|         |            |-8*[apertium-pretra]]
|         |            |-8*[apertium-tagger]]
|         |            |-11*[apertium-transf]]
|         |            |-2*[cg-proc]]
|         |            `-23*[lt-proc]]
|         |-2*[python3-+-8*[apertium-interc]]
|         |            |-6*[apertium-postch]]
|         |            |-10*[apertium-pretra]]
|         |            |-10*[apertium-tagger]]
|         |            |-13*[apertium-transf]]
|         |            |-2*[cg-proc]]
|         |            `-29*[lt-proc]]
|         |-python3-+-7*[apertium-interc]
|         |         |-5*[apertium-postch]
|         |         |-8*[apertium-pretra]
|         |         |-8*[apertium-tagger]
|         |         |-11*[apertium-transf]
|         |         |-cg-proc
|         |         `-23*[lt-proc]
|         |-2*[python3-+-9*[apertium-interc]]
|         |            |-7*[apertium-postch]]
|         |            |-12*[apertium-pretra]]
|         |            |-12*[apertium-tagger]]
|         |            |-15*[apertium-transf]]
|         |            |-2*[cg-proc]]
|         |            `-35*[lt-proc]]
|         |-3*[python3-+-8*[apertium-interc]]
|         |            |-6*[apertium-postch]]
|         |            |-11*[apertium-pretra]]
|         |            |-11*[apertium-tagger]]
|         |            |-14*[apertium-transf]]
|         |            |-2*[cg-proc]]
|         |            `-32*[lt-proc]]
|         |-python3-+-9*[apertium-interc]
|         |         |-7*[apertium-postch]
|         |         |-11*[apertium-pretra]
|         |         |-11*[apertium-tagger]
|         |         |-14*[apertium-transf]
|         |         |-2*[cg-proc]
|         |         `-32*[lt-proc]
|         |-python3-+-8*[apertium-interc]
|         |         |-6*[apertium-postch]
|         |         |-12*[apertium-pretra]
|         |         |-12*[apertium-tagger]
|         |         |-15*[apertium-transf]
|         |         |-2*[cg-proc]
|         |         `-35*[lt-proc]
|         |-python3-+-8*[apertium-interc]
|         |         |-6*[apertium-postch]
|         |         |-11*[apertium-pretra]
|         |         |-11*[apertium-tagger]
|         |         |-14*[apertium-transf]
|         |         |-cg-proc
|         |         `-32*[lt-proc]
|         `-python3-+-7*[apertium-interc]
|                   |-5*[apertium-postch]
|                   |-10*[apertium-pretra]
|                   |-10*[apertium-tagger]
|                   |-13*[apertium-transf]
|                   |-cg-proc
|                   `-29*[lt-proc]
  • firejailing apertium is clearly something we should do but it is not going to solve this problem.
  • cgrouping will mitigate the problem's effect but the problem itself will still be there. cgrouping it right now though will make the migration to systemd and jessie a bit more complicated.
  • Finding the bug and fixing it is something we should anyway do, regardless of any mitigating action we take.

I explained some of this on IRC, but repeating here for completeness: if APY is set to use "all available cores", then it starts one http server per core, and each http server gets to start all language pairs (as requests come in). Each language pair uses something like 7 processes (depending on the pair), and by default pipelines are kept open forever to make requests fast. So if you have 48 cores and 27 language pairs, you could end up with about 9.000 processes :-)

You can specify how many http servers to use with the -j argument, where -j0 is one server per core. On apertium.org (8 core, 32GB RAM) we just use -j1, and performance seems good. Given the existence of 48-core machines and the fact that -j1 works fine for apertium.org, I changed the default of -j to 1 in SVN revision 61363 of APY.

APY also already has an option to shut down language pairs that haven't been used in N seconds, with the -m/--max-idle-secs argument. If you run with -j>1 it might be worth specifying that. Since we use -j1 on apertium.org (and check every hour that language pairs respond, meaning lastUsage is always <1hr ago) we haven't used that there, but the feature seems to work in testing.

There's no feature yet to restart a pair after N usages (regardless of last usage); that would make sense if there are memory leaks in any of the individual processes used by the pair pipelines, but that hasn't been a problem for us yet (we do still restart APY every midnight just in case, so maybe it would be a problem if we didn't). It should be simple enough to implement something like that though.

Hello,

It's very good that you decided to repeat this here as well. It is very much appreciated.

So, in production, we got slightly more cores than 8 (32 to be exact) and 58 language pairs which means ~13000 processes if my math is correct. Which is a bit too high IMHO.

While I must say I am not sure of the performance gains of not shutting down language pairs anyway in the default case, I think we are going to experiment with the -m/--max-idle-secs argument. And of course -j as well. Starting with 1 and perhaps raising if problems arise. I am hoping we will not have bugs to report as far as this functionality goes. Given those 2 changes we might very well mitigate our problem. It will also mean that we might end up hiding memory leaks if language pairs get restarted often however. But it's a risk we will have to take anyway.

Again, thank you for posting this here. It's very helpful indeed.

I am not sure of the performance gains of not shutting down language pairs anyway in the default case

If you want to compare with shutting down each pipeline after every completed request, you can try this change:

diff --git a/servlet.py b/servlet.py
index 30290ae..0638e36 100755
--- a/servlet.py
+++ b/servlet.py
@@ -216,6 +216,9 @@ class TranslateHandler(BaseHandler):
         self.pipelines.pop(pair)
 
     def cleanPairs(self):
+        for pair in self.stats['useCount']:
+            logging.info('Shutting down pair %s-%s' % (pair[0], pair[1]))
+            self.shutdownPair(pair)
         if self.max_idle_secs:
             for pair, lastUsage in self.stats['lastUsage'].items():
                 if pair in self.pipelines and time.time() - lastUsage > self.max_idle_secs:

and testing before and after something like

#!/bin/bash
time for i in {1..10}; do curl 'http://localhost:2737/translate?langpair=nno|nob&q=ikkje+eg'; done

On my computer, that's 0m0.634s with pipelines open vs 0m3.950s when closing pipelines after every request.

(Though that change there will break things when you get concurrent requests, so don't try it in production ;-))

https://gerrit.wikimedia.org/r/#/c/230108/ is the first attempt to mitigate the issue. @Unhammer, I'll update apertium-apy after updating to latest upstream later in this week.

https://gerrit.wikimedia.org/r/#/c/230108/ has been merged. It effectively sets -j1 -m300 for apertium-apy in upstart/systemd. Now to measure the result over the course of a couple of days

#!/bin/bash
time for i in {1..10}; do curl 'http://localhost:2737/translate?langpair=nno|nob&q=ikkje+eg'; done

On my computer, that's 0m0.634s with pipelines open vs 0m3.950s when closing pipelines after every request.

(Though that change there will break things when you get concurrent requests, so don't try it in production ;-))

OK, that makes them clearer, thanks for posting them.

I've now implemented the above mentioned option -r to shut down a pipeline that has handled >N requests. Default is 1000.

I also took the time to add support for several pipelines per pair in the same server process, based on how much a pair is used. We typically have two or three very popular pairs, while the rest are rather rarely used, so something like "-j2" where we have two of each pair is a bit wasteful. Instead, you can now do

#!/bin/bash
./servlet.py -j1 -i2 -m300 -- /path/to/pairs

and if, say, spa→cat at some point gets >5 (change with the -u option) concurrent users, it'll spin up a second pipeline for that pair to handle the extra traffic, and turn it off again after 300 seconds of idleness (or 1000 requests).

You can also add -n1 to keep a minimum of one pipeline per pair open in that server process. This is now used on apertium.org (that is, -j1 -i2 -m300 -n1). See also ./servlet.py --help (or http://wiki.apertium.org/wiki/Apertium-apy ).

So if popular language pairs seem slow, you may want to set -i2 or maybe even -i3. I wouldn't increase -j unless the whole server seems to respond slowly regardless of request type (and there's lots of RAM).

Arrbee lowered the priority of this task from Unbreak Now! to High.Aug 12 2015, 7:07 AM
Arrbee edited a custom field.
Arrbee moved this task from Backlog to In Progress on the LE-CX6-Sprint 2 board.
Arrbee raised the priority of this task from High to Unbreak Now!.Aug 12 2015, 7:31 AM

@Arrbee can I understand why you raised the priority on this task?

btw, with the new settings we incur in a too many open files problem.

It somehow seems pipes are not being closed correctly from the main process?

# lsof -p 21379 | grep pipe | wc -l
1015
# cat /proc/21379/limits  | grep 'open files'
Max open files            1024                 4096                 files

Change 230983 had a related patch set uploaded (by Giuseppe Lavagetto):
apertium: raise the open files limit

https://gerrit.wikimedia.org/r/230983

"the new settings" being just -j1 -m300? Also, what SVN revision is this now? And did it keep increasing after the limits were increased?

I'm not able to reproduce any fd leak here at least. On apertium.org, there is a 21hr old servlet.py process with 587 open pipes, which seems about right for our 93 modes files (one per translation direction, about 7 processes per such), we're on SVN revision 61403 with args "-j1 -n1 -i2 -u3 -m300".

I also tried running

$ ./servlet.py -m1 -r5   --  /usr/local/share/apertium   

which should start and stop pairs lots of times, then sending lots of concurrent requests and looking at the pipe count:

$ lsof -p $(pgrep -f servlet.py)|grep -c pipe
2
$ source /tmp/make-lotsa-requests &  lsof -p $(pgrep -f servlet.py)|grep -c pipe;sleep 1;lsof -p $(pgrep -f servlet.py)|grep -c pipe
[1] 29964
12
[1]+  Done                    source /tmp/make-lotsa-requests
13
$ lsof -p $(pgrep -f servlet.py)|grep -c pipe
9

(That's 2 for servlet.py + 7 for nob→nno which I was testing.) Running the requests again without restarting the process gives the exact same pipe count.

@Unhammer We haven't updated Apertium-APY yet! So, that can be issue.

You might want to try running

tools/sanity-test-apy.py http://your.apy.endpoint.wikimedia.org

(from the APY folder) while reading the servlet.py logs to see if any pairs lead to crashes or something like that (maybe change the tests to reflect your installed pairs). The new APY version will also mention in the logs how many pipelines are "still scheduled for shutdown"; these are pipelines where there are still requests to them that haven't been finished. I did manage to get hanging requests by trying so many different language pairs at the same time that I ran out of memory and got bad_alloc crashes, but hopefully your server is configured so that's not possible :)

Change 230983 merged by Alexandros Kosiaris:
apertium: raise the open files limit

https://gerrit.wikimedia.org/r/230983

akosiaris renamed this task from Apertium leaves a ton of stale processes, consumes all the available to Apertium leaves a ton of stale processes, consumes all the available memory.Aug 13 2015, 8:17 AM
akosiaris lowered the priority of this task from Unbreak Now! to High.Aug 13 2015, 8:22 AM

Lowering back to high priority. The problem still seems to exist though. Looking at:

http://ganglia.wikimedia.org/latest/graph.php?r=week&z=xlarge&h=sca1001.eqiad.wmnet&m=cpu_report&s=descending&mc=2&g=mem_report&c=Service+Cluster+A+eqiad

After restarting apertium with the -j1 -m300 flags, memory usage did reach a plateau of 30G but after merging the open files limit change yesterday and restarting apertium again, all available memory is consumed once more. Not sure however if the open files limit was acting as a choke and raising it allowed to problem to exhibit itself again.

root@sca1001:~# ps -u apertium -fa | wc -l
3202

My math doesn't add up to that number. 58 pairs * 7 processes * 1 core => 406 processes, not 3202.

Some more info

akosiaris@sca1001:/var/log/upstart$ sudo grep 'nno-nob' apertium-apy.log
[I 150813 07:34:03 servlet:221] nno-nob not in pipelines of this process, starting …
[I 150813 07:34:03 servlet:221] nno-nob not in pipelines of this process, starting …
[I 150813 08:03:01 servlet:216] Shutting down pair nno-nob since it has not been used in 300 seconds
akosiaris@sca1001:/var/log/upstart$ ps auxwww |grep nno-nob |wc -l
217
akosiaris@sca1001:/var/log/upstart$ date
Thu Aug 13 08:45:13 UTC 2015

@KartikMistry, @Unhammer, is it me or the pair has not been really shutdown ?

:-( we have 526 procs for our user, running 93 translation modes. Does the servlet.py output tell you anything useful?

e.g. if you do

egrep -v '(GET|POST)' the.log |grep '^\['

Also, is the python3/servlet.py process itself consuming lots of memory? (We had a bug before with servlet.py itself consuming lots of memory, but that was due to not flushing stored unknown words often enough, I gather you're not using the -f option.)

That one did not return anything. Removing the extra grep did return something

akosiaris@sca1001:/var/log/upstart$ sudo egrep -v '(GET|POST)' apertium-apy.log 
[I 150813 06:41:30 servlet:221] spa-cat not in pipelines of this process, starting …
[I 150813 06:41:30 servlet:216] Shutting down pair eng-cat since it has not been used in 300 seconds
[I 150813 06:41:30 servlet:216] Shutting down pair eng-spa since it has not been used in 300 seconds
[I 150813 06:41:30 servlet:221] spa-cat not in pipelines of this process, starting …
[I 150813 06:41:30 servlet:221] spa-cat not in pipelines of this process, starting …
[I 150813 06:41:30 servlet:221] spa-cat not in pipelines of this process, starting …
[I 150813 06:41:30 servlet:221] spa-cat not in pipelines of this process, starting …
[I 150813 06:41:30 servlet:221] spa-cat not in pipelines of this process, starting …
[I 150813 06:41:34 servlet:221] spa-cat not in pipelines of this process, starting …
[I 150813 06:49:19 servlet:221] fra-spa not in pipelines of this process, starting …
[I 150813 06:49:19 servlet:221] fra-spa not in pipelines of this process, starting …
[I 150813 06:49:19 servlet:221] fra-spa not in pipelines of this process, starting …
[I 150813 06:49:19 servlet:221] fra-spa not in pipelines of this process, starting …
[I 150813 06:49:19 servlet:221] fra-spa not in pipelines of this process, starting …
[E 150813 06:49:21 servlet:182] _worker problem
    Traceback (most recent call last):
      File "/usr/share/apertium-apy/servlet.py", line 178, in run_worker
        self._worker(*worker_args)
      File "/usr/share/apertium-apy/servlet.py", line 259, in _worker
        _, _, do_flush = self.pipelines[(l1, l2)]
    KeyError: ('fra', 'spa')
...
another 4 instances of that

Why 6 different efforts to try starting the pipeline btw ?

Also, is the python3/servlet.py process itself consuming lots of memory? (We had a bug before with servlet.py itself consuming lots of memory, but that was due to not flushing stored unknown words often enough, I gather you're not using the -f option.)

No, we are not using the -f option. In fact you can see exactly what do here:

https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/apertium/templates/initscripts/apertium-apy.upstart.erb;c29fa1b4dddc65c174131ff6381ed3d460562803$19

That being said

apertium 33980  2.6  0.0 3752024 40144 ?       Ss   Aug12  36:49 /usr/bin/python3 /usr/share/apertium-apy/servlet.py -j1 -m300 /usr/share/apertium/modes

so, ~3.7GB virtual and ~40MB resident memory usage.

A simple grep for nno-nob will also show processes of translation modes for nob-nno (which uses data from the same language folder, as well as minor variants like nob-nno_e but I guess there's no reason that'd be started).

does

ps -ef |grep nob-nno.automorf.bin$

show many processes? also what are their parent pids?

Oh, that is odd. It should not be starting more than one process per pair by default …

Wait, what version of apy is this?

[E 150813 06:49:21 servlet:182] _worker problem
    Traceback (most recent call last):
      File "/usr/share/apertium-apy/servlet.py", line 178, in run_worker
        self._worker(*worker_args)
      File "/usr/share/apertium-apy/servlet.py", line 259, in _worker
        _, _, do_flush = self.pipelines[(l1, l2)]
    KeyError: ('fra', 'spa')

refers to ancient code (removed last November!)

Wait, what version of apy is this?

dpkg -l apertium-apy | grep ^ii
ii  apertium-apy                        0.1+svn~57689-1                  amd64        Apertium APY service

A simple grep for nno-nob will also show processes of translation modes for nob-nno (which uses data from the same language folder, as well as minor variants like nob-nno_e but I guess there's no reason that'd be started).

does

ps -ef |grep nob-nno.automorf.bin$

show many processes? also what are their parent pids?

OOM showed up and killed various apertium related processes so I restarted apertium-apy to get the service back running more reliably than what we would have after the OOM event. Perhaps it was not needed and apertium would recover on its own but I was not sure. For now we don't have yet nob-nno pairs but we do have other pairs

akosiaris@sca1001:/var/log/upstart$ ps -ef |grep automorf
apertium 34909 34111  0 09:50 ?        00:00:00 lt-proc -z /usr/share/apertium/apertium-eo-en/en-eo.automorf.bin
apertium 35030 34111  0 09:52 ?        00:00:00 lt-proc -z /usr/share/apertium/apertium-es-ast/es-ast.automorf.bin
apertium 35031 34111  0 09:52 ?        00:00:00 lt-proc -z /usr/share/apertium/apertium-es-ast/es-ast.automorf.bin
apertium 35032 34111  0 09:52 ?        00:00:00 lt-proc -z /usr/share/apertium/apertium-es-ast/es-ast.automorf.bin
apertium 37970 34111  0 10:05 ?        00:00:00 lt-proc -z /usr/share/apertium/apertium-es-gl/es-gl.automorf.bin

So the es-ast pair started 3 times ? or is that me ?

-r57689 is almost a year old and used threading instead of select/polling; RAM usage was the main reason for the rewrite, but also easier process handling (pretty sure that explains the amount of pipelines started). Could you give -r61403 a try? It doesn't require toro any more, but does depend on Tornado >=4.2 and Python >=3.3

-r57689 is almost a year old and used threading instead of select/polling; RAM usage was the main reason for the rewrite, but also easier process handling (pretty sure that explains the amount of pipelines started). Could you give -r61403 a try? It doesn't require toro any more, but does depend on Tornado >=4.2 and Python >=3.3

I am not opposed to it for sure. That being said Tornado 4.2 is going to be problem. Is that a hard dependency ? trusty (on which we currently run) has 3.1.1 and jessie, to which we plan to migrate has 3.2.2. Not sure how easy it would be to backport 4.2 to work on our infrastructure

4.2 introduced the locks that we currently use – before that we depended on toro for the same stuff. I'm not sure how much else from >4 we require, but it would require some rewriting in any case.

So hard dependency. I was afraid of that (but always hoping!). We might be able to port python-tornado to both jessie and trusty. It's already in jessie-backports (4.2.0-1~bpo8+1) so it should be possible at least

Tried making it work with tornado 3; apy -r61424 seems to work on 3.2 which is what my Ubuntu machine installs.

EDIT: seems to work on my Trusty (tornado 3.1) testing machine as well.

By the way, I just included toro.py there as a single file, so no need to install it separately; dependencies should just be tornado >=3.1 now.

Change 231280 had a related patch set uploaded (by Alexandros Kosiaris):
Updated package to 0.1 svn~61425

https://gerrit.wikimedia.org/r/231280

Change 231280 merged by Alexandros Kosiaris:
Updated package to 0.1 svn~61425

https://gerrit.wikimedia.org/r/231280

Change merged, package built, uploaded on apt.wikimedia.org, servers upgraded, apertium-apy restarted. The above two links should be pretty telling in the next hours/days about whether the issue was solved or not.

Seems like the problem has been resolved. Memory consumption for the past week has been between 11.7G and 16.6G which is significantly less than previously. apertium running processes have also been way way lower. Currently measuring 10 on the server. I am resolving this as success. Thanks to @Unhammer and @KartikMistry for helping resolve this!