
dbstore1007 is swapping heavily, potentially soon killing mysql services due to an OOM error
Closed, Resolved · Public · BUG REPORT

Description

dbstore1007 is using 96% of its total memory: https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=4&orgId=1&var-server=dbstore1007&var-datasource=thanos&var-cluster=misc&from=1623138837938&to=1631516764499

It is frequently swapping:
https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=18&orgId=1&var-server=dbstore1007&var-datasource=thanos&var-cluster=misc&from=1623740593737&to=1631516593737&refresh=30s

which not only degrades its performance, but also carries the danger of the OOM killer activating and killing a mysql daemon.

I recommend researching a possible memory leak on those servers and/or restarting some instances to prevent the killing.
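One quick way to confirm that it is the mysqld processes being swapped out is to sum the VmSwap fields from their /proc/&lt;pid&gt;/status files. A minimal sketch (the helper and the sample data are illustrative; on the host you would feed it the real status files, e.g. `for pid in $(pgrep mysqld); do cat /proc/$pid/status; done | sum_vmswap`):

```shell
#!/bin/sh
# Sum VmSwap (in kB) across /proc/<pid>/status-style input.
# Illustrative helper, not a tool that exists on these hosts.
sum_vmswap() {
  awk '/^VmSwap:/ { total += $2 } END { print total + 0 }'
}

# Example with sample data standing in for two mysqld processes:
printf 'VmSwap:\t1024 kB\nVmSwap:\t2048 kB\n' | sum_vmswap   # prints 3072
```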

Event Timeline

This is not the first time it happens, and seems specific to analytics dbs: T270112

Change 720739 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] dbstore1007: Decrease buffer_pool_sizes

https://gerrit.wikimedia.org/r/720739

Change 720739 merged by Marostegui:

[operations/puppet@production] dbstore1007: Decrease buffer_pool_sizes

https://gerrit.wikimedia.org/r/720739

I have merged the above patch to decrease the mysql buffer pool sizes for all the instances. This requires mysql restarts. Please do so, or let me know when I can do it.
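The actual values live in the Gerrit change above; purely as an illustrative sketch (the file path and size are hypothetical, not taken from the patch), a per-instance buffer pool decrease is a change of this shape:

```ini
# Hypothetical fragment; the real values are in the puppet change (Gerrit 720739).
# One config per instance, e.g. for the s3 section (path is an assumption):
[mysqld]
# Decreased so that the three instances plus the OS page cache fit in RAM
# without pushing the host into swap.
innodb_buffer_pool_size = 50G
```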

odimitrijevic triaged this task as High priority.
odimitrijevic moved this task from Incoming to Operational Excellence on the Analytics board.

Mentioned in SAL (#wikimedia-analytics) [2021-09-13T18:13:00Z] <razzi> razzi@dbstore1007:~$ sudo systemctl restart mariadb@s3.service for T290841

Mentioned in SAL (#wikimedia-operations) [2021-09-13T18:13:13Z] <razzi> razzi@dbstore1007:~$ sudo systemctl restart mariadb@s3.service for T290841

Mentioned in SAL (#wikimedia-analytics) [2021-09-13T18:19:28Z] <razzi> razzi@dbstore1007:~$ sudo systemctl restart mariadb@s4.service for T290841

Mentioned in SAL (#wikimedia-analytics) [2021-09-13T18:24:57Z] <razzi> razzi@dbstore1007:~$ for socket in /run/mysqld/*; do sudo mysql --socket=$socket -e "START SLAVE"; done - reenable replication for T290841

Mentioned in SAL (#wikimedia-operations) [2021-09-13T18:25:37Z] <razzi> reenable replication on dbstore1007 for T290841

Restarting the 3 mysqld sections brought memory down to a reasonable 14% usage. It's possible that something is leaking memory, however, and this isn't the last we'll see of this situation.
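The per-section restart sequence logged above can be sketched as a small helper. This is a reconstruction, not the exact commands run: the socket naming under /run/mysqld and the DRY_RUN guard (added so the sequence can be previewed safely) are assumptions.

```shell
#!/bin/sh
# Sketch of the per-section restart used above.
# run: execute the command, or just print it when DRY_RUN=1.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi; }

restart_section() {
  section="$1"
  sock="/run/mysqld/mysqld.${section}.sock"           # socket path is an assumption
  run sudo mysql --socket="$sock" -e "STOP SLAVE"     # pause replication cleanly
  run sudo systemctl restart "mariadb@${section}.service"
  run sudo mysql --socket="$sock" -e "START SLAVE"    # re-enable replication
}

DRY_RUN=1 restart_section s3   # preview the commands without executing them
```

With DRY_RUN=1 the helper only echoes the three commands, which makes the sequence reviewable before running it against s3, s4, and s8 in turn.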

odimitrijevic moved this task from Incoming to Ops on the Data-Engineering board.
odimitrijevic added a subscriber: razzi.

This has occurred again on dbstore1007.

image.png (797×1 px, 121 KB)

I will do some investigation to see if I can find out where the memory leak might be, but if I can't get anywhere I will have to restart the three mariadb sections again.

@BTullis mariadb 10.4.22 has fixed some memory leaks, which might or might not be related to this. If you want, I can try to install it now (or whenever you tell me it is a good moment to restart mariadb).

@Marostegui - That sounds like a great idea to me. I think that now would be a good time to try this upgrade to 10.4.22.
It looks like they're not being heavily used at the moment, according to Grafana, so I say go for it if you have the time.

Sorry, I commented after Manuel already had.


Ok, I will go for it now.

Mentioned in SAL (#wikimedia-operations) [2021-11-18T12:15:53Z] <marostegui> Upgrade dbstore1007 to 10.4.22 T290841 T295970

Upgrade done, replication started.

Great, thanks. I'll try to keep an eye on this graph for the next few months to see if it's resolved the issue fully.

Never mind, I was looking at the wrong graph. It keeps increasing; we'll see if it stabilizes at some point.

At this point in my career, I have reached a state of zen and accept memleaks as a fact of life. I suggest we just restart those hosts from time to time as part of usual maintenance (OS upgrades, security updates, mariadb minor and major upgrades, etc.).

BTullis claimed this task.

Thanks @Ladsgroup for the reflection. Sadly, at this point in my career I have yet to achieve these levels of zen, and the killing of processes due to memory leaks like this still causes me a degree of anguish.
To misquote Dylan Thomas:

Rage, rage against the dying of the bytes.

However, on this occasion, I think you're probably right and we should just take the pragmatic decision to restart it as required.
The rate of increase in memory usage is now so slow that normal maintenance restarts will occur more frequently than any likely incident of memory exhaustion and associated swapping, so I'll resolve this ticket.

image.png (922×1 px, 157 KB)

While searching for other things on MariaDB's JIRA, I saw there are one or more bugs related to performance_schema memory leaks on MariaDB (this seems to be specific to MariaDB, and not happening on MySQL). It was reported that disabling P_S didn't make the leak fully disappear, but it did make it much slower. I would advise against doing that on production mw hosts, as P_S is such a great debugging tool, but maybe it is something that could be considered for the analytics dbs? Because the analytics dbs have such different query patterns (long-running queries), it would make sense that they are more affected. Alternatively, check if there are active events/cron jobs using it that could be disabled. Just a suggestion that seemed relevant; you don't have to listen to me.

I always listen to your suggestions @jcrespo :-)
Do you happen to have a handy link to any of those MariaDB bug reports about the performance_schema memory leaks please?

I'm not aware of any active jobs that use the feature, but I'd be happy to try turning it off to find out:

  • if any users complain that their jobs no longer work
  • if it slows or stops the memory leak

Maybe we could run the dbstore servers for a few months with this feature disabled, just to test the hypothesis.

@Ottomata - can you see any issues with this? Do you know of any active jobs that make use of the performance_schema feature on the dbstore hosts?

Not that I know of! But I probably wouldn't know either! :)

> Do you happen to have a handy link to any of those MariaDB bug reports about the performance_schema memory leaks please?

I cannot find the exact one right now, but these are related (I was monitoring some of these for mediawiki production):
https://jira.mariadb.org/browse/MDEV-24417
https://jira.mariadb.org/browse/MDEV-20933
https://jira.mariadb.org/browse/MDEV-23936

Heads up to @Ladsgroup about https://jira.mariadb.org/browse/MDEV-12205 which is unrelated to analytics, but could hit production.

If performance_schema is disabled, please consider enabling the user_stats plugin, which is a poor man's P_S.
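A hedged sketch of what that combined change could look like in an instance's config (the variable names are the standard MariaDB ones; whether this is appropriate for the dbstore hosts is exactly what the proposed test would establish, and both changes require a mariadb restart to take effect):

```ini
[mysqld]
# Disable performance_schema to test the memory-leak hypothesis.
performance_schema = OFF
# Enable user/table statistics as a lightweight substitute; results are
# exposed via INFORMATION_SCHEMA tables such as USER_STATISTICS.
userstat = ON
```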

@jcrespo Thanks, but it seems that only happens with write queries that have a max execution time set (unlike our system, where only read queries have one). I make sure we don't add a max time to our write queries, and it's not needed, since mw is good at killing slow write queries (unlike slow read queries).