
MariaDB: Follow recommended memory suggestions
Closed, Resolved · Public

Description

Following https://mariadb.com/kb/en/mariadb-memory-allocation

We should bump the RAM of both the primary and the secondary to 4GB and allocate 70% of that to innodb_buffer_pool_size.

Let's try this and see if it reduces our unusual problems.
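
For reference, a minimal sketch (Python, not part of the deploy repo) of what the 70% rule works out to, assuming the 4GB figure above is the per-pod memory request:

```python
# Sketch: compute the buffer pool size implied by the 70% rule from the
# MariaDB memory allocation docs, assuming 4GiB requested per pod
# (primary and secondary alike) as proposed in this task.
def suggested_buffer_pool_bytes(total_ram_bytes: int, fraction: float = 0.7) -> int:
    return int(total_ram_bytes * fraction)

RAM = 4 * 1024**3  # 4GiB per pod (assumption from this task)
print(f"innodb_buffer_pool_size ~= {suggested_buffer_pool_bytes(RAM) / 1024**2:.0f}M")
# -> roughly 2867M, i.e. ~2.8GiB of the 4GiB request
```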

Event Timeline

As an initial attempt to understand how these things correlate, we kicked over the secondary pod and will monitor the situation for a bit before making this change.

PR for staging: https://github.com/wmde/wbaas-deploy/pull/433

Note: This probably won't work unless we increase the staging cluster nodes first!

Could we somehow do it on local also?

dang removed dang as the assignee of this task. Jun 21 2022, 3:36 PM
dang subscribed.

There was quite some debate about whether raising these values in this way makes sense.

For example, would 2GB of total requested memory actually be sufficient? And what impact would tweaking this buffer have on the system?

@toan suggested tracking the numbers mentioned in https://mariadb.com/kb/en/innodb-buffer-pool/#innodb_buffer_pool_size, specifically whether, over time, innodb_buffer_pool_reads grows by less than 1% of the growth in innodb_buffer_pool_read_requests. If the outcome of T310697 looks promising, we could monitor these numbers relatively easily. I enabled the metrics sidecar locally by tweaking the chart values; the metrics are then clearly visible by connecting to the primary or secondary sql service on port 9104. For example:

# HELP mysql_global_status_innodb_buffer_pool_read_requests Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_innodb_buffer_pool_read_requests untyped
mysql_global_status_innodb_buffer_pool_read_requests 24211
# HELP mysql_global_status_innodb_buffer_pool_reads Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_innodb_buffer_pool_reads untyped
mysql_global_status_innodb_buffer_pool_reads 1223
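
A possible way to watch that ratio, sketched in Python against the sidecar metrics above (the localhost URL assumes a port-forward to the sql service, and the five-minute interval is arbitrary):

```python
# Rough sketch of the check suggested above: sample the exporter on port
# 9104 twice and see whether the growth in innodb_buffer_pool_reads
# (disk reads, i.e. buffer pool misses) stays under 1% of the growth in
# innodb_buffer_pool_read_requests.
import time
import urllib.request

EXPORTER_URL = "http://localhost:9104/metrics"  # assumed port-forward to the sql service

def sample(name: str) -> float:
    """Return the current value of a single metric from the exporter."""
    with urllib.request.urlopen(EXPORTER_URL) as resp:
        for line in resp.read().decode().splitlines():
            if line.startswith(name + " "):
                return float(line.split()[1])
    raise KeyError(name)

def miss_growth_ratio(interval_s: int = 300) -> float:
    reads_1 = sample("mysql_global_status_innodb_buffer_pool_reads")
    reqs_1 = sample("mysql_global_status_innodb_buffer_pool_read_requests")
    time.sleep(interval_s)
    reads_2 = sample("mysql_global_status_innodb_buffer_pool_reads")
    reqs_2 = sample("mysql_global_status_innodb_buffer_pool_read_requests")
    return (reads_2 - reads_1) / max(reqs_2 - reqs_1, 1)

if __name__ == "__main__":
    ratio = miss_growth_ratio()
    print(f"miss growth ratio: {ratio:.4%} (buffer pool looks big enough if < 1%)")
```

In practice this would presumably just become a Grafana panel once T310697 is in place; the sketch is only to make the 1% threshold concrete.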

In my opinion it would still make sense to go ahead and merge this patch as is; in a few weeks or months, once the performance of these buffers is more observable, we can think about tweaking the values up or down.

Seems to be happily deployed. It is notable that, right now, every change like this does "cause disruption", i.e. the wikis go down for a short period.