
dbstore1002 /srv filling up
Closed, Resolved · Public

Description

Hi,

dbstore1002 is slowly filling up:

root@dbstore1002:/srv/sqldata# df -hT /srv/
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.3T  5.6T  682G  90% /srv

It has been alerting over the last few months and I have been slowly increasing its volume, as there is still some space left:

root@dbstore1002:/srv/sqldata# pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/sda3  tank lvm2 a--  6.50t 245.64g

However, while we still have some time, it would be nice to come up with a better solution (maybe drop old unused tables?).

Event Timeline

elukey triaged this task as Medium priority.Jun 19 2017, 2:38 PM
elukey added a project: User-Elukey.
elukey raised the priority of this task from Medium to High.Jun 22 2017, 11:07 AM
elukey edited projects, added Analytics-Kanban; removed Analytics.

IIUC, du -hsc _log_* should list the space consumption of the log database tables, which is around 1.3T. This slave will be a target of T156933, so the eventlogging_cleaner script will likely help a lot.

There is one table on dbstore1002 that holds ~370M rows (~170GB in size): _Edit_11448630_old. This table is not present on db104[67], so it might be a good candidate to drop. Sent an email to analytics-internal and data analysts to get some feedback before proceeding.
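
A quick way to confirm a candidate like this (present on dbstore1002 but missing on db104[67], and how much it weighs) is an information_schema check on each replica. A minimal sketch, assuming the table lives in the log schema and that the local mysql client config gives access to each host; hostnames and credentials are illustrative:

# Sketch: check whether the candidate table exists on each analytics replica
# and how large it is. Hostnames/credentials are illustrative.
TABLE='_Edit_11448630_old'
for host in dbstore1002 db1046 db1047; do
  echo "== $host =="
  mysql -h "$host" -e "
    SELECT table_name, table_rows,
           ROUND((data_length + index_length)/1024/1024/1024, 1) AS size_gb
    FROM information_schema.TABLES
    WHERE table_schema = 'log' AND table_name = '$TABLE';"
done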

Table dropped after a consultation with data analysts and the analytics team.

Awesome!
Once you're ready to start purging old rows, we can try to optimize a couple of tables and see what happens with the claimed disk space.
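
For context: on InnoDB with innodb_file_per_table enabled, OPTIMIZE TABLE is mapped to a full table rebuild (ALTER TABLE ... FORCE), so it is the step that actually returns the purged space to the filesystem. A minimal sketch, using one of the tables optimized later in this task:

# Sketch: rebuild a purged EventLogging table so its .ibd file shrinks
# and the space goes back to the filesystem.
mysql log -e "OPTIMIZE TABLE Echo_7731316;"

# Rough estimate of what is reclaimable per table before rebuilding:
mysql -e "
  SELECT table_name, ROUND(data_free/1024/1024/1024, 1) AS free_gb
  FROM information_schema.TABLES
  WHERE table_schema = 'log'
  ORDER BY data_free DESC
  LIMIT 10;"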

Mentioned in SAL (#wikimedia-operations) [2017-07-14T11:41:16Z] <marostegui> Add 50G to /srv/ on dbstore1002 - T168303

This host reached 93% usage today:

Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.3T  5.8T  477G  93% /srv

I have added 50G more to the volume to avoid it paging during the weekend:

root@dbstore1002:~# df -hT /srv/
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.8T  536G  92% /srv

Let's run the purge script as soon as possible next week and optimize the tables to see how much we gain.
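
For reference, the volume bumps mentioned above presumably boil down to an online LVM extension followed by an XFS grow; a minimal sketch, with the VG/LV names taken from the df/pvs output above and the size purely illustrative:

# Sketch: carve another 50G out of the remaining PV free space and grow the
# mounted filesystem to use it (XFS can be grown while mounted).
lvextend -L +50G /dev/tank/data
xfs_growfs /srv
df -hT /srv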

From the last crash this host had (T170308), it looks like there are two temporary files that were generated during the ALTER and never cleaned up, and they are eating up ~44GB in total:

root@dbstore1002:/srv/tmp# ls -lhS
total 44G
-rw-rw---- 1 mysql mysql  25G Jul 14 12:04 #sql_5e0b_0.MAI
-rw-rw---- 1 mysql mysql  21G Jul 14 12:04 #sql_5e0b_0.MAD

They are still in use, so I reckon we would need to restart MariaDB on this host for them to be released before we could delete them:

root@dbstore1002:/srv/tmp# fuser "#sql_5e0b_0.MAI"
/srv/tmp/#sql_5e0b_0.MAI: 24075
root@dbstore1002:/srv/tmp# fuser "#sql_5e0b_0.MAD"
/srv/tmp/#sql_5e0b_0.MAD: 24075
root@dbstore1002:~# cat /srv/sqldata/dbstore1002.pid
24075

Actually, those are in use by:

      Id: 3616956
...
      db: log
 Command: Query
    Time: 6746
   State: Queried about 106270000 rows
    Info: select id,uuid,userAgent,count(*) as events from QuickSurveyInitiation_15278946 where timestamp like '201706%' group by userAgent,event_eventName,event_surveyCodeName,event_surveyInstanceToken,event_surveySessionToken having events > 1
Progress: 0.000

That query is what is responsible for the space shortage.
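
A server-side way to spot this kind of culprit (a long-running query against the log database whose on-disk temporary tables land in /srv/tmp) is a processlist check; a minimal sketch, with the time threshold purely illustrative:

# Sketch: list long-running queries on the log database; the ID column should
# match the connection holding the #sql_* temporary files in /srv/tmp.
mysql -e "
  SELECT ID, USER, TIME, STATE, LEFT(INFO, 80) AS query
  FROM information_schema.PROCESSLIST
  WHERE DB = 'log' AND COMMAND = 'Query' AND TIME > 600
  ORDER BY TIME DESC;"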

Started the eventlogging cleaner script for 2014 data on dbstore1002; we'll run OPTIMIZE queries afterwards to see how much space we gain back from the log database.

Mentioned in SAL (#wikimedia-operations) [2017-07-19T13:24:29Z] <marostegui> Optimize EditConflict_8860941_15423246 and Echo_7731316 on dbstore1002 - T168303

This is the graph for the last 30 days of disk usage:
https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&panelId=17&fullscreen&orgId=1&var-server=dbstore1002&var-network=bond0&from=1497879482633&to=1500471482634

It has consumed around 300G in the last month.
So far we have available:

root@dbstore1002:~# df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.9T  537G  92% /srv
root@dbstore1002:~# pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/sda3  tank lvm2 a--  6.50t 135.64g

At the current rate (~537G available on /srv plus ~135G still unallocated in the VG, versus ~300G consumed per month), that means only about two months of disk space, so we probably need to check for unused tables that can be removed as a more aggressive way to reclaim disk space.
I have also been checking which InnoDB tables aren't compressed on dbstore1002; a few wikis there have no InnoDB compression, so we could compress those as well. We won't get a huge amount of space back, but it's something.
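
Finding the uncompressed wikis and compressing them can be done along these lines. A minimal sketch: rowiki and the revision table are just examples, the 8K key block size is the common default rather than necessarily what was used here, and it assumes the Barracuda file format is enabled:

# Sketch: list InnoDB tables in a wiki database that are not compressed yet ...
mysql -e "
  SELECT table_name, row_format,
         ROUND((data_length + index_length)/1024/1024/1024, 1) AS size_gb
  FROM information_schema.TABLES
  WHERE table_schema = 'rowiki'
    AND engine = 'InnoDB'
    AND row_format <> 'Compressed'
  ORDER BY data_length + index_length DESC;"

# ... and rebuild one of them with InnoDB compression enabled.
mysql rowiki -e "ALTER TABLE revision ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;"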

Mentioned in SAL (#wikimedia-operations) [2017-07-19T13:46:17Z] <marostegui> Compress database rowiki on dbstore1002 - T168303

None of these wikis are InnoDB-compressed:

66G	viwiki
97G	ukwiki
32G	kowiki
82G	huwiki
43G	hewiki
48G	frwiktionary
69G	fawiki
131G	eswiki
52G	cawiki
145G	arwiki

So I will compress them.

rowiki has gone from 45G to around 17G, which is nice. I am going to compress all the wikis listed in T168303#3452667 now so we can get more space back, especially from the big ones.

Mentioned in SAL (#wikimedia-operations) [2017-07-19T16:16:18Z] <marostegui> Compressing innodb on dbstore1002 for the following wikis: viwiki ukwiki kowiki huwiki hewiki frwiktionary fawiki eswiki cawiki arwiki - T168303

If my calculations are correct, the top 20 tables of the log database on the analytics store/slave weigh ~2.6TB, so even cutting their size in half would be a massive win for us. I just sent an email to our data analysts to get their feedback/opinion.

In the meantime the eventlogging cleaner script is still running on 2014-and-older data, but since most of the biggest tables are whitelisted, we might not get a huge amount of space back.
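
The ~2.6TB estimate can be reproduced with a quick information_schema aggregation over the log schema; a sketch:

# Sketch: combined on-disk size (data + indexes) of the 20 largest log tables.
mysql -e "
  SELECT ROUND(SUM(sz)/1024/1024/1024/1024, 2) AS top20_size_tb FROM (
    SELECT data_length + index_length AS sz
    FROM information_schema.TABLES
    WHERE table_schema = 'log'
    ORDER BY sz DESC
    LIMIT 20
  ) t;"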

The compression for those uncompressed wikis finished and we've got some more room to breathe now:

root@dbstore1002:~# df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.5T  970G  86% /srv
root@dbstore1002:~# pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/sda3  tank lvm2 a--  6.50t 135.64g

However, we _really_ need to keep auditing and deleting that old data and nuking unused tables where possible.
https://grafana-admin.wikimedia.org/dashboard/file/server-board.json?refresh=1m&panelId=17&fullscreen&orgId=1&from=1500101868124&to=1500706668124&var-server=dbstore1002&var-network=bond0

It's at its usual pace of ~30G consumed per day, which leaves roughly a month of headroom at the current ~937G available:

[05:51:20] marostegui@dbstore1002:~$ df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.5T  937G  86% /srv

In https://phabricator.wikimedia.org/T170720 we are planning to move one big table (~500GB) to HDFS and drop it from mysql.

That would be great!!! :-)
Thanks

We are discussing moving ~1TB of data to HDFS in https://phabricator.wikimedia.org/T172322, which is more great news :)

I will be removing around 120G from m3 instance from dbstore1002 (T156758#3514245)

Done, and around 100G are back:
https://grafana.wikimedia.org/dashboard/file/server-board.json?refresh=1m&panelId=17&fullscreen&orgId=1&var-server=dbstore1002&var-network=bond0&from=1502370489229&to=1502371282750

Current state:

root@dbstore1002:/home/marostegui# df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.6T  869G  87% /srv

And here we are again:

root@dbstore1002:/srv# df -hT /srv/
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.9T  556G  92% /srv

Mentioned in SAL (#wikimedia-operations) [2017-08-29T08:06:37Z] <elukey> drop log.MobileWebUIClickTracking_10742159_15423246 from dbstore1002 to free space (table archived on HDFS) - T172322 T168303

Back to acceptable levels:

/dev/mapper/tank-data  6.4T  5.5T  946G  86% /srv

@Marostegui: In T172291#3564966, you reported that disk usage on dbstore1002 had increased to 91%. That increase took a little over 16 hours. Do we know what caused it?

At this point we believe it was caused by a temporary table being used by a query.

And we are over 90% again, so it seems we need to find a more permanent solution to this problem. This is a proposal from the Analytics team:

  1. Free some space on dbstore1002 dropping some EventLogging tables that can be archived to HDFS. This should resolve the problem in the immediate term.
  2. Set up the db104[67] replacements and configure the new analytics-slave to replicate only the log database (see the replication-filter sketch below).
  3. Announce to researchers that the log database will be available only on the more powerful host (the new analytics-slave).
  4. Stop eventlogging_sync on dbstore1002 and after all the data sanity checks, drop the log database.

The final goal would be to have one host to query for the log database and one for the wikis (analytics-slave vs analytics-store). From what we (as the Analytics team) know, joins between the log database and the other wiki replicas are very rare, so it shouldn't be a big deal to split the wikis and log databases (please let us know otherwise).
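
For step 2 above, restricting the new analytics-slave to the log database can be done with standard replication filters. A minimal sketch of the my.cnf side only, not the actual puppet change; the file path is illustrative, and on a multi-source host the filter would be scoped to the relevant connection:

# Sketch: config fragment limiting replication to the log schema.
# Path and option placement are illustrative.
cat >> /etc/mysql/conf.d/replication-filters.cnf <<'EOF'
[mysqld]
replicate-do-db         = log
replicate-wild-do-table = log.%
EOF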

We can drop the following tables:

TO_DROP_ImageMetricsCorsSupport_11686678

_EchoInteraction_5782287

and per ticket https://phabricator.wikimedia.org/T171629 we can drop PageCreation_7481635_1542324 and PageCreation_7481635

Next ones (sizes in GB):
MediaViewer_10867062_15423246 299.91
MobileWikiAppToCInteraction_10375484_15423246 140.57
Edit_13457736_15423246 130.85
MobileWikiAppSearch_10641988_15423246 83.53

Dropped TO_DROP_ImageMetricsCorsSupport_11686678, _EchoInteraction_5782287, PageCreation_7481635_1542324 and PageCreation_7481635 from db104[67] and dbstore1002.

This table is going to free about 40G but not much more; running tests on the data now:

-rw-r--r-- 3 hdfs nuria 7.4 G 2017-10-02 18:45 hdfs://analytics-hadoop/user/nuria/MediaViewer_10867062_15423246/part-m-00000.avro
-rw-r--r-- 3 hdfs nuria 8.7 G 2017-10-02 18:45 hdfs://analytics-hadoop/user/nuria/MediaViewer_10867062_15423246/part-m-00001.avro
-rw-r--r-- 3 hdfs nuria 10.5 G 2017-10-02 18:46 hdfs://analytics-hadoop/user/nuria/MediaViewer_10867062_15423246/part-m-00002.avro
-rw-r--r-- 3 hdfs nuria 10.7 G 2017-10-02 18:46 hdfs://analytics-hadoop/user/nuria/MediaViewer_10867062_15423246/part-m-00003.avro
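
These part-m-*.avro files are consistent with a Sqoop import into HDFS; the task doesn't spell out the exact mechanism used, so the following is a purely illustrative sketch of that kind of archive-then-drop flow, with host, credentials, target path and parallelism all assumed:

# Sketch only: copy an EventLogging table from MySQL into HDFS as Avro,
# then (once verified) drop it from MySQL. Everything here is illustrative.
sqoop import \
  --connect jdbc:mysql://dbstore1002.eqiad.wmnet/log \
  --username research --password-file /user/hdfs/mysql-research-password.txt \
  --table MediaViewer_10867062_15423246 \
  --as-avrodatafile \
  --target-dir /user/nuria/MediaViewer_10867062_15423246 \
  -m 4

# After the sanity checks on the archived copy:
# mysql log -e "DROP TABLE MediaViewer_10867062_15423246;"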

Table MediaViewer_10867062_15423246 can be dropped; it is now available in the archive db in Hadoop.

Cleaned up some eventlogging_cleaner logs, new status of dbstore1002:

/dev/mapper/tank-data  6.4T  6.0T  475G  93% /srv

/dev/sda1               37G   32G  3.2G  91% /

The root-partition space usage seems to be due to the /var/log/eventlogging_sync files, which are really huge; there seems to be no logrotate policy in place. Going to take a look and see what we can do.
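
A minimal sketch of the kind of logrotate policy this needs; the stanza here is illustrative, the real one is whatever gets merged via puppet below:

# Sketch: rotate the eventlogging_sync log daily, keep two weeks, compressed.
cat > /etc/logrotate.d/eventlogging_sync <<'EOF'
/var/log/eventlogging_sync.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
EOF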

Sanity check on MediaViewer_10867062_15423246 before dropping it:

# dbstore1002
MariaDB [log]> select max(timestamp) as max, min(timestamp) as min from MediaViewer_10867062_15423246;
+----------------+----------------+
| max            | min            |
+----------------+----------------+
| 20170329170135 | 20150107194849 |
+----------------+----------------+

#db1047
MariaDB [log]> select max(timestamp) as max, min(timestamp) as min from MediaViewer_10867062_15423246;
+----------------+----------------+
| max            | min            |
+----------------+----------------+
| 20170329170135 | 20150107194849 |
+----------------+----------------+

#db1046
MariaDB [log]> select max(timestamp) as max, min(timestamp) as min from MediaViewer_10867062_15423246;
+----------------+----------------+
| max            | min            |
+----------------+----------------+
| 20170329170135 | 20151122165132 |
+----------------+----------------+

Table dropped, positive effect on space consumption:

/dev/mapper/tank-data  6.4T  5.8T  594G  91% /srv

Removed the huge /var/log/eventlogging_sync.log files; the root partition now looks like:

/dev/sda1               37G  5.8G   29G  17% /

Change 381942 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::mariadb::analytics: add logrotate for eventlogging_sync's log

https://gerrit.wikimedia.org/r/381942

Change 381942 merged by Elukey:
[operations/puppet@production] role::mariadb::analytics: add logrotate for eventlogging_sync's log

https://gerrit.wikimedia.org/r/381942

root@dbstore1002[(none)]> SELECT CONCAT(table_schema, '.', table_name),
    ->        CONCAT(ROUND(table_rows / 1000000, 2), 'M')                                    rows,
    ->        CONCAT(ROUND(data_length / ( 1024 * 1024 * 1024 ), 2), 'G')                    DATA,
    ->        CONCAT(ROUND(index_length / ( 1024 * 1024 * 1024 ), 2), 'G')                   idx,
    ->        CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') total_size,
    ->        ROUND(index_length / data_length, 2)                                           idxfrac
    -> FROM   information_schema.TABLES
    -> ORDER  BY data_length + index_length DESC
    -> LIMIT  10;
+---------------------------------------------------+----------+---------+---------+------------+---------+
| CONCAT(table_schema, '.', table_name)             | rows     | DATA    | idx     | total_size | idxfrac |
+---------------------------------------------------+----------+---------+---------+------------+---------+
| wikidatawiki.wb_terms                             | 1397.44M | 231.42G | 436.07G | 667.49G    |    1.88 |
| cebwiki.templatelinks                             | 2011.89M | 48.25G  | 257.95G | 306.20G    |    5.35 |
| commonswiki.templatelinks                         | 1654.75M | 37.50G  | 197.88G | 235.39G    |    5.28 |
| wikidatawiki.logging                              | 571.13M  | 81.69G  | 145.02G | 226.70G    |    1.78 |
| enwiki.revision                                   | 734.24M  | 94.46G  | 123.57G | 218.04G    |    1.31 |
| wikidatawiki.revision                             | 386.05M  | 72.88G  | 90.97G  | 163.85G    |    1.25 |
| enwiki.pagelinks                                  | 1127.26M | 32.87G  | 119.10G | 151.97G    |    3.62 |
| log.MobileWikiAppToCInteraction_10375484_15423246 | 532.44M  | 98.98G  | 41.59G  | 140.57G    |    0.42 |
| log.Edit_11448630                                 | 350.55M  | 109.83G | 27.19G  | 137.02G    |    0.25 |
| log.Edit_13457736_15423246                        | 366.46M  | 103.64G | 27.21G  | 130.85G    |    0.26 |
+---------------------------------------------------+----------+---------+---------+------------+---------+
10 rows in set (1 hour 4 min 17.37 sec)

The plan is to stop s5 replication, and convert wb_terms to TokuDB.
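
A minimal sketch of that plan, assuming the usual MariaDB multi-source setup on this host (connection name 's5') and that the TokuDB engine is enabled:

# Sketch: stop only the s5 replication stream, rebuild wb_terms as TokuDB,
# then resume replication. Connection name and engine availability are assumptions.
mysql -e "STOP SLAVE 's5';"
mysql wikidatawiki -e "ALTER TABLE wb_terms ENGINE=TokuDB;"
mysql -e "START SLAVE 's5';"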

There is now 1TB available.

Excellent news!
Maybe this ticket can be closed then? }:-)

However, we now have a memory (swap) problem: https://grafana.wikimedia.org/dashboard/db/server-board?refresh=1m&orgId=1&var-server=dbstore1002&var-network=eth0&from=now-24h&to=now

I guess that could be related to the massive alter?
Should we restart MySQL and see how it behaves?

Don't want to do that without someone from analytics around. Swap usage seems to be going down, probably even faster when it catches up- so I will leave it as is until tomorrow.

Yeah - totally agreed. I wasn't suggesting doing it now, no way, we are into the European evening :-)

Thanks a lot for all the work people, count me in tomorrow if you want to restart it. I agree with the plan, looks good :)

Change 382672 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] Remove dbstore2 role, make dbstore default to the new socket location

https://gerrit.wikimedia.org/r/382672

Change 382673 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] dbstore: Reduce memory pressure to reduce the likelihood of swapping

https://gerrit.wikimedia.org/r/382673

Change 382672 merged by Jcrespo:
[operations/puppet@production] Remove dbstore2 role, make dbstore default to the new socket location

https://gerrit.wikimedia.org/r/382672

Change 382673 merged by Jcrespo:
[operations/puppet@production] dbstore: Reduce memory pressure to minimize the likelihood of swapping

https://gerrit.wikimedia.org/r/382673

Change 382678 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] dbstore: Move socket location to the right dir /run/mysqld

https://gerrit.wikimedia.org/r/382678

Change 382678 merged by Jcrespo:
[operations/puppet@production] dbstore: Move socket location to the right dir /run/mysqld

https://gerrit.wikimedia.org/r/382678

Seems healthy to me, both on disk space and memory usage. @elukey, I think that no matter how large the available disk is, it will in most cases fill up with not-that-useful information, so maybe a stricter "rotation" policy should be set even for anonymized/summary data or user databases. I also mentioned that if growth trends keep going in the same direction, then no matter how compressed the data is, it may not be possible to serve all of it from a single server, and we will have to split it into smaller ones to provide more disk space and IOPS. Another possibility is to filter out tables that are never read on the analytics slaves (e.g. non-pure metadata). All things to consider in the future.

For now, I think this can be closed.

Mentioned in SAL (#wikimedia-operations) [2017-10-24T15:58:22Z] <elukey> drop MediaViewer_10867062_15423246 and MobileWebUIClickTracking_10742159_15423246 from the log database on db1046 (archived on hadoop) - T168303

Mentioned in SAL (#wikimedia-operations) [2017-11-02T11:26:09Z] <elukey> drop log.MediaViewer_10867062_15423246 from db1047,db1108 since already archived in hdfs - T168303