
dbstore1002 /srv filling up
Closed, Resolved · Public

Description

Hello,

root@dbstore1002:/srv# df -hT /srv/
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.5T  905G  87% /srv

There is still room, but I thought I would report this so we don't get caught out at 95% :-)

What I have seen so far is that we have lots of tables on InnoDB rather than TokuDB, and converting them could give us some more space.
These are the biggest ones:

-rw-rw---- 1 mysql mysql  201G Sep 26 13:55 ./sqldata/metawiki/pagelinks.ibd
-rw-rw---- 1 mysql mysql  155G Sep 26 13:55 ./sqldata/wikidatawiki/revision.ibd
-rw-rw---- 1 mysql mysql  116G Sep 26 13:55 ./sqldata/enwiki/slots.ibd
-rw-rw---- 1 mysql mysql   89G Sep 26 13:55 ./sqldata/wikidatawiki/change_tag.ibd
-rw-rw---- 1 mysql mysql   78G Sep 26 13:48 ./sqldata/enwiki/content.ibd
-rw-rw---- 1 mysql mysql   72G Sep 26 13:55 ./sqldata/wikidatawiki/slots.ibd
-rw-rw---- 1 mysql mysql   70G Sep 26 13:55 ./sqldata/wikidatawiki/content.ibd
-rw-rw---- 1 mysql mysql   65G Sep 26 13:55 ./sqldata/dewiki/flaggedimages.ibd
-rw-rw---- 1 mysql mysql   60G Sep 26 13:55 ./sqldata/wikidatawiki/text.ibd
-rw-rw---- 1 mysql mysql  53G Sep 26 13:55 ./sqldata/wikishared/cx_corpora.ibd
-rw-rw---- 1 mysql mysql   51G Sep 26 13:55 ./sqldata/srwiki/pagelinks.ibd
-rw-rw---- 1 mysql mysql   37G Sep 26 13:55 ./sqldata/wikidatawiki/echo_event.ibd
-rw-rw---- 1 mysql mysql   39G Sep 26 13:55 ./sqldata/viwiki/pagelinks.ibd

Especially interesting is:

-rw-rw---- 1 mysql mysql  77G Oct 11  2017 ./sqldata/commonswiki_test_T177772/recentchanges.ibd
root@dbstore1002:/srv/sqldata/commonswiki_test_T177772# ls -lh
total 77G
-rw-rw---- 1 mysql mysql   54 Oct  9  2017 db.opt
-rw-rw---- 1 mysql mysql 8.1K Oct  9  2017 recentchanges.frm
-rw-rw---- 1 mysql mysql  77G Oct 11  2017 recentchanges.ibd

I think that whole directory can just be dropped at once.
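A minimal sketch of what the drop could look like, assuming the leftover database only exists locally on dbstore1002 and we want the statement kept out of the binlog:

-- Hypothetical cleanup; sql_log_bin=0 keeps the drop out of the local binlog
SET SESSION sql_log_bin = 0;
DROP DATABASE commonswiki_test_T177772;
SET SESSION sql_log_bin = 1;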

I propose we start converting some of those 50GB+ tables to TokuDB, see how much space we gain, and then continue with the other ones until we are back in a good state that holds until the new servers arrive.
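For reference, a single-table conversion would look roughly like this (a sketch only, using srwiki.pagelinks from the list above; the sql_log_bin part assumes we want the alter to stay local to dbstore1002):

-- Rough sketch of one conversion, not a tested runbook
SET SESSION sql_log_bin = 0;
ALTER TABLE srwiki.pagelinks ENGINE = TokuDB;
-- confirm the engine actually changed
SHOW CREATE TABLE srwiki.pagelinks\G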

Event Timeline

Marostegui triaged this task as Medium priority.Sep 26 2018, 2:00 PM

I agree, let's pick a couple of big tables and convert them.

Wouldn't it be better to skip Toku and just compress the tables instead?
I wonder if it wouldn't be better to have the same kind of storage engine everywhere: fewer hidden caveats, etc.
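If we went the compression route instead, it would be something along these lines (a sketch; the KEY_BLOCK_SIZE value is just an example, and the table needs innodb_file_per_table and the Barracuda file format for it to apply):

-- Hypothetical InnoDB compression instead of an engine switch
ALTER TABLE srwiki.pagelinks
    ROW_FORMAT = COMPRESSED
    KEY_BLOCK_SIZE = 8;  -- example value; 4 or 16 are also valid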

> Wouldn't it be better to skip Toku and just compress the tables instead?
> I wonder if it wouldn't be better to have the same kind of storage engine everywhere: fewer hidden caveats, etc.

dbstore1002 has the majority of its tables on TokuDB already. These are probably new tables, reimported tables from core and stuff like that.

Some of them may be on InnoDB on purpose because we hit a bug (T109069); some may not be on purpose, because alters or table imports happened. This is the last TokuDB server aside from analytics. TokuDB works mostly OK for eventlogging, but even if it didn't, we could migrate to RocksDB, which has a similar compression ratio (but it is even newer).
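To see which tables are still on InnoDB, and how big they are, something like this should work (note that data_length is the size reported by the engine, not the exact on-disk footprint):

-- Largest tables still on InnoDB, biggest first
SELECT table_schema, table_name,
       ROUND(data_length / 1024 / 1024 / 1024, 1) AS data_gb
FROM information_schema.tables
WHERE engine = 'InnoDB'
ORDER BY data_length DESC
LIMIT 20;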

I am starting the conversion with
/srv/sqldata/srwiki/pagelinks.ibd
It is 51G; let's see what we get at the end.

Mentioned in SAL (#wikimedia-operations) [2018-09-27T14:29:16Z] <banyek> converting srwiki.pagelinks to TokuDB on host dbstore1002 (T205544)

The conversion finished; it took 2 hours. Replication is slowly catching up on s3, so I'll continue this tomorrow.
(I was not yet able to check the size of the TokuDB table, but I'll check that too.)
At least we have 928G free now under /srv.
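To check the result and the s3 catch-up, something along these lines should do (a sketch; it assumes the multi-source replication connection on dbstore1002 is named 's3'):

-- Engine and reported size of the converted table
SELECT engine, ROUND(data_length / 1024 / 1024 / 1024, 1) AS data_gb
FROM information_schema.tables
WHERE table_schema = 'srwiki' AND table_name = 'pagelinks';

-- Replication lag for the s3 connection (MariaDB multi-source syntax)
SHOW SLAVE 's3' STATUS\G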

Coordinate with me, as I am doing some maintenance on the change_tag table on dbstore1002 and we should probably avoid running several heavy alters on this host at the same time.

Mentioned in SAL (#wikimedia-operations) [2018-09-28T09:34:00Z] <banyek> converting wikishared.cx_corpora to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-09-28T10:55:29Z] <banyek> converting dewiki.flaggedtemplates to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-09-28T12:39:21Z] <banyek> converting wikidatawiki.text to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-09-28T14:22:09Z] <banyek> converting dewiki.flaggedimages to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-01T12:26:28Z] <banyek> converting enwiki.categorylinks to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-01T13:51:58Z] <banyek> Downtimed the slave lag monitoring on dbstore1002 while the tables getting converted (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-02T08:01:23Z] <banyek> converting wikidatawiki.content to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-02T11:58:42Z] <banyek> converting wikidatawiki.slots to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-02T12:47:25Z] <banyek> converting enwiki.contents to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-02T12:47:50Z] <banyek> converting enwiki.content to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-03T11:45:28Z] <banyek> converting enwiki.slots to TokuDB on host dbstore1002 (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-03T14:07:26Z] <banyek> converting wikidatawiki.change_tag to TokuDB on host dbstore1002 (T205544)

Ottomata raised the priority of this task from Medium to Needs Triage.Oct 4 2018, 5:17 PM
Ottomata moved this task from Incoming to Radar on the Analytics board.
Milimetric triaged this task as Medium priority.Oct 4 2018, 5:18 PM

To shed some light on how the process is going:

Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   6.4T  5.1T  1.4T  79% /srv

Mentioned in SAL (#wikimedia-operations) [2018-10-08T13:04:36Z] <banyek> downtime notifications for dbstore1002 replication threads (T205544)

Mentioned in SAL (#wikimedia-operations) [2018-10-08T13:05:38Z] <banyek> converting cebwiki.templatelinks to TokuDB on host dbstore1002.eqiad.wmnet (T205544)

Actually, I was thinking of closing the task, as we have 1.4T of free space now.
Maybe before that we could just drop the commonswiki_test_T177772 database with the recentchanges table, which would give us a huge boost space-wise.
If not, I still think we can close the task and reopen it if we have disk space issues again before the server gets decommissioned.

+1 to close.
It is actually not a bad idea to leave that big DB as a safety net, so we have stuff to drop if this host complains again about disk space :-)

Yes, that makes sense. I'll close the task now; we can reopen it when needed.