Import S2,S6,S7,m3 and x1 to dbstore2001 and dbstore2002
Closed, ResolvedPublic

Description

Right now dbstore2001 and dbstore2002 are running the following shards:

s1
s3
s4
s5

In order to be identical with the eqiad setup for dbstore1001 and dbstore1002 the following shards need to be imported into dbstore2001 and dbstore2002

s2
s6
s7
x1

Marostegui edited the task description. (Show Details)

Change 326388 had a related patch set uploaded (by Marostegui):
db-codfw.php: Depool db2064

https://gerrit.wikimedia.org/r/326388

Marostegui moved this task from Triage to In progress on the DBA board.

Change 326388 merged by jenkins-bot:
db-codfw.php: Depool db2064

https://gerrit.wikimedia.org/r/326388

Mentioned in SAL (#wikimedia-operations) [2016-12-12T07:17:40Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2064 - T151552 (duration: 02m 21s)

Mentioned in SAL (#wikimedia-operations) [2016-12-12T07:30:07Z] <marostegui> Stop replication db2064 for maintenance - T151552

I have now started to import S2 from db2064 into dbstore2001.

The import is about to finish in a couple of hours or so I believe.
So db2064 will remain with replication stopped until tomorrow morning (it is depooled)

s2 is now catching up in dbstore2001. Stopping/starting all slaves worked fine. Later I will stop MySQL and start MySQL to make sure nothing got corrupted.
Once that is done I will start compressing s2.

Thinking about it and given that dbstore2001 has enough disk now, I will keep importing shards this week as next week deployments will not be allowed.
So I will import stuff this week and compress next week as there is no need to depool servers to compress them.

Change 326898 had a related patch set uploaded (by Marostegui):
db-codfw.php: Depool db2067

https://gerrit.wikimedia.org/r/326898

Change 326898 merged by jenkins-bot:
db-codfw.php: Depool db2067

https://gerrit.wikimedia.org/r/326898

Mentioned in SAL (#wikimedia-operations) [2016-12-13T08:11:30Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2067 - T151552 (duration: 02m 19s)

Mentioned in SAL (#wikimedia-operations) [2016-12-13T08:12:08Z] <marostegui> Stop replication on db2067 for maintenance - T151552

All the files from s6 are now being transferred to dbstore2001

Transfer is done and I am now importing: frwiki jawiki and ruwiki

Change 326927 had a related patch set uploaded (by Marostegui):
db-codfw.php: Repool db2064

https://gerrit.wikimedia.org/r/326927

Change 326927 merged by jenkins-bot:
db-codfw.php: Repool db2064

https://gerrit.wikimedia.org/r/326927

Mentioned in SAL (#wikimedia-operations) [2016-12-13T12:50:14Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2064 - T151552 (duration: 00m 45s)

The last wiki (ruwiki) is getting imported and it will take a few more hours to finish.
db2067 will remain with replication stopped until it is done.

s6 has been imported into dbstore2001 and it is now catching up with the master.

Mentioned in SAL (#wikimedia-operations) [2016-12-13T17:30:06Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2067 - T151552 (duration: 00m 47s)

Change 327145 had a related patch set uploaded (by Marostegui):
db-codfw.php: Depool db2068

https://gerrit.wikimedia.org/r/327145

Change 327145 merged by jenkins-bot:
db-codfw.php: Depool db2068

https://gerrit.wikimedia.org/r/327145

Mentioned in SAL (#wikimedia-operations) [2016-12-14T06:43:07Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2068 - T151552 (duration: 01m 48s)

Mentioned in SAL (#wikimedia-operations) [2016-12-14T06:44:06Z] <marostegui> Stop replication db2068 for maintenance - T151552

I have started the s7 file transfer from db2068 to dbstore2001

The metawiki database in s7 has one partitioned table (pagelinks)

/*!50100 PARTITION BY RANGE (pl_namespace)
(PARTITION p_2 VALUES LESS THAN (3) ENGINE = InnoDB,
 PARTITION p_9 VALUES LESS THAN (10) ENGINE = InnoDB,
 PARTITION p_10 VALUES LESS THAN (11) ENGINE = InnoDB,
 PARTITION p_max VALUES LESS THAN MAXVALUE ENGINE = InnoDB)

MariaDB doesn't support export partitions (https://jira.mariadb.org/browse/MDEV-10568) but some days ago a MariaDB engineer replied to me with a procedure that works. I will need to try it to be able to import this database.

I have tested the workaround in my local environment and I am facing some issues, with column mismatch (even though I am using same MariaDB version from both servers).
I am unsure about this workaround and the last thing I want is to corrupt all the tablespace and get dbstore2001 broken entirely.

Another thing we can do is, temporarily delete partitions for this table in db2068, get the transfer done, and then add the partitions again in both db2068 and dbstore2001 @jcrespo any objection to that?

Drop them, but do not add them back after transfer. We want to be able to copy them from dbstore2001 to any other host easily.

I will drop them on db2068, then do the transfer and only add them back to db2068 but not to dbstore2001.
Thanks!

Should they even be on db2068? Genuine question (not a blocker for anything you just commented).

Should they even be on db2068? Genuine question (not a blocker for anything you just commented).

I checked another slave and it has partitions
However, they are not under: https://phabricator.wikimedia.org/diffusion/MW/browse/master/maintenance/tables.sql

They are also not even at: dbtools/s7-pager.sql

I checked another slave and it has partitions

Is it on all servers, eqiad and codfw?

In the majority of hosts indeed:

root@neodymium:/home/marostegui/git/software/dbtools# for i in `cat s7.hosts| grep -v db1069| awk -F " " '{print $1}'`; do echo $i; mysql -h$i -A metawiki -e "show create table pagelinks\G" | grep PARTITI | wc -l; done
dbstore2001.codfw.wmnet
5
dbstore2002.codfw.wmnet
ERROR 1049 (42000): Unknown database 'metawiki'
0
db2040.codfw.wmnet
5
db2047.codfw.wmnet
5
db2054.codfw.wmnet
5
db2061.codfw.wmnet
5
db2068.codfw.wmnet
5
db2029.codfw.wmnet
5
dbstore1001.eqiad.wmnet
0
dbstore1002.eqiad.wmnet
5
labsdb1001.eqiad.wmnet
0
labsdb1003.eqiad.wmnet
0
db1028.eqiad.wmnet
5
db1033.eqiad.wmnet
0
db1034.eqiad.wmnet
5
db1039.eqiad.wmnet
5
db1062.eqiad.wmnet
5
db1079.eqiad.wmnet
0
db1086.eqiad.wmnet
0
db1094.eqiad.wmnet
0
db1041.eqiad.wmnet
5

I have finished cleaning up the databases from s7 that were transferred earlier from dbstore2001.

Also, given the size of m3 (120G) and x1 (176G) it is probably easier just to import a mysqldump from them rather than the .ibd files

Mentioned in SAL (#wikimedia-operations) [2016-12-14T14:39:04Z] <marostegui> Stop replication db2012 (m3) for maintenance - T151552

Mentioned in SAL (#wikimedia-operations) [2016-12-14T16:48:15Z] <marostegui> Stop replication db2068 for maintenance - T151552

I am transferring the s7 files from db2068 now to dbstore2001 once the partitions have been removed from metawiki.pagelinks (T153194).

I am importing m3 (already compressed) into dbstore2001 now. Replication on db2012 will remain stopped until it is done.

m3 is now replicating finely in dbstore2001 and catching up with the master.

The files have been transferred and I have started the tablespace importation of s7 into dbstore2001. It will take several hours to complete.
Replication on db2068 will remain stopped until this is completed.

s7 has been imported correctly into dbstore2001 and it is now catching up with the master.

Mentioned in SAL (#wikimedia-operations) [2016-12-15T14:38:44Z] <marostegui> Stop replication db2033 (x1) for maintenance - T151552

x1 needs a bit more coordination than just the mysqldump from db2033 as there are tables existing in both, dbstore2001 and db2033.
So far, a find reveals that the following tables in dbstore2001 are in used and SHOULD NOT BE dumped from db2033

root@dbstore2001:/srv/sqldata#  find . -name "*echo*.ibd" -exec ls -lh {} \; | egrep -v "2013|2014"
-rw-rw---- 1 mysql mysql 8.0M Dec 15 13:42 ./mediawikiwiki/echo_email_batch.ibd
-rw-rw---- 1 mysql mysql 112M Dec 15 15:37 ./mediawikiwiki/echo_event.ibd
-rw-rw---- 1 mysql mysql 1.4G Dec 15 15:39 ./mediawikiwiki/echo_notification.ibd
-rw-rw---- 1 mysql mysql 152M Dec 15 15:36 ./mediawikiwiki/echo_target_page.ibd
-rw-rw---- 1 mysql mysql 1.0M Dec 15 14:23 ./officewiki/echo_email_batch.ibd
-rw-rw---- 1 mysql mysql 18M Dec 15 15:12 ./officewiki/echo_event.ibd
-rw-rw---- 1 mysql mysql 36M Dec 15 15:41 ./officewiki/echo_notification.ibd
-rw-rw---- 1 mysql mysql 9.0M Dec 15 15:12 ./officewiki/echo_target_page.ibd
-rw-rw---- 1 mysql mysql 112K Feb  4  2016 ./testwiki/echo_target_page.ibd
-rw-rw---- 1 mysql mysql 128K Dec 15 01:47 ./kowiki/echo_email_batch.ibd
-rw-rw---- 1 mysql mysql 112K Dec 15 01:47 ./kowiki/echo_event.ibd
-rw-rw---- 1 mysql mysql 176K Dec 15 01:47 ./kowiki/echo_notification.ibd
-rw-rw---- 1 mysql mysql 1.0M Dec 15 02:07 ./metawiki/echo_email_batch.ibd
-rw-rw---- 1 mysql mysql 656M Dec 15 15:21 ./metawiki/echo_event.ibd
-rw-rw---- 1 mysql mysql 1.2G Dec 15 15:27 ./metawiki/echo_notification.ibd
-rw-rw---- 1 mysql mysql 108M Dec 15 14:53 ./metawiki/echo_target_page.ibd

kowiki can be probably discarded as that time is when the ibd file was imported last night

So from that find the list of DBs that already exist on dbstore2001 and need to be ignored when dumping data from db2033 is:

mediawiki
officewiki
metawiki

Which matches this:

// Use extension1 db for all wikis
'wmgEchoCluster' => [
	'default' => 'extension1',
	// The following wikis should be using extension1 db but started with
	// the application main database, stick with the current setting
	'wikitech' => false,
	'mediawikiwiki' => false,
	'metawiki' => false,
	'officewiki' => false,
],

Mentioned in SAL (#wikimedia-operations) [2016-12-16T07:37:37Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2068 - T151552 (duration: 00m 39s)

Mentioned in SAL (#wikimedia-operations) [2016-12-16T12:16:08Z] <marostegui> Stop replication db2033 for maintenance - T151552

Marostegui added a comment.EditedDec 16 2016, 12:40 PM

Thinking about it, all the databases that have the tables echo_xxx and their size is 1M they do contain a row:

root@dbstore2001:/srv/sqldata# mysql --skip-ssl nywiki -e "select count(*) from echo_event;"
+----------+
| count(*) |
+----------+
|        1 |
+----------+
root@dbstore2001:/srv/sqldata# ls -lh nywiki/echo_event.ibd
-rw-rw---- 1 mysql mysql 1.0M Feb  4  2014 nywiki/echo_event.ibd

Once the critical wikis are excluded from the mysqldump:

mediawikiwiki
officewiki
metawiki

It is better to drop (or rename) the other tables already existing tables in dbstore2001 in all the wikis before inserting the data from x1. Otherwise, as this is ROW based replication, replication will break as the data isn't the same.

We should ask (for example, on the thread of tables to drop) if that row should be there- if not we can drop them from production, too (not right away, but with time).

We should ask (for example, on the thread of tables to drop) if that row should be there- if not we can drop them from production, too (not right away, but with time).

Good point. I will mention it there too.

Mentioned in SAL (#wikimedia-operations) [2016-12-19T06:44:38Z] <marostegui> Deploy innodb compression dbstore2001 on dewiki and wikidatawiki - T151552

s5 has been compressed.
Now compressing s6.

s6 has been compressed.
It went from: 719G to 268G

Important: x1 has some (wrong) filters by default on dbstores, those should be removed to not compromise data integrity on x1 when/if it is loaded (I just realized that):

x1.replicate-wild-do-table = flowdb.%
x1.replicate-wild-do-table = wikishared.%
x1.replicate-wild-do-table = heartbeat.%

dbstore2001 crashed on restart, probably because a partitioned table was imported:

161222 15:45:05 [Note] Event Scheduler: scheduler thread started with id 1
InnoDB: Error: tablespace id is 174172 in the data dictionary
InnoDB: but in file ./enwiktionary/templatelinks#P#p_max.ibd it is 2798!
2016-12-22 15:46:50 7fb4d9fff700  InnoDB: Assertion failure in thread 140414728271616 in file fil0fil.cc line 637
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
161222 15:46:50 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

Server version: 10.0.28-MariaDB
key_buffer_size=134217728
read_buffer_size=131072
max_used_connections=0
max_threads=252
thread_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 684398 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x48000
mysys/stacktrace.c:268(my_print_stacktrace)[0xbc55fe]
sql/signal_handler.cc:159(handle_fatal_signal)[0x73775f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0)[0x7fcffaaed8d0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fcff944c067]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fcff944d448]
fil/fil0fil.cc:607(fil_node_open_file)[0xa1f659]
fil/fil0fil.cc:5374(fil_node_prepare_for_io)[0xa1f6fe]
fil/fil0fil.cc:1424(fil_space_get_space)[0xa286cc]
buf/buf0rea.cc:853(buf_read_ibuf_merge_pages(bool, unsigned long const*, long const*, unsigned long const*, unsigned long))[0x9ef981]
ibuf/ibuf0ibuf.cc:2664(ibuf_merge_pages)[0x8cd05d]
ibuf/ibuf0ibuf.cc:2858(ibuf_merge_in_background(bool))[0x8d240b]
srv/srv0srv.cc:2914(srv_master_do_idle_tasks)[0x96fc0b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7fcffaae60a4]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fcff94ff62d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
161222 15:46:52 mysqld_safe Number of processes running now: 0
161222 15:46:57 [Note] InnoDB: Reading tablespace information from the .ibd files...

161222 15:56:34 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace arbcom_nlwiki/iwlinks uses space ID: 2796 at filepath: ./arbcom_nlwiki/iwlinks.ibd. Cannot open tablespace enwiktionary/templatelinks#P#p_10 which uses space ID: 2796 at filepath: ./enwiktionary/templatelinks#P#p_10.ibd
2016-12-22 15:56:34 7ffb5bef9780  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
InnoDB: Error: could not open single-table tablespace file ./enwiktionary/templatelinks#P#p_10.ibd
InnoDB: We do not continue the crash recovery, because the table may become
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
InnoDB: To fix the problem and start mysqld:
InnoDB: 1) If there is a permission problem in the file and mysqld cannot
InnoDB: open the file, you should modify the permissions.
InnoDB: 2) If the table is not needed, or you can restore it from a backup,
InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
InnoDB: crash recovery and ignore that table.
InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.
161222 15:56:34 mysqld_safe mysqld from pid file /srv/sqldata/dbstore2001.pid ended
jcrespo raised the priority of this task from "Normal" to "High".Dec 22 2016, 5:25 PM

Because of the above, I have recovered dbstore2001, and I am running s2 with enwiktionary.templatelinks ignored, will reimport it when it catches up (assuming it does).

High until this issue is fixed.

jcrespo claimed this task.Dec 22 2016, 5:25 PM
jcrespo lowered the priority of this task from "High" to "Normal".Dec 23 2016, 10:03 AM
jcrespo reassigned this task from jcrespo to Marostegui.

I have now fixed dbstore2001, and enwiki is catching up with the new, larger, buffer pool.


We need to deploy https://gerrit.wikimedia.org/r/328671 , but it will be blocked by dbstore new disks in general.

I do not believe dbstore2001 is in an ideal state anyway- I think that partitioned table imported messed up InnoDB, and now it is taking 20 minutes to stop and start the server (outside or flushing buffers and other things), or we have reached some kind of limit. Strange, because I did not see happening when we had 99% of the tables loaded.

Because of the above, I have recovered dbstore2001, and I am running s2 with enwiktionary.templatelinks ignored, will reimport it when it catches up (assuming it does).

High until this issue is fixed.

Oh no partitions :-(.
However it is strange that it didn't complain when the table was imported. Will need to be looked at once back from holidays.

Mentioned in SAL (#wikimedia-operations) [2017-01-03T09:10:46Z] <marostegui> stop MySQL dbstore2001 for maintenance - T151552

Mentioned in SAL (#wikimedia-operations) [2017-01-03T09:23:40Z] <marostegui> stop MySQL dbstore2002 for maintenance - T151552

Comparing the timing of both servers:
dbstore2001

stop all slaves - 8 seconds
stop mysql  - 9 minutes
start mysql (and be able to get in the prompt: 10 minutes

dbstore2002:

stop all slaves - 2 seconds
stop mysql - 4 minutes
start mysql (and be able to get in the prompt: 1 minute

However, we need to keep in mind that dbstore2001 has 8 replication threads where as dbstore2002 has 4 only.
We also need to keep in mind that dbstore2001 has s6 already compressed too (as well as m3)

I don't think we are on a bad shape really here, we would have seen more issues I believe if we were hitting some corruption in innodb (given the amount of data we are writing per second and while doing stop/start).
I am trying to think we did a full stop/start after we imported all the shards, and I don't think we did, so maybe those big times were already there before with all the tables loaded.

Mentioned in SAL (#wikimedia-operations) [2017-01-23T09:06:23Z] <marostegui> Compress s2 on dbstore2001 - T151552

s2 was compressed on dbstore2001 (1.1T -> 763G)

Marostegui moved this task from In progress to Next on the DBA board.Feb 6 2017, 10:59 AM
Marostegui edited the task description. (Show Details)Mon, Mar 6, 12:53 PM
Marostegui closed this task as "Resolved".Mon, Mar 6, 2:34 PM

The only pending shard to import is x1 - I will mark this as resolved and create a ticket just for x1.
x1 is a bit more difficult because of this: T151552#2876669 T151552#2880669 and T153638