Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T130702 Several es20XX servers keep crashing (es2017, es2019, es2015, es2014) since 23 March
Resolved | | Marostegui | T181293 es2018 crashed
Event Timeline
Change 393211 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] maridb: depool es2018 after crash
When I logged in, the iDRAC console showed:
[30996087.770298] megaraid_sas 0000:03:00.0: pending commands remain after waiting, will reset adapter scsi0.
[30996102.596599] megaraid_sas 0000:03:00.0: Init cmd success
dmesg:
[Fri Nov 24 10:14:53 2017] TCP: request_sock_TCP: Possible SYN flooding on port 5666. Sending cookies. Check SNMP counters. [Fri Nov 24 10:17:06 2017] INFO: task jbd2/sda1-8:934 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] jbd2/sda1-8 D ffff88203eb15d80 0 934 2 0x00000000 [Fri Nov 24 10:17:06 2017] ffff8820334caf40 ffff881038566380 ffff882036120000 ffff88203611fb00 [Fri Nov 24 10:17:06 2017] 7fffffffffffffff ffffffff81593490 ffff88203611fb80 0000000000000852 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000000 ffffffff81595ba5 7fffffffffffffff [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81595ba5>] ? schedule_timeout+0x235/0x2d0 [Fri Nov 24 10:17:06 2017] [<ffffffff812bf22b>] ? queue_unplugged+0x9b/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592204>] ? io_schedule_timeout+0xb4/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff810b82c5>] ? prepare_to_wait+0x55/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff815934a7>] ? bit_wait_io+0x17/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff81592f8a>] ? __wait_on_bit+0x5a/0x90 [Fri Nov 24 10:17:06 2017] [<ffffffff81169591>] ? wait_on_page_bit+0xc1/0xe0 [Fri Nov 24 10:17:06 2017] [<ffffffff810b8630>] ? autoremove_wake_function+0x40/0x40 [Fri Nov 24 10:17:06 2017] [<ffffffff81169687>] ? __filemap_fdatawait_range+0xd7/0x150 [Fri Nov 24 10:17:06 2017] [<ffffffff812c36cf>] ? submit_bio+0x6f/0x170 [Fri Nov 24 10:17:06 2017] [<ffffffff8116970f>] ? filemap_fdatawait_range+0xf/0x30 [Fri Nov 24 10:17:06 2017] [<ffffffffa0105d45>] ? 
jbd2_journal_commit_transaction+0xd15/0x1900 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffff810acb02>] ? dequeue_entity+0x3f2/0x920 [Fri Nov 24 10:17:06 2017] [<ffffffff810ad203>] ? put_prev_entity+0x33/0x710 [Fri Nov 24 10:17:06 2017] [<ffffffff810dd6d9>] ? try_to_del_timer_sync+0x59/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffffa010a1cd>] ? kjournald2+0xdd/0x280 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffff810b85f0>] ? wait_woken+0x90/0x90 [Fri Nov 24 10:17:06 2017] [<ffffffffa010a0f0>] ? commit_timeout+0x10/0x10 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffff81096ebf>] ? kthread+0xdf/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff81096de0>] ? kthread_park+0x50/0x50 [Fri Nov 24 10:17:06 2017] [<ffffffff81596d5f>] ? ret_from_fork+0x3f/0x70 [Fri Nov 24 10:17:06 2017] [<ffffffff81096de0>] ? kthread_park+0x50/0x50 [Fri Nov 24 10:17:06 2017] INFO: task xfsaild/dm-0:1231 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] xfsaild/dm-0 D ffff88203ead5d80 0 1231 2 0x00000000 [Fri Nov 24 10:17:06 2017] ffff8810336c6240 ffff881038558300 ffff881035b88000 0000000000000000 [Fri Nov 24 10:17:06 2017] ffff8820336eac00 ffff8820324b4800 ffff8820324b4928 ffff88202f55c800 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000001 ffffffffa05fef94 00000002cdeb6cad [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffffa05fef94>] ? _xfs_log_force+0x164/0x2d0 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffff810a2110>] ? wake_up_q+0x60/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffffa05ff121>] ? xfs_log_force+0x21/0x90 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffffa0609c37>] ? xfsaild+0x197/0x740 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffffa0609aa0>] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffff81096ebf>] ? 
kthread+0xdf/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff81096de0>] ? kthread_park+0x50/0x50 [Fri Nov 24 10:17:06 2017] [<ffffffff81596d5f>] ? ret_from_fork+0x3f/0x70 [Fri Nov 24 10:17:06 2017] [<ffffffff81096de0>] ? kthread_park+0x50/0x50 [Fri Nov 24 10:17:06 2017] INFO: task mysqld:2902 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] mysqld D ffff88103f255d80 0 2902 2071 0x00000000 [Fri Nov 24 10:17:06 2017] ffff881032c20e80 ffff88103852cf40 ffff8810357d0000 ffff8810357cfb08 [Fri Nov 24 10:17:06 2017] 7fffffffffffffff ffff881032c20e80 ffff881032c20e80 ffff880108636a40 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000000 ffffffff81595ba5 7fffffffffffffff [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81595ba5>] ? schedule_timeout+0x235/0x2d0 [Fri Nov 24 10:17:06 2017] [<ffffffff812bef6f>] ? __blk_run_queue+0x2f/0x40 [Fri Nov 24 10:17:06 2017] [<ffffffff812bf1b5>] ? queue_unplugged+0x25/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592204>] ? io_schedule_timeout+0xb4/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff812189c8>] ? do_blockdev_direct_IO+0x1b38/0x2bf0 [Fri Nov 24 10:17:06 2017] [<ffffffffa05d7080>] ? xfs_get_blocks+0x10/0x10 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffffa05d657c>] ? xfs_vm_direct_IO+0x6c/0xe0 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffffa05d58c0>] ? xfs_submit_ioend+0x120/0x120 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffffa05e319e>] ? xfs_file_dio_aio_write+0x1ae/0x360 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffffa05e3660>] ? xfs_file_write_iter+0xa0/0x170 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffff811dae0d>] ? __vfs_write+0xcd/0x120 [Fri Nov 24 10:17:06 2017] [<ffffffff811db464>] ? vfs_write+0xa4/0x190 [Fri Nov 24 10:17:06 2017] [<ffffffff811dc386>] ? 
SyS_pwrite64+0x86/0xb0 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:17:06 2017] INFO: task mysqld:123754 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] mysqld D ffff88103f315d80 0 123754 2071 0x00000000 [Fri Nov 24 10:17:06 2017] ffff8811f5714e00 ffff8810385670c0 ffff8817e225c000 ffff8817e225bd00 [Fri Nov 24 10:17:06 2017] 7fffffffffffffff ffffffff81593490 ffff8817e225bd80 0007ffffffffffff [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000000 ffffffff81595ba5 7fffffffffffffff [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81595ba5>] ? schedule_timeout+0x235/0x2d0 [Fri Nov 24 10:17:06 2017] [<ffffffff812c49f6>] ? blk_peek_request+0x46/0x260 [Fri Nov 24 10:17:06 2017] [<ffffffffa001624d>] ? scsi_request_fn+0x3d/0x5f0 [scsi_mod] [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592204>] ? io_schedule_timeout+0xb4/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff810b82c5>] ? prepare_to_wait+0x55/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff815934a7>] ? bit_wait_io+0x17/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff81592f8a>] ? __wait_on_bit+0x5a/0x90 [Fri Nov 24 10:17:06 2017] [<ffffffff81169591>] ? wait_on_page_bit+0xc1/0xe0 [Fri Nov 24 10:17:06 2017] [<ffffffff810b8630>] ? autoremove_wake_function+0x40/0x40 [Fri Nov 24 10:17:06 2017] [<ffffffff81169687>] ? __filemap_fdatawait_range+0xd7/0x150 [Fri Nov 24 10:17:06 2017] [<ffffffff8116b41f>] ? __filemap_fdatawrite_range+0xcf/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff8116970f>] ? filemap_fdatawait_range+0xf/0x30 [Fri Nov 24 10:17:06 2017] [<ffffffff8116b52b>] ? 
filemap_write_and_wait_range+0x3b/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffffa05e2311>] ? xfs_file_fsync+0x61/0x220 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffff8120d428>] ? do_fsync+0x38/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff8120d69c>] ? SyS_fsync+0xc/0x10 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:17:06 2017] INFO: task mysqld:190154 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] mysqld D ffff88103f295d80 0 190154 2071 0x00000000 [Fri Nov 24 10:17:06 2017] ffff88203418eec0 ffff881038544fc0 ffff88108bbfc000 ffff88108bbfbd00 [Fri Nov 24 10:17:06 2017] 7fffffffffffffff ffffffff81593490 ffff88108bbfbd80 0007ffffffffffff [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000000 ffffffff81595ba5 7fffffffffffffff [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81595ba5>] ? schedule_timeout+0x235/0x2d0 [Fri Nov 24 10:17:06 2017] [<ffffffff812c49f6>] ? blk_peek_request+0x46/0x260 [Fri Nov 24 10:17:06 2017] [<ffffffffa001624d>] ? scsi_request_fn+0x3d/0x5f0 [scsi_mod] [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592204>] ? io_schedule_timeout+0xb4/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff810b82c5>] ? prepare_to_wait+0x55/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff815934a7>] ? bit_wait_io+0x17/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff81592f8a>] ? __wait_on_bit+0x5a/0x90 [Fri Nov 24 10:17:06 2017] [<ffffffff81169591>] ? wait_on_page_bit+0xc1/0xe0 [Fri Nov 24 10:17:06 2017] [<ffffffff810b8630>] ? autoremove_wake_function+0x40/0x40 [Fri Nov 24 10:17:06 2017] [<ffffffff81169687>] ? 
__filemap_fdatawait_range+0xd7/0x150 [Fri Nov 24 10:17:06 2017] [<ffffffff8116b41f>] ? __filemap_fdatawrite_range+0xcf/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff8116970f>] ? filemap_fdatawait_range+0xf/0x30 [Fri Nov 24 10:17:06 2017] [<ffffffff8116b52b>] ? filemap_write_and_wait_range+0x3b/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffffa05e2311>] ? xfs_file_fsync+0x61/0x220 [xfs] [Fri Nov 24 10:17:06 2017] [<ffffffff8120d428>] ? do_fsync+0x38/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff8120d69c>] ? SyS_fsync+0xc/0x10 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:17:06 2017] INFO: task rs:main Q:Reg:167518 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] rs:main Q:Reg D ffff88203ea55d80 0 167518 1 0x00000000 [Fri Nov 24 10:17:06 2017] ffff88010638afc0 ffff88103852c200 ffff881087a8c000 ffff881087a8ba48 [Fri Nov 24 10:17:06 2017] 7fffffffffffffff ffffffff81593490 ffff881087a8bac8 ffff881625f9c5d8 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000000 ffffffff81595ba5 7fffffffffffffff [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81595ba5>] ? schedule_timeout+0x235/0x2d0 [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592204>] ? io_schedule_timeout+0xb4/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff810b82c5>] ? prepare_to_wait+0x55/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff815934a7>] ? bit_wait_io+0x17/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff81592f8a>] ? __wait_on_bit+0x5a/0x90 [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff8159303e>] ? 
out_of_line_wait_on_bit+0x7e/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff810b8630>] ? autoremove_wake_function+0x40/0x40 [Fri Nov 24 10:17:06 2017] [<ffffffffa0103b4f>] ? do_get_write_access+0x24f/0x480 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffff814db3f0>] ? ip_finish_output2+0x150/0x350 [Fri Nov 24 10:17:06 2017] [<ffffffff81210737>] ? __find_get_block+0xa7/0x110 [Fri Nov 24 10:17:06 2017] [<ffffffff81210ee6>] ? __getblk_gfp+0x26/0x50 [Fri Nov 24 10:17:06 2017] [<ffffffffa01a81a3>] ? ext4_dirty_inode+0x43/0x60 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa0103dae>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffffa01d4356>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa01a484d>] ? ext4_reserve_inode_write+0x5d/0x80 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa01a48bf>] ? ext4_mark_inode_dirty+0x4f/0x210 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa01a81a3>] ? ext4_dirty_inode+0x43/0x60 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffff8120816a>] ? __mark_inode_dirty+0x17a/0x370 [Fri Nov 24 10:17:06 2017] [<ffffffff811f5b69>] ? generic_update_time+0x79/0xd0 [Fri Nov 24 10:17:06 2017] [<ffffffff811f53ad>] ? file_update_time+0xbd/0x110 [Fri Nov 24 10:17:06 2017] [<ffffffff8116bd69>] ? __generic_file_write_iter+0x99/0x1d0 [Fri Nov 24 10:17:06 2017] [<ffffffffa019b238>] ? ext4_file_write_iter+0x228/0x460 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffff811db6ae>] ? do_readv_writev+0x15e/0x2b0 [Fri Nov 24 10:17:06 2017] [<ffffffff811dae0d>] ? __vfs_write+0xcd/0x120 [Fri Nov 24 10:17:06 2017] [<ffffffff811db464>] ? vfs_write+0xa4/0x190 [Fri Nov 24 10:17:06 2017] [<ffffffff811dc1e2>] ? SyS_write+0x52/0xc0 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:17:06 2017] INFO: task nrpe:4639 blocked for more than 120 seconds. 
[Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] nrpe D ffff88203ec55d80 0 4639 160509 0x00000000 [Fri Nov 24 10:17:06 2017] ffff881032c92f00 ffff8810385de0c0 ffff88108a83c000 ffff88108a83b948 [Fri Nov 24 10:17:06 2017] 7fffffffffffffff ffffffff81593490 ffff88108a83b9c8 ffff8810874535d8 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000000 ffffffff81595ba5 7fffffffffffffff [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81595ba5>] ? schedule_timeout+0x235/0x2d0 [Fri Nov 24 10:17:06 2017] [<ffffffff8131b048>] ? __nla_reserve+0x38/0x50 [Fri Nov 24 10:17:06 2017] [<ffffffff8131b09c>] ? __nla_put+0xc/0x20 [Fri Nov 24 10:17:06 2017] [<ffffffff81545cbe>] ? inet6_fill_ifla6_attrs+0x3de/0x400 [Fri Nov 24 10:17:06 2017] [<ffffffff810ac3da>] ? update_curr+0xba/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592204>] ? io_schedule_timeout+0xb4/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff810b82c5>] ? prepare_to_wait+0x55/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff815934a7>] ? bit_wait_io+0x17/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff81592f8a>] ? __wait_on_bit+0x5a/0x90 [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff8159303e>] ? out_of_line_wait_on_bit+0x7e/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff810b8630>] ? autoremove_wake_function+0x40/0x40 [Fri Nov 24 10:17:06 2017] [<ffffffffa0103b4f>] ? do_get_write_access+0x24f/0x480 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffff810dd3c4>] ? internal_add_timer+0x34/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81210737>] ? 
__find_get_block+0xa7/0x110 [Fri Nov 24 10:17:06 2017] [<ffffffff81210ee6>] ? __getblk_gfp+0x26/0x50 [Fri Nov 24 10:17:06 2017] [<ffffffffa01a81a3>] ? ext4_dirty_inode+0x43/0x60 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa0103dae>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffffa01d4356>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa01a484d>] ? ext4_reserve_inode_write+0x5d/0x80 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa01a48bf>] ? ext4_mark_inode_dirty+0x4f/0x210 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa01a81a3>] ? ext4_dirty_inode+0x43/0x60 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffff8120816a>] ? __mark_inode_dirty+0x17a/0x370 [Fri Nov 24 10:17:06 2017] [<ffffffff811f5b69>] ? generic_update_time+0x79/0xd0 [Fri Nov 24 10:17:06 2017] [<ffffffff811f53ad>] ? file_update_time+0xbd/0x110 [Fri Nov 24 10:17:06 2017] [<ffffffff8116bd69>] ? __generic_file_write_iter+0x99/0x1d0 [Fri Nov 24 10:17:06 2017] [<ffffffffa019b238>] ? ext4_file_write_iter+0x228/0x460 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffff81317500>] ? __percpu_counter_sum+0x60/0x70 [Fri Nov 24 10:17:06 2017] [<ffffffffa01b4d24>] ? ext4_statfs+0x104/0x140 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffff811dae0d>] ? __vfs_write+0xcd/0x120 [Fri Nov 24 10:17:06 2017] [<ffffffff811daeb3>] ? __kernel_write+0x53/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff810fb1d2>] ? do_acct_process+0x462/0x4e0 [Fri Nov 24 10:17:06 2017] [<ffffffff810fb8ac>] ? acct_process+0xdc/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c19e>] ? do_exit+0x79e/0xb10 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c589>] ? do_group_exit+0x39/0xb0 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c610>] ? SyS_exit_group+0x10/0x10 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:17:06 2017] INFO: task check_disk:4643 blocked for more than 120 seconds. 
[Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] check_disk D ffff88203ec75d80 0 4643 4642 0x00000002 [Fri Nov 24 10:17:06 2017] ffff8811f571c0c0 ffff8810385e6100 ffff88132982c000 ffff88132982be78 [Fri Nov 24 10:17:06 2017] ffff881036d69164 ffff8811f571c0c0 00000000ffffffff ffff881036d69168 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 ffff881036d69160 ffffffff81592e9a ffffffff81594a44 [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81592e9a>] ? schedule_preempt_disabled+0xa/0x10 [Fri Nov 24 10:17:06 2017] [<ffffffff81594a44>] ? __mutex_lock_slowpath+0xb4/0x120 [Fri Nov 24 10:17:06 2017] [<ffffffff81594acb>] ? mutex_lock+0x1b/0x30 [Fri Nov 24 10:17:06 2017] [<ffffffff810fb844>] ? acct_process+0x74/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c19e>] ? do_exit+0x79e/0xb10 [Fri Nov 24 10:17:06 2017] [<ffffffff811db503>] ? vfs_write+0x143/0x190 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c589>] ? do_group_exit+0x39/0xb0 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c610>] ? SyS_exit_group+0x10/0x10 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:17:06 2017] INFO: task sshd:4645 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[Fri Nov 24 10:17:06 2017] sshd D ffff88203ead5d80 0 4645 4644 0x00000100 [Fri Nov 24 10:17:06 2017] ffff8810372d0440 ffff881038558300 ffff8813a8890000 ffff8813a888fe78 [Fri Nov 24 10:17:06 2017] ffff881036d69164 ffff8810372d0440 00000000ffffffff ffff881036d69168 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 ffff881036d69160 ffffffff81592e9a ffffffff81594a44 [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81592e9a>] ? schedule_preempt_disabled+0xa/0x10 [Fri Nov 24 10:17:06 2017] [<ffffffff81594a44>] ? __mutex_lock_slowpath+0xb4/0x120 [Fri Nov 24 10:17:06 2017] [<ffffffff81594acb>] ? mutex_lock+0x1b/0x30 [Fri Nov 24 10:17:06 2017] [<ffffffff810fb844>] ? acct_process+0x74/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c19e>] ? do_exit+0x79e/0xb10 [Fri Nov 24 10:17:06 2017] [<ffffffff8100388c>] ? syscall_trace_enter_phase1+0x11c/0x150 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c589>] ? do_group_exit+0x39/0xb0 [Fri Nov 24 10:17:06 2017] [<ffffffff8107c610>] ? SyS_exit_group+0x10/0x10 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:17:06 2017] INFO: task cron:4646 blocked for more than 120 seconds. [Fri Nov 24 10:17:06 2017] Not tainted 4.4.0-3-amd64 #1 [Fri Nov 24 10:17:06 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Nov 24 10:17:06 2017] cron D ffff88103f375d80 0 4646 1394 0x00000000 [Fri Nov 24 10:17:06 2017] ffff882034540f40 ffff881038591180 ffff88124d7b8000 ffff88124d7b7a20 [Fri Nov 24 10:17:06 2017] 7fffffffffffffff ffffffff81593490 ffff88124d7b7aa0 ffff88103e81d298 [Fri Nov 24 10:17:06 2017] ffffffff81592c11 0000000000000000 ffffffff81595ba5 7fffffffffffffff [Fri Nov 24 10:17:06 2017] Call Trace: [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592c11>] ? 
schedule+0x31/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff81595ba5>] ? schedule_timeout+0x235/0x2d0 [Fri Nov 24 10:17:06 2017] [<ffffffff810b1ece>] ? find_busiest_group+0x3e/0x4f0 [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff81592204>] ? io_schedule_timeout+0xb4/0x130 [Fri Nov 24 10:17:06 2017] [<ffffffff810b82c5>] ? prepare_to_wait+0x55/0x80 [Fri Nov 24 10:17:06 2017] [<ffffffff815934a7>] ? bit_wait_io+0x17/0x60 [Fri Nov 24 10:17:06 2017] [<ffffffff81592f8a>] ? __wait_on_bit+0x5a/0x90 [Fri Nov 24 10:17:06 2017] [<ffffffff81593490>] ? bit_wait_timeout+0xa0/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff8159303e>] ? out_of_line_wait_on_bit+0x7e/0xa0 [Fri Nov 24 10:17:06 2017] [<ffffffff810b8630>] ? autoremove_wake_function+0x40/0x40 [Fri Nov 24 10:17:06 2017] [<ffffffffa0103b4f>] ? do_get_write_access+0x24f/0x480 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffffa0104299>] ? jbd2_journal_dirty_metadata+0x269/0x2c0 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffffa0103dae>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2] [Fri Nov 24 10:17:06 2017] [<ffffffffa01d4356>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa019de58>] ? __ext4_new_inode+0xb78/0x1410 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffffa01af645>] ? ext4_create+0x115/0x1b0 [ext4] [Fri Nov 24 10:17:06 2017] [<ffffffff811e8007>] ? vfs_create+0xb7/0x120 [Fri Nov 24 10:17:06 2017] [<ffffffff811ea2aa>] ? path_openat+0x140a/0x1520 [Fri Nov 24 10:17:06 2017] [<ffffffff811e5035>] ? terminate_walk+0x55/0xb0 [Fri Nov 24 10:17:06 2017] [<ffffffff8119808e>] ? do_set_pte+0x9e/0xd0 [Fri Nov 24 10:17:06 2017] [<ffffffff811eb581>] ? do_filp_open+0x91/0x100 [Fri Nov 24 10:17:06 2017] [<ffffffff811da5ba>] ? do_sys_open+0x13a/0x230 [Fri Nov 24 10:17:06 2017] [<ffffffff815969b6>] ? 
system_call_fast_compare_end+0xc/0x6b [Fri Nov 24 10:18:41 2017] megaraid_sas 0000:03:00.0: [ 0]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:18:46 2017] megaraid_sas 0000:03:00.0: [ 5]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:18:51 2017] megaraid_sas 0000:03:00.0: [10]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:18:56 2017] megaraid_sas 0000:03:00.0: [15]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:01 2017] megaraid_sas 0000:03:00.0: [20]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:06 2017] megaraid_sas 0000:03:00.0: [25]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:11 2017] megaraid_sas 0000:03:00.0: [30]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:16 2017] megaraid_sas 0000:03:00.0: [35]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:21 2017] megaraid_sas 0000:03:00.0: [40]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:26 2017] megaraid_sas 0000:03:00.0: [45]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:31 2017] megaraid_sas 0000:03:00.0: [50]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:36 2017] megaraid_sas 0000:03:00.0: [55]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:41 2017] megaraid_sas 0000:03:00.0: [60]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:46 2017] megaraid_sas 0000:03:00.0: [65]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:51 2017] megaraid_sas 0000:03:00.0: [70]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:19:56 2017] megaraid_sas 0000:03:00.0: [75]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:01 2017] megaraid_sas 0000:03:00.0: [80]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:06 2017] megaraid_sas 0000:03:00.0: [85]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:11 2017] megaraid_sas 0000:03:00.0: [90]waiting for 38 commands to 
complete for scsi0 [Fri Nov 24 10:20:16 2017] megaraid_sas 0000:03:00.0: [95]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:21 2017] megaraid_sas 0000:03:00.0: [100]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:26 2017] megaraid_sas 0000:03:00.0: [105]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:31 2017] megaraid_sas 0000:03:00.0: [110]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:36 2017] megaraid_sas 0000:03:00.0: [115]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:41 2017] megaraid_sas 0000:03:00.0: [120]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:46 2017] megaraid_sas 0000:03:00.0: [125]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:51 2017] megaraid_sas 0000:03:00.0: [130]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:20:56 2017] megaraid_sas 0000:03:00.0: [135]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:01 2017] megaraid_sas 0000:03:00.0: [140]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:06 2017] megaraid_sas 0000:03:00.0: [145]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:11 2017] megaraid_sas 0000:03:00.0: [150]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:16 2017] megaraid_sas 0000:03:00.0: [155]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:21 2017] megaraid_sas 0000:03:00.0: [160]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:26 2017] megaraid_sas 0000:03:00.0: [165]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:28 2017] megaraid_sas 0000:03:00.0: waitingfor controller reset to finish [Fri Nov 24 10:21:31 2017] megaraid_sas 0000:03:00.0: [170]waiting for 38 commands to complete for scsi0 [Fri Nov 24 10:21:33 2017] megaraid_sas 0000:03:00.0: waitingfor controller reset to finish [Fri Nov 24 10:21:36 2017] megaraid_sas 0000:03:00.0: [175]waiting for 38 commands to complete for scsi0 [Fri Nov 24 
10:21:38 2017] megaraid_sas 0000:03:00.0: waitingfor controller reset to finish [Fri Nov 24 10:21:41 2017] megaraid_sas 0000:03:00.0: pending commands remain after waiting, will reset adapter scsi0. [Fri Nov 24 10:21:41 2017] megaraid_sas 0000:03:00.0: resetting fusion adapter scsi0. [Fri Nov 24 10:21:43 2017] megaraid_sas 0000:03:00.0: waitingfor controller reset to finish [Fri Nov 24 10:21:48 2017] megaraid_sas 0000:03:00.0: Waiting for FW to come to ready state [Fri Nov 24 10:21:48 2017] megaraid_sas 0000:03:00.0: waitingfor controller reset to finish [Fri Nov 24 10:21:53 2017] megaraid_sas 0000:03:00.0: waitingfor controller reset to finish [Fri Nov 24 10:21:55 2017] megaraid_sas 0000:03:00.0: FW now in Ready state [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: Init cmd success [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: firmware type : Extended VD(240 VD)firmware [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: controller type : MR(1024MB) [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: Online Controller Reset(OCR) : Enabled [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: Secure JBOD support : No [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: Jbod map is not supported megasas_setup_jbod_map 4613 [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: Reset successful for scsi0. [Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: 2479 (2s/0x0020/CRIT) - Controller encountered a fatal error and was reset
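A dump like the one above is easier to triage once the hung-task reports are pulled out of it. The following is a minimal sketch, not a tool used in this task: the function names `hung_tasks`, `summarize` and `controller_was_reset` are hypothetical, and the patterns assume only the message wording visible in the dmesg output above.

```python
import re
from collections import Counter

# Kernel hung-task report as it appears in the dump, e.g.
# "INFO: task mysqld:2902 blocked for more than 120 seconds."
# The task name may itself contain colons ("rs:main Q:Reg"), so the name is
# matched lazily and the PID is the last ":<digits>" before " blocked".
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) blocked for more than \d+ seconds"
)

# Final megaraid_sas event line seen at the end of the dump.
FATAL_RESET = "Controller encountered a fatal error and was reset"


def hung_tasks(dmesg_text):
    """Return (task name, pid) for every hung-task report, in log order."""
    return [(m.group("name"), int(m.group("pid")))
            for m in HUNG_TASK.finditer(dmesg_text)]


def summarize(dmesg_text):
    """Count hung-task reports per task name."""
    return Counter(name for name, _ in hung_tasks(dmesg_text))


def controller_was_reset(dmesg_text):
    """True if the text contains the fatal-error reset message."""
    return FATAL_RESET in dmesg_text


if __name__ == "__main__":
    sample = (
        "[Fri Nov 24 10:17:06 2017] INFO: task jbd2/sda1-8:934 blocked for more than 120 seconds. "
        "[Fri Nov 24 10:17:06 2017] INFO: task mysqld:2902 blocked for more than 120 seconds. "
        "[Fri Nov 24 10:17:06 2017] INFO: task rs:main Q:Reg:167518 blocked for more than 120 seconds. "
        "[Fri Nov 24 10:21:56 2017] megaraid_sas 0000:03:00.0: 2479 (2s/0x0020/CRIT) - "
        "Controller encountered a fatal error and was reset"
    )
    for name, count in summarize(sample).items():
        print(name, count)
    print("controller reset:", controller_was_reset(sample))
```

Because the patterns do not rely on line breaks, the sketch works on the flattened text as pasted here as well as on raw `dmesg` output.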
Mentioned in SAL (#wikimedia-operations) [2017-11-24T11:48:04Z] <marostegui> Reboot es2018 after full-upgrade - T181293
Change 393211 merged by jenkins-bot:
[operations/mediawiki-config@master] maridb: depool es2018 after crash
Mentioned in SAL (#wikimedia-operations) [2017-11-24T11:50:39Z] <jynus@tin> Synchronized wmf-config/db-codfw.php: depool es2018 T181293 (duration: 00m 45s)
Mentioned in SAL (#wikimedia-operations) [2017-11-24T11:57:03Z] <marostegui> Disable puppet on es2018 - T181293
Change 393218 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Promote es2017 to master
Change 393218 merged by Marostegui:
[operations/puppet@production] mariadb: Promote es2017 to master
Mentioned in SAL (#wikimedia-operations) [2017-11-24T12:13:47Z] <marostegui> Enable GTID on es2018 - T181293
I would do a quick data check on enwiki around the time of the issue (compare.py) to see that no data has been lost, but other than that, this is fixed.
I compared enwiki from the last value at 171123 22:16:12 (155663487) up to the last one I just selected from the table (155702786).
No differences were found.
Servers compared: es2018 against es2019 and against the current master, es2017, as well as against the eqiad server es1019.
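The idea behind such a range comparison can be sketched as follows. This is not the actual compare.py, whose implementation is not shown in this task. It is a hedged illustration that assumes the rows in the compared range have already been fetched from each host as `(id, payload)` pairs; the function names and the chunk size are hypothetical.

```python
import hashlib


def chunk_checksums(rows, first_id, last_id, chunk=1000):
    """Map each chunk's starting id to a SHA-256 digest of its rows.

    rows: iterable of (id, payload_bytes) pairs from one host (assumed input).
    Only ids within [first_id, last_id] are considered.
    """
    digests = {}
    for row_id, payload in sorted(rows):
        if not first_id <= row_id <= last_id:
            continue
        start = first_id + ((row_id - first_id) // chunk) * chunk
        h = digests.setdefault(start, hashlib.sha256())
        # Length-prefix the payload so adjacent rows cannot be confused.
        h.update(f"{row_id}:{len(payload)}:".encode() + payload)
    return {start: h.hexdigest() for start, h in digests.items()}


def differing_chunks(reference_rows, other_rows, first_id, last_id, chunk=1000):
    """Return the starting ids of chunks whose contents differ between hosts."""
    ref = chunk_checksums(reference_rows, first_id, last_id, chunk)
    oth = chunk_checksums(other_rows, first_id, last_id, chunk)
    return sorted(s for s in ref.keys() | oth.keys() if ref.get(s) != oth.get(s))


if __name__ == "__main__":
    # Toy data only; the range bounds are the two values quoted in the comment above.
    first_id, last_id = 155663487, 155702786
    master = [(155663487, b"a"), (155663488, b"b")]
    replica = [(155663487, b"a"), (155663488, b"b")]
    print(differing_chunks(master, replica, first_id, last_id))
```

Comparing per-chunk digests rather than single rows keeps the amount of data exchanged small, and a mismatch narrows the search to one chunk. A missing row shows up the same way as a changed one, since the chunk digest differs in both cases.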
Change 407407 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Repool es2018 after maintenance
Change 407407 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Repool es2018 after maintenance