Thanks for adding the URL @1997kB!
Feb 4 2019
Feb 3 2019
You did it in the last all hands! :-)
I will walk you through it so you can fix it yourself entirely!
Can you also post the wiki with this rename request so we can check which wikis have more edits?
Feb 1 2019
Let's get it replaced sooner rather than later, as it is a master on m5.
Jan 31 2019
From what I can see it was failing on the call to: google.setOnLoadCallback(drawChart);
Jan 30 2019
Thanks!
I had a chat with Moritz about this; he was not too sure it would be a kernel issue itself (as in something really wrong with the kernel). It might be some sort of hardware issue, or just a one-off, although you mentioned it was tried several times.
And after 4 days of trying to alter mep_word_persistence, dbstore1002 crashed again (T213706#4917915).
So, basically, the table mep_word_persistence cannot be altered without making dbstore1002 crash. I guess that once we decide to fully migrate, we will need to convert that table to InnoDB after it has been moved to the new servers.
I will do a proof of concept to make sure it works on the new servers.
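The conversions discussed here boil down to `ALTER TABLE ... ENGINE=InnoDB` statements, one per table. A minimal sketch of how such a batch could be generated (an illustration only, not the actual script used here; the helper name is hypothetical):

```python
# Hypothetical helper: build one ALTER TABLE ... ENGINE=InnoDB statement
# per table, for a given schema. Table names below are the two Aria
# tables mentioned in this task.
def innodb_alter_statements(schema, tables):
    """Return a list of ALTER statements converting each table to InnoDB."""
    return [
        "ALTER TABLE `{}`.`{}` ENGINE=InnoDB;".format(schema, table)
        for table in tables
    ]

if __name__ == "__main__":
    for stmt in innodb_alter_statements("staging", ["mep_word_persistence", "organic_link"]):
        print(stmt)
```

Each statement would then be run through the mysql client on the target host; for very large tables (like mep_word_persistence above) the alter can run for days.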
Jan 28 2019
Go for it!
Thanks for checking it!
This might be more likely: T214838: ms-be1034 crash
The ALTER TABLE on the last Aria table in the staging database (mep_word_persistence) is still running after 3 days.
+1 to get rid of it:
root@dbstore2001:/srv/backups# ls -lhrt | tail -n5
drwx------ 2 dump dump  24K Feb 28  2018 s1.20180228121150
-rw-r--r-- 1 dump dump   86 Feb 28  2018 dump.s3.log
drwx------ 2 dump dump 7.7M Feb 28  2018 s3.20180228121150
-rw-r--r-- 1 dump dump    0 Feb 28  2018 dump.s4.log
drwx------ 2 dump dump  24K Feb 28  2018 s4.20180228121150
Jan 26 2019
All the tables on incubatorwiki are now InnoDB and replication is catching up.
s3 thread broke with:
Last_SQL_Error: Error 'Got error 22 "Invalid argument" from storage engine TokuDB' on query. Default database: 'incubatorwiki'.
Query: 'INSERT /* ActorMigration::getInsertValuesWithTempTable */ INTO `revision_actor_temp` (revactor_rev,revactor_actor,revactor_timestamp,revactor_page) VALUES xxxxx
Replicate_Ignore_Server_Ids:
@ssastry just to make sure we have all the data we need here (so it is easier and faster, and we can avoid mistakes), can you confirm the following info:
I believe there is nothing else pending here, and this was re-opened just to get an answer from Chris, which was done.
Going to close this; if someone feels it should remain open, feel free to reopen it!
@jcrespo +1 to reimage/reclone from an existing host (or mariabackup!)
I am going to close this for now, as it has been 10 days without issues:
root@db1115:~# free -g
              total        used        free      shared  buff/cache   available
Mem:            125          65           1           1          59          58
Swap:             0
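The "available" figure in that paste comes from the Mem: row of `free -g`. A small sketch (hypothetical helper, not part of any tooling mentioned in this task) of parsing that output programmatically:

```python
# Hypothetical helper: parse the "Mem:" row of `free -g` output into a
# dict keyed by the column headers (total, used, free, ...).
def parse_free_mem(free_output):
    """Return the Mem: row of `free` output as {header: value_in_GB}."""
    lines = free_output.strip().splitlines()
    headers = lines[0].split()           # total, used, free, shared, buff/cache, available
    mem_values = lines[1].split()[1:]    # skip the "Mem:" label
    return dict(zip(headers, (int(v) for v in mem_values)))

example = """              total        used        free      shared  buff/cache   available
Mem:            125          65           1           1          59          58"""
print(parse_free_mem(example)["available"])  # -> 58
```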
Jan 25 2019
Thanks!
18:15 <+icinga-wm> RECOVERY - HP RAID on db2068 is OK: OK: Slot 0: OK: 1I:1:1, 1I:1:2, 1I:1:3, 1I:1:4, 1I:1:5, 1I:1:6, 1I:1:7, 1I:1:8, 1I:1:9, 1I:1:10, 1I:1:11, 1I:1:12 - Controller: OK - Battery/Capacitor: OK
@Papaul let's get it replaced - thanks!
Jan 24 2019
In T212487#4906132, @MarkTraceur wrote: FWIW (I know I'm a little late on this) I think that illustration project was something we either never got off the ground, or haven't looked at in some time.
These are the only two tables left in Aria after today's alters:
root@dbstore1002.eqiad.wmnet[information_schema]> select TABLE_NAME from tables where ENGINE='Aria' and TABLE_SCHEMA='staging';
+----------------------+
| TABLE_NAME           |
+----------------------+
| mep_word_persistence |
| organic_link         |
+----------------------+
2 rows in set (0.05 sec)
@Addshore let us know that there is a "new" error that started happening today, which looks related to this thread (I think):
https://logstash.wikimedia.org/goto/018d06f1ac178c272964fa71b76702e1
s2 eqiad progress
- labsdb1011
- labsdb1010
- labsdb1009
- dbstore1004
- dbstore1002
- db1125
- db1122
- db1105
- db1103
- db1095
- db1090
- db1076
- db1074
- db1066
Jan 23 2019
All the tokudb tables on staging database have been migrated to InnoDB:
root@DBSTORE[information_schema]> select TABLE_SCHEMA,TABLE_NAME,UPDATE_TIME,TABLE_ROWS from tables where ENGINE='TokuDB' and TABLE_SCHEMA='staging' order by update_time desc;
Empty set (1.20 sec)
Thank you!
The following Aria tables have been converted to InnoDB on dbstore1002 on the staging database:
tbayer_test2 tbayer_test1 theodora tgr_uw_terminating_errors tgr_gather_user_requests top_2016_by_month tgr_revdel_tmp rad_labeled_user pageviews_by_country_language temp3 ve2_pilot_users wiki_month_registrations pageviews_per_project_country_v2 th2_experimental_user ve2_experimental_users tr_experimental_user rev_reverted_20k_sample tr_experimental_user_revision ve2_experimental_user_revision_stats woe_wiki_edit_count rev_ids_20k_sample overall_control_month_stats th_link_additions wikidata_nonbot_reverted_sample wikidata_nonbot_sample overall_token_stats_cleaned overall_token_stats revert_20150301_commonswiki tbayer_readnavtimesessions_20160107 tbayer_readnavtimesessions5sec_20160107 tbayer_readnavsessions_20160107 tbayer_test3 pentaho04 pentahoviews_countries pentahoviews pentahoviews05 temp resolved_organic_inlink_count revert_20150301_ptwiki resolved_inlink_count tbayer_readnavevents_20160107 yearly_page_edits record_impression revert_20150301_dewiki user_registration_approx
Aria tables on the staging database:
root@dbstore1002.eqiad.wmnet[staging]> select TABLE_SCHEMA,TABLE_NAME,UPDATE_TIME,TABLE_ROWS from information_schema.tables where ENGINE='aria' and TABLE_SCHEMA='staging' order by table_rows asc;
+--------------+-----------------------------------------+---------------------+------------+
| TABLE_SCHEMA | TABLE_NAME                              | UPDATE_TIME         | TABLE_ROWS |
+--------------+-----------------------------------------+---------------------+------------+
| staging      | tbayer_test2                            | 2019-01-14 13:57:19 |          8 |
| staging      | tbayer_test1                            | 2019-01-14 13:57:19 |         28 |
| staging      | theodora                                | 2019-01-14 13:57:37 |         79 |
| staging      | tgr_uw_terminating_errors               | 2019-01-14 13:57:37 |        110 |
| staging      | tgr_gather_user_requests                | 2019-01-14 13:57:37 |        123 |
| staging      | top_2016_by_month                       | 2019-01-14 14:07:43 |        240 |
| staging      | tgr_revdel_tmp                          | 2019-01-14 13:57:37 |        343 |
| staging      | rad_labeled_user                        | 2019-01-14 13:37:44 |       1063 |
| staging      | pageviews_by_country_language           | 2019-01-14 13:35:21 |       1228 |
| staging      | temp3                                   | 2019-01-14 13:57:23 |       2978 |
| staging      | ve2_pilot_users                         | 2019-01-14 14:08:52 |       4189 |
| staging      | wiki_month_registrations                | 2019-01-14 14:08:56 |      11184 |
| staging      | pageviews_per_project_country_v2        | 2019-01-14 13:37:21 |      12274 |
| staging      | th2_experimental_user                   | 2019-01-14 13:57:37 |      14766 |
| staging      | ve2_experimental_users                  | 2019-01-14 14:08:52 |      26971 |
| staging      | tr_experimental_user                    | 2019-01-14 14:07:44 |      41033 |
| staging      | rev_reverted_20k_sample                 | 2019-01-14 13:47:51 |      47289 |
| staging      | tr_experimental_user_revision           | 2019-01-14 14:07:44 |      50696 |
| staging      | ve2_experimental_user_revision_stats    | 2019-01-14 14:08:52 |      61541 |
| staging      | woe_wiki_edit_count                     | 2019-01-14 14:08:57 |      70877 |
| staging      | rev_ids_20k_sample                      | 2019-01-14 13:47:51 |      80000 |
| staging      | overall_control_month_stats             | 2019-01-14 13:21:55 |     407344 |
| staging      | th_link_additions                       | 2019-01-14 13:57:39 |     444682 |
| staging      | wikidata_nonbot_reverted_sample         | 2019-01-14 14:08:54 |     488536 |
| staging      | wikidata_nonbot_sample                  | 2019-01-14 14:08:56 |    1000000 |
| staging      | overall_token_stats_cleaned             | 2019-01-14 13:21:57 |    1028955 |
| staging      | overall_token_stats                     | 2019-01-14 13:22:00 |    1028955 |
| staging      | revert_20150301_commonswiki             | 2019-01-14 13:40:33 |    1460090 |
| staging      | tbayer_readnavtimesessions_20160107     | 2019-01-14 13:57:15 |    1498379 |
| staging      | tbayer_readnavtimesessions5sec_20160107 | 2019-01-14 13:57:19 |    1498379 |
| staging      | tbayer_readnavsessions_20160107         | 2019-01-14 13:57:12 |    1498379 |
| staging      | tbayer_test3                            | 2019-01-14 13:57:23 |    1510999 |
| staging      | pentaho04                               | 2019-01-14 13:37:26 |    1530114 |
| staging      | pentahoviews_countries                  | 2019-01-14 13:37:38 |    1679403 |
| staging      | pentahoviews                            | 2019-01-14 13:37:44 |    1679403 |
| staging      | pentahoviews05                          | 2019-01-14 13:37:32 |    1828373 |
| staging      | temp                                    | 2019-01-14 13:57:36 |    4686386 |
| staging      | resolved_organic_inlink_count           | 2019-01-14 13:40:27 |    4782450 |
| staging      | revert_20150301_ptwiki                  | 2019-01-14 13:41:47 |    4948260 |
| staging      | resolved_inlink_count                   | 2019-01-14 13:40:14 |    5044527 |
| staging      | tbayer_readnavevents_20160107           | 2019-01-14 13:57:05 |    7200310 |
| staging      | yearly_page_edits                       | 2019-01-14 14:09:28 |    8541686 |
| staging      | record_impression                       | 2019-01-14 13:38:48 |   13167822 |
| staging      | revert_20150301_dewiki                  | 2019-01-14 13:41:30 |   14081776 |
| staging      | user_registration_approx                | 2019-01-14 14:08:52 |   21283152 |
| staging      | referer_data                            | 2019-01-14 13:39:59 |   31056556 |
| staging      | pageviews04                             | 2019-01-14 13:33:05 |   43488232 |
| staging      | pageviews                               | 2019-01-14 13:37:20 |   43568621 |
| staging      | pageviews05                             | 2019-01-14 13:35:19 |   51109873 |
| staging      | revert_20150304_enwiki                  | 2019-01-14 13:47:49 |   87573047 |
| staging      | th_subst_template_additions             | 2019-01-14 14:07:32 |   94939126 |
| staging      | page_name_views_dupes                   | 2019-01-14 13:31:12 |  127987744 |
| staging      | sessions_enwiki_20150801                | 2019-01-14 13:56:47 |  154379416 |
| staging      | organic_link                            | 2019-01-14 13:12:06 |  209134095 |
| staging      | mep_word_persistence                    | 2019-01-14 12:34:42 |  435986042 |
+--------------+-----------------------------------------+---------------------+------------+
I have altered the following tables:
@elukey we need to get rid of TokuDB before importing on the final dbstore hosts.
These are the tables that currently run TokuDB on staging:
Maybe @MoritzMuehlenhoff can give some ideas
@Cmjohnson do you have any rough ETA for these?
Thanks!
s5 eqiad progress
- labsdb1011
- labsdb1010
- labsdb1009
- dbstore1003
- dbstore1002
- db1124
- db1113
- db1110
- db1102
- db1100
- db1097
- db1096
- db1082
- db1070
I have merged both changes after the review from @Krinkle (thanks!).
Let's see how it goes
Jan 22 2019
There is really not much we (DBAs) can do about this particular issue other than T172497; see also T203059#4896539.
db1098:3316 has some differences on change_tag table comparing it with the rest of the hosts on the section. I am going to get those fixed.
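Finding which rows differ between hosts is usually done by checksumming row ranges on each host and comparing the digests (e.g. with tools like pt-table-checksum). A toy sketch of that idea, using made-up change_tag-like rows (none of these names or values come from the actual hosts):

```python
# Toy illustration of range-checksum comparison between two replicas.
# Hash fixed-size chunks of (ordered) rows per host; chunks whose
# digests differ are where the divergent rows live.
import hashlib

def chunk_checksums(rows, chunk_size=2):
    """Checksum consecutive chunks of an ordered row list."""
    digests = []
    for i in range(0, len(rows), chunk_size):
        chunk = "".join(repr(r) for r in rows[i:i + chunk_size])
        digests.append(hashlib.sha256(chunk.encode()).hexdigest())
    return digests

# Made-up sample data: (tag_id, tag_name) rows from two hosts.
host_a = [(1, "mw-replace"), (2, "mobile edit"), (3, "mw-undo")]
host_b = [(1, "mw-replace"), (2, "mobile edit"), (3, "mw-rollback")]

diffs = [i for i, (a, b) in enumerate(zip(chunk_checksums(host_a), chunk_checksums(host_b))) if a != b]
print(diffs)  # -> [1]: only the second chunk differs, narrowing the fix to rows 3-4
```

Once the divergent range is identified, only those rows need to be re-synced on the bad host.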
This can now go, as tag_summary was fully dropped everywhere yesterday.
All done
From the databases' point of view it is all done (I just did a quick check to confirm).
Ah ok! Thanks :)
All the replication threads but x1 started fine.
I have fixed all the x1 rows that failed, and it has now caught up.
Another crash happened last night
Thread pointer: 0x0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x0 thread_stack 0x48000
mysys/stacktrace.c:247(my_print_stacktrace)[0xbdd6ee]
sql/signal_handler.cc:153(handle_fatal_signal)[0x73dc40]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7fe0261f7330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7fe02500bc37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7fe02500f028]
srv/srv0srv.cc:2200(srv_error_monitor_thread)[0x9870aa]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7fe0261ef184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fe0250d303d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
190122 00:19:05 mysqld_safe Number of processes running now: 0
190122 00:19:05 mysqld_safe mysqld restarted
190122  0:19:06 [Note] /opt/wmf-mariadb10/bin/mysqld (mysqld 10.0.22-MariaDB) starting as process 18005 ...
2019-01-22 00:19:06 7efde9a2e7c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
190122  0:19:06 [Note] InnoDB: Using mutexes to ref count buffer pool pages
190122  0:19:06 [Note] InnoDB: The InnoDB memory heap is disabled
190122  0:19:06 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
190122  0:19:06 [Note] InnoDB: Memory barrier is not used
190122  0:19:06 [Note] InnoDB: Compressed tables use zlib 1.2.8
190122  0:19:06 [Note] InnoDB: Using CPU crc32 instructions
190122  0:19:06 [Note] InnoDB: Initializing buffer pool, size = 18.0G
190122  0:19:07 [Note] InnoDB: Completed initialization of buffer pool
190122  0:19:07 [Note] InnoDB: Highest supported file format is Barracuda.
190122  0:19:07 [Note] InnoDB: Log scan progressed past the checkpoint lsn 99856990038248
190122  0:19:07 [Note] InnoDB: Database was not shutdown normally!
190122  0:19:07 [Note] InnoDB: Starting crash recovery.
190122  0:19:07 [Note] InnoDB: Reading tablespace information from the .ibd files...
190122  0:27:20 [Note] InnoDB: Restoring possible half-written data pages
190122  0:27:20 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 99856995280896
InnoDB: Doing recovery: scanned up to log sequence number 99857000523776
InnoDB: Doing recovery: scanned up to log sequence number 99857005766656
InnoDB: Doing recovery: scanned up to log sequence number 99857011009536
InnoDB: Doing recovery: scanned up to log sequence number 99857016252416
Jan 21 2019
Thanks for the heads up
Keep in mind that the Actor migration (T188327#4895206) is also ongoing.