
Pending maintenance on the eventlogging databases (db1046, db1047, dbstore1002, other dbstores)
Closed, Resolved (Public)

Description

As a followup of T119380, things that are pending:

  • Set a maintenance window for converting tables on db1046 to TokuDB. The maintenance requires going read-only for some days or coordinating a failover.

mysql> SELECT table_name, (DATA_LENGTH + INDEX_LENGTH)/1024/1024/1024 as `TOTAL SIZE (GB)`, ENGINE, CREATE_OPTIONS FROM information_schema.tables WHERE TABLE_SCHEMA='log' /* AND `ENGINE` <> 'TokuDB' */ ORDER BY (DATA_LENGTH + INDEX_LENGTH) DESC LIMIT 30;
+--------------------------------------------+------------------+--------+---------------------------------------------------+
| table_name                                 | TOTAL SIZE (GB)  | ENGINE | CREATE_OPTIONS                                    |
+--------------------------------------------+------------------+--------+---------------------------------------------------+
| MobileWebUIClickTracking_10742159          | 381.725290197879 | TokuDB | `compression`='tokudb_zlib'                       |
| MobileWebClickTracking_5929948             | 361.198562531732 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| Edit_11448630                              | 141.018585060723 | TokuDB | `compression`='tokudb_zlib'                       |
| PageContentSaveComplete_5588433            | 137.354089650325 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| MediaViewer_10867062                       | 133.406105597503 | TokuDB | `compression`='tokudb_zlib'                       |
| Edit_13457736                              | 121.433242797852 | InnoDB |                                                   |
| MobileWikiAppToCInteraction_10375484       |  81.189829871990 | TokuDB | `compression`='tokudb_zlib'                       |
| MobileWebEditing_8599025                   |  61.526363065466 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| DeprecatedUsage_7906187                    |  53.731984032318 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| MobileWikiAppArticleSuggestions_12443791   |  49.361083984375 | InnoDB |                                                   |
| MobileWebSectionUsage_14321266             |  49.022262573242 | InnoDB |                                                   |
| MobileWikiAppMediaGallery_10923135         |  29.499001029879 | TokuDB | `compression`='tokudb_zlib'                       |
| MobileWikiAppSearch_10641988               |  28.150512695313 | InnoDB |                                                   |
| NavigationTiming_10785754                  |  21.677765787579 | TokuDB | `compression`='tokudb_zlib'                       |
| MobileWikiAppToCInteraction_8461467        |  19.969530507922 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| MobileWikiAppArticleSuggestions_11448426   |  16.618270874023 | InnoDB |                                                   |
| ImageMetricsCorsSupport_11686678           |  16.049972534180 | InnoDB |                                                   |
| ContentTranslationCTA_11616099             |  15.547241210938 | InnoDB |                                                   |
| NavigationTiming_12405818                  |  15.501541137695 | InnoDB |                                                   |
| MultimediaViewerNetworkPerformance_7917896 |  14.237171442248 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| MobileWikiAppArticleSuggestions_10590869   |  14.102098437957 | TokuDB | `compression`='tokudb_zlib'                       |
| EchoInteraction_5782287                    |  13.824828155339 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| MediaViewer_8572637                        |  12.886538302526 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| MediaViewer_8245578                        |  12.405743772164 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| UniversalLanguageSelector_7327441          |  12.037833562121 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| PersonalBar_7829128                        |  11.579391132109 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| CentralAuth_5690875                        |  10.704098877497 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| PageCreation_7481635                       |  10.452265075408 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
| MediaViewer_10606177                       |   8.973897109739 | TokuDB | `compression`='tokudb_zlib'                       |
| Echo_7731316                               |   8.949344517663 | TokuDB | row_format=COMPRESSED `compression`='tokudb_zlib' |
+--------------------------------------------+------------------+--------+---------------------------------------------------+

  • Change the application so it creates the tables in TokuDB compressed format by default:
ENGINE=TokuDB row_format=COMPRESSED `compression`='tokudb_zlib'
  • Perform an automatic rotation of tables when they become too big for easier maintenance (right now it is impossible to perform online maintenance because some tables grow faster than the changes are applied)

Event Timeline

Nuria added a comment. Dec 7 2015, 7:29 PM

@jcrespo, ori: I can help plan this work as needed. Let me know when it is a good time for ops and we will announce a partial outage, the sooner the better.

jcrespo added a comment. Edited Dec 7 2015, 7:39 PM

Here is my feedback. Things I need:

  • Upgrade db1046 to the latest MySQL version (that means stopping the loading of data for less than ~2 hours), and apply the general configuration changes that are being rolled out to the other MySQL servers.
  • The most important part is ALTERing the InnoDB tables to convert and defragment them. For the largest tables I think it could take a day or so. While it could be done in parallel, we do not need to do so: we could lock only one table at a time for writes, and writes could continue on the other tables so that the overall process does not fall behind. There are 7 "large" tables. Once they are TokuDB, this process can be done fully online.
  • The changes to the CREATE TABLE executed by the application have to be rolled out before any of the above.

For the master maintenance, the analytics slaves would not be affected.

Aside from this week, and the 2 first of January, I am generally available (except typical holidays like Christmas or New Year).

Nuria added a comment. Dec 7 2015, 8:00 PM

I see, can we do this in stages?

Stage 1:
Upgrade db1046 to the latest MySQL.
We can do this as early as Wednesday; we just need to pick a 2-hour interval and communicate it.

Stage 2:
See how long it takes to ALTER an InnoDB table to convert and defragment it.
We pick the largest table, blacklist its schema so events are not going to it, and make the necessary changes on the DB. We see how long it took and, based on this, plan further work on the other large tables.

Stage 3:
ALTER the rest of the InnoDB tables to convert and defragment them.

Seems ok to me, although as I mentioned, I would prefer to schedule Stage 1 for next week.

Stage 2 and the rest should be blocked first by the application changes.

Nuria added a comment. Dec 7 2015, 8:47 PM

Seems ok to me, although as I mentioned, I would prefer to schedule Stage 1 for next week.

Ok, let's schedule Stage 1 for Tuesday next week? In a European timezone, right? Let me know what is a good time; while I am on PST, our folks in Europe can assist as needed.

Stage 2 and the rest should be blocked first by the application changes.

Not really. We have the ability to block a schema from publishing to the DB, so I think we can blacklist a schema and see how long the InnoDB defragmentation takes without any application changes. The rest of EventLogging will continue publishing normally.

Tuesday is ok.

We should really avoid the application from creating more InnoDB tables. If not, this conversion will be needed again, and again, and again.

Nuria added a comment. Dec 7 2015, 8:58 PM
This comment was removed by Nuria.
Nuria added a comment. Edited Dec 7 2015, 9:03 PM

We should really avoid the application from creating more InnoDB tables. If not, this conversion will be needed again, and again, and again.

Maybe I misunderstood this: do you want us to try to create TokuDB tables from now on? Our DB code just connects via the python-mysql driver and creates the table through the ORM if it doesn't exist. The only config we have in that regard is:

ENGINE_TABLE_OPTIONS = {
    'mysql': {
        'mysql_charset': 'utf8',
        'mysql_engine': 'InnoDB'
    }
}

We can change that to TokuDB when it is available, I guess, but we should probably try those changes in the beta cluster first?

Nuria added a subscriber: Ottomata. Dec 7 2015, 9:37 PM

So, as we discussed in real time, just 'mysql_engine': 'TokuDB' will work (in SQL syntax, CREATE TABLE ... ENGINE=TokuDB); it creates the TokuDB tables compressed by default.
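
As a rough illustration of how the option keys in the config relate to the SQL that ends up being issued, the sketch below renders such a dictionary into the trailing clause of a CREATE TABLE statement. The helper `render_table_options` and the `CLAUSES` mapping are hypothetical and written only for this example; this is not the ORM's actual code path, and it only understands the two keys shown in the config above.

```python
# Hypothetical helper, for illustration only: maps the two option keys from
# the EventLogging config onto the corresponding CREATE TABLE clauses.
ENGINE_TABLE_OPTIONS = {
    'mysql': {
        'mysql_charset': 'utf8',
        'mysql_engine': 'TokuDB',
    }
}

# Option key -> SQL clause template (assumed mapping, only these two keys).
CLAUSES = {
    'mysql_engine': 'ENGINE={}',
    'mysql_charset': 'DEFAULT CHARSET={}',
}


def render_table_options(options):
    """Return the table-options suffix of a CREATE TABLE statement."""
    parts = []
    for key in ('mysql_engine', 'mysql_charset'):
        if key in options:
            parts.append(CLAUSES[key].format(options[key]))
    return ' '.join(parts)


suffix = render_table_options(ENGINE_TABLE_OPTIONS['mysql'])
print('CREATE TABLE example (i INT PRIMARY KEY) ' + suffix + ';')
```

With the engine switched to TokuDB, the only difference from the previous statements is the ENGINE clause; compression is left to the server-side default, as described in the comment above.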

Nuria added a comment. Dec 8 2015, 12:52 AM

Ahem ... just tried this: 'mysql_engine': 'TokuDB' on EventLogging on labs and no, the table did not get created. But I am not sure whether @Ottomata finished the install of TokuDB.

Nuria claimed this task. Dec 8 2015, 4:55 PM
Nuria raised the priority of this task from Normal to High. Dec 8 2015, 9:26 PM
Nuria edited projects, added Analytics-Kanban; removed Analytics-Backlog.

Change 257738 had a related patch set uploaded (by Nuria):
Creating tables with engine TokuDB by default

https://gerrit.wikimedia.org/r/257738

Change 257738 merged by Nuria:
Creating tables with engine TokuDB by default

https://gerrit.wikimedia.org/r/257738

From the gerrit comments, I assume it did work, finally? What was the issue (so we check it on production)?

Yeah, apparently transparent huge pages and TokuDB don't like each other:

Transparent huge pages are enabled, according to /sys/kernel/mm/transparent_hugepage/enabled
151208 20:08:24 [ERROR] TokuDB: Huge pages are enabled, disable them before continuing

151208 20:08:24 [ERROR] ************************************************************
151208 20:08:24 [ERROR]
151208 20:08:24 [ERROR]                         @@@@@@@@@@@
151208 20:08:24 [ERROR]                       @@'         '@@
151208 20:08:24 [ERROR]                      @@    _     _  @@
151208 20:08:24 [ERROR]                      |    (.)   (.)  |
151208 20:08:24 [ERROR]                      |             ` |
151208 20:08:24 [ERROR]                      |        >    ' |
151208 20:08:24 [ERROR]                      |     .----.    |
151208 20:08:24 [ERROR]                      ..   |.----.|  ..
151208 20:08:24 [ERROR]                       ..  '      ' ..
151208 20:08:24 [ERROR]                         .._______,.
151208 20:08:24 [ERROR]
151208 20:08:24 [ERROR] TokuDB will not run with transparent huge pages enabled.
151208 20:08:24 [ERROR] Please disable them to continue.
151208 20:08:24 [ERROR] (echo never > /sys/kernel/mm/transparent_hugepage/enabled)
151208 20:08:24 [ERROR]
151208 20:08:24 [ERROR] ************************************************************
151208 20:08:24 [ERROR] Plugin 'TokuDB' init function returned error.
151208 20:08:24 [ERROR] Plugin 'TokuDB' registration as a STORAGE ENGINE failed.

I just did as it said (echo never > /sys/kernel/mm/transparent_hugepage/enabled) and then everything worked.
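
A check along these lines could be run before starting the server. This is only a sketch: the function names are made up for the example, it relies on the kernel's convention of bracketing the active mode in that sysfs file (for example `always madvise [never]`), and it treats `never` as the target simply because that is what the log message asks for.

```python
import os

# Path quoted in the TokuDB error log.
THP_PATH = '/sys/kernel/mm/transparent_hugepage/enabled'


def active_thp_mode(contents):
    """Return the bracketed (active) mode from the sysfs file contents.

    The kernel lists all modes and brackets the active one,
    e.g. 'always madvise [never]'.
    """
    for token in contents.split():
        if token.startswith('[') and token.endswith(']'):
            return token[1:-1]
    return None


def read_thp_mode(path=THP_PATH):
    """Read the active mode from sysfs, or None if the file is unavailable."""
    try:
        with open(path) as handle:
            return active_thp_mode(handle.read())
    except OSError:
        return None


mode = read_thp_mode()
if mode is None:
    print('THP status not available on this host')
elif mode == 'never':
    print('THP disabled')
else:
    print('THP mode is %r; the log asks for "never"' % mode)
```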

Oh yes, we have that puppetized on production.

Nuria moved this task from Next Up to In Progress on the Analytics-Kanban board. Dec 9 2015, 5:05 PM
Nuria added a comment. Edited Dec 9 2015, 8:31 PM

@jcrespo: Changed the configuration, but as far as I can see tables in prod are not being created.

How to repro:

curl https://en.wikipedia.org/beacon/event?%7B%22event%22%3A%20%7B%22message%22%3A%22hola%22%7D%2C%22revision%22%3A12174936%2C%22schema%22%3A%22Test%22%2C%22webHost%22%3A%22en.wikipedia.org%22%2C%22wiki%22%3A%22enwiki%22%7D

This should create a table like Test_12174936, as it is a valid event for that schema, but that is not happening. I am leaving the config on to let you troubleshoot. Please let us know what you find in the log.
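
For readers who want to see what the beacon request carries, the sketch below decodes the query string from the curl command and derives the expected table name. The `<schema>_<revision>` naming rule is inferred from this comment (schema Test, revision 12174936, table Test_12174936) rather than taken from the EventLogging code.

```python
import json
from urllib.parse import unquote

# Query string copied from the curl command, split only for line length.
QUERY = ('%7B%22event%22%3A%20%7B%22message%22%3A%22hola%22%7D%2C'
         '%22revision%22%3A12174936%2C%22schema%22%3A%22Test%22%2C'
         '%22webHost%22%3A%22en.wikipedia.org%22%2C%22wiki%22%3A%22enwiki%22%7D')

# The payload is URL-encoded JSON.
event = json.loads(unquote(QUERY))

# Assumed naming rule: schema name and revision joined by an underscore.
table_name = '{}_{}'.format(event['schema'], event['revision'])

print(json.dumps(event, indent=2, sort_keys=True))
print(table_name)
```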

We can work on this together at your convenience

I've done a quick test on db1046 and I can create TokuDB tables with no problem:

$ mysql -h db1046 log
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 202481080
Server version: 5.5.5-10.0.15-MariaDB-log Source distribution

Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE test (i int primary key) ENGINE=TokuDB;
Query OK, 0 rows affected (0.13 sec)

mysql> SELECT * FROM test;
Empty set (0.02 sec)

mysql> DROP TABLE test;
Query OK, 0 rows affected (0.06 sec)

I can also see the table Test_12174936 as created:

mysql> SHOW TABLES like 'Test\_%';
+-------------------------+
| Tables_in_log (Test\_%) |
+-------------------------+
| Test_10277037           |
| Test_12174936           |
| Test_8327132            |
+-------------------------+
3 rows in set (0.00 sec)

Nuria added a comment. Dec 10 2015, 4:41 PM

@jcrespo: Ah, it just took longer than I was expecting. Excellent then.
Just confirmed that it is indeed TokuDB with the right compression:

Test_12174936

CREATE TABLE Test_12174936 (
  `uuid` binary(32) NOT NULL,
  `clientIp` varbinary(191) DEFAULT NULL,
  `timestamp` varbinary(14) NOT NULL,
  `userAgent` varbinary(191) DEFAULT NULL,
  `webHost` varbinary(191) DEFAULT NULL,
  `wiki` varbinary(191) NOT NULL,
  `event_message` varbinary(191) NOT NULL,
  UNIQUE KEY `ix_Test_12174936_uuid` (`uuid`),
  KEY `ix_Test_12174936_timestamp` (`timestamp`),
  KEY `wiki_timestamp` (`wiki`,`timestamp`)
) ENGINE=TokuDB DEFAULT CHARSET=binary compression='tokudb_zlib'

Then, can we list the work to be done? I think this is it, but please confirm/correct:

Stage 1: Upgrade db1046 to the latest MySQL.

Stage 2: Convert one large table to TokuDB, see how long it takes, and decide on an outage window to convert the rest.

Stage 3: Convert all tables to TokuDB.

CHARSET=binary, didn't you used to create tables with utf8 charset? Most others are indeed utf8.

We can schedule a couple of hours for the mysql restart on Tuesday. At what time? After deciding, we should announce it ASAP.

The steps seem ok.

BTW, purging is currently disabled, it is pending to be resumed.

Nuria added a comment. Dec 10 2015, 4:56 PM

CHARSET=binary, didn't you used to create tables with utf8 charset?

We changed nothing in that regard; do you maybe have binary set up as the TokuDB default?

We can schedule a couple of hours for the mysql restart on Tuesday. At what time?

At your convenience. You let us know.

The server's default is binary, as that is what MediaWiki uses, but the software used to set utf8 in the table creation options.

At your convenience. You let us know.

Let's schedule 2 hours of downtime on m4-master for 2015-12-15 at 10:00 UTC. Do you want me to send the announcement to the analytics list?

Actually I see them as utf8 myself. I am not sure what server you are querying? But the right one has the right options:

           Name: Test_12174936
         Engine: TokuDB
        Version: 10
     Row_format: Dynamic
           Rows: 2
 Avg_row_length: 176
    Data_length: 352
Max_data_length: 9223372036854775807
   Index_length: 98
      Data_free: 24576
 Auto_increment: NULL
    Create_time: 2015-12-10 10:28:13
    Update_time: 2015-12-10 10:29:15
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: `compression`='tokudb_zlib'
        Comment:

Nuria added a comment. Dec 10 2015, 5:42 PM

I am not sure what server you are querying?

The slave, as I am going through 1002.

Let's schedule 2 hours of downtime on m4-master for 2015-12-15 at 10:00 UTC

Sounds good. If you send the e-mail, someone from our team in the EU timezone can work with you on that.

MySQL at db1046 has been upgraded and reconfigured: T121120.

MySQL at dbstore2002 has been upgraded and reconfigured, too.

Nuria added a comment. Edited Dec 15 2015, 8:05 PM

@jcrespo: Can we move to the next stage of DB maintenance?

I listed it above as:

See how long it takes to convert and defragment an InnoDB table.
We pick the largest table, blacklist its schema so events are not going to it, and make the necessary changes on the DB.
We see how long it took and, based on this, plan the length of the outage window.

Yes, we can choose one of the InnoDB tables mentioned on the description. Can you stop writes to a single table?

Nuria added a subscriber: mforns. Dec 18 2015, 7:17 PM

Yes, we can choose one of the InnoDB tables mentioned on the description. Can you stop writes to a single table?

Yes, that is easy enough for us to do.

Let us know the table and @mforns can work with you again when calling another outage window for this one table. Should we do that next week?

@jcrespo: Let us know which table you would like to convert to TokuDB first.

MobileWikiAppShareAFact_12588711

jcrespo added a comment. Edited Dec 22 2015, 12:16 PM

Maintenance on db1047 and dbstore2002 has been done.

Change 260595 had a related patch set uploaded (by Nuria):
Blacklisting (temporarily) MobileWikiAppShareAFact schema

https://gerrit.wikimedia.org/r/260595

Change 260595 merged by Ottomata:
Blacklisting (temporarily) MobileWikiAppShareAFact schema

https://gerrit.wikimedia.org/r/260595

MariaDB EVENTLOGGING m4 localhost log > ALTER TABLE MobileWikiAppShareAFact_12588711 ENGINE=TokuDB;
Query OK, 16632320 rows affected (1 hour 4 min 28.07 sec)             
Records: 16632320  Duplicates: 0  Warnings: 0

MariaDB EVENTLOGGING m4 localhost log > SHOW CREATE TABLE MobileWikiAppShareAFact_12588711\G
*************************** 1. row ***************************
       Table: MobileWikiAppShareAFact_12588711
Create Table: CREATE TABLE `MobileWikiAppShareAFact_12588711` (
  `uuid` char(32) NOT NULL,
  `clientIp` varchar(191) DEFAULT NULL,
  `timestamp` varchar(14) NOT NULL,
  `userAgent` varchar(191) DEFAULT NULL,
  `webHost` varchar(191) DEFAULT NULL,
  `wiki` varchar(191) NOT NULL,
  `event_action` varchar(191) NOT NULL,
  `event_appInstallID` varchar(191) NOT NULL,
  `event_article` varchar(191) NOT NULL,
  `event_pageID` bigint(20) NOT NULL,
  `event_revID` bigint(20) NOT NULL,
  `event_shareSessionToken` varchar(191) NOT NULL,
  `event_sharemode` varchar(191) DEFAULT NULL,
  `event_target` varchar(191) DEFAULT NULL,
  `event_text` varchar(191) DEFAULT NULL,
  `event_tutorialFeatureEnabled` tinyint(1) NOT NULL,
  `event_tutorialShown` bigint(20) NOT NULL,
  UNIQUE KEY `ix_MobileWikiAppShareAFact_12588711_uuid` (`uuid`),
  KEY `ix_MobileWikiAppShareAFact_12588711_timestamp` (`timestamp`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8 `compression`='tokudb_zlib'
1 row in set (0.00 sec)

The table has been compressed from 11 GB to 378 MB: https://grafana.wikimedia.org/dashboard/db/server-board?panelId=17&fullscreen&from=1450738800000&to=1450825199999&var-server=db1046&var-network=eth0

It took a bit over 1 hour, or 2.9 MB/s.

It would take 12-24 hours to convert the most problematic one, Edit_13457736. Do you want to do a batch of them at the same time?
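
The figures above can be reproduced with quick arithmetic. The sketch assumes that the 11 GB refers to the table size before conversion, that GB means 1024 MB, and that the conversion rate stays constant for larger tables; the last assumption is probably optimistic, which would explain why the estimate is given as a 12-24 hour range.

```python
MB_PER_GB = 1024  # assumption: binary units

# ALTER TABLE timing reported by the server: 1 hour 4 min 28.07 sec.
elapsed_s = 1 * 3600 + 4 * 60 + 28.07

# Size of MobileWikiAppShareAFact_12588711 before conversion.
source_mb = 11 * MB_PER_GB

# Observed conversion throughput in MB/s.
rate_mb_s = source_mb / elapsed_s

# Size of Edit_13457736 as listed in the task description (GB).
edit_gb = 121.433242797852

# Lower-bound estimate, assuming the same throughput holds.
edit_hours = edit_gb * MB_PER_GB / rate_mb_s / 3600

print('throughput: %.1f MB/s' % rate_mb_s)
print('Edit_13457736 estimate: %.1f hours' % edit_hours)
```

The result lands near the lower end of the quoted range, so it is best read as a minimum rather than a schedule.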

Change 260628 had a related patch set uploaded (by Nuria):
Restoring MobileWikiAppShareAFact schema to eventlogging stream

https://gerrit.wikimedia.org/r/260628

Change 260628 merged by Ottomata:
Restoring MobileWikiAppShareAFact schema to eventlogging stream

https://gerrit.wikimedia.org/r/260628

Nuria added a comment. Edited Dec 22 2015, 9:26 PM

@jcrespo: From the point of view of backfilling, it is easier to do all tables at once. Let us know when it is a suitable time to announce the outage. We will plan for 24 hours.

We will need to stop the MySQL consumer and the alarms that are triggered when this consumer lags too much.

These are the tables pending conversion. They total 425 GB, or ~41 hours of conversion (obviously it can be split, for example, in two):

$ mysql -A -BN -e "SELECT CONCAT ('ALTER TABLE \`', table_name, '\` ENGINE=TokuDB;') FROM information_schema.tables WHERE table_schema = 'log' and table_name rlike '^[a-zA-Z]+_[0-9]+$' AND engine='InnoDB'"
ALTER TABLE `BannerImpression_5329872` ENGINE=TokuDB;
ALTER TABLE `CentralNoticeBannerHistory_13447710` ENGINE=TokuDB;
ALTER TABLE `CompletionSuggestions_13424343` ENGINE=TokuDB;
ALTER TABLE `CompletionSuggestions_13630018` ENGINE=TokuDB;
ALTER TABLE `ContentTranslationCTA_11616099` ENGINE=TokuDB;
ALTER TABLE `ContentTranslationError_11767097` ENGINE=TokuDB;
ALTER TABLE `ContentTranslation_11628043` ENGINE=TokuDB;
ALTER TABLE `ContentTranslation_7146627` ENGINE=TokuDB;
ALTER TABLE `DidYouMean_13316693` ENGINE=TokuDB;
ALTER TABLE `DidYouMean_13800499` ENGINE=TokuDB;
ALTER TABLE `EchoMail_5467650` ENGINE=TokuDB;
ALTER TABLE `EchoPrefUpdate_5488876` ENGINE=TokuDB;
ALTER TABLE `Echo_5285750` ENGINE=TokuDB;
ALTER TABLE `Echo_5364744` ENGINE=TokuDB;
ALTER TABLE `Echo_5423520` ENGINE=TokuDB;
ALTER TABLE `Echo_6081131` ENGINE=TokuDB;
ALTER TABLE `Echo_7572295` ENGINE=TokuDB;
ALTER TABLE `EditConflict_8860941` ENGINE=TokuDB;
ALTER TABLE `Edit_10604157` ENGINE=TokuDB;
ALTER TABLE `Edit_11319708` ENGINE=TokuDB;
ALTER TABLE `Edit_13457736` ENGINE=TokuDB;
ALTER TABLE `Edit_5563071` ENGINE=TokuDB;
ALTER TABLE `Edit_5570274` ENGINE=TokuDB;
ALTER TABLE `EditorActivation_14208837` ENGINE=TokuDB;
ALTER TABLE `ExtDistDownloads_12369387` ENGINE=TokuDB;
ALTER TABLE `FlowReplies_10512128` ENGINE=TokuDB;
ALTER TABLE `FlowReplies_10560491` ENGINE=TokuDB;
ALTER TABLE `FlowReplies_10561344` ENGINE=TokuDB;
ALTER TABLE `GatherClicks_11639881` ENGINE=TokuDB;
ALTER TABLE `GatherClicks_11770314` ENGINE=TokuDB;
ALTER TABLE `GatherClicks_12114785` ENGINE=TokuDB;
ALTER TABLE `GatherFlags_11793295` ENGINE=TokuDB;
ALTER TABLE `GenderSurvey_5607845` ENGINE=TokuDB;
ALTER TABLE `GeoFeatures_12518424` ENGINE=TokuDB;
ALTER TABLE `GeoFeatures_12914994` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbarNoArticle_5483117` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5406748` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5451155` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5451601` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5451764` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5457535` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5467309` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5491042` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5491218` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5496876` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5588411` ENGINE=TokuDB;
ALTER TABLE `GettingStartedNavbar_5588671` ENGINE=TokuDB;
ALTER TABLE `GettingStartedOnRedirect_5726087` ENGINE=TokuDB;
ALTER TABLE `GettingStartedOnRedirect_5731579` ENGINE=TokuDB;
ALTER TABLE `GettingStartedOnRedirect_5757712` ENGINE=TokuDB;
ALTER TABLE `GettingStartedOnRedirect_5928325` ENGINE=TokuDB;
ALTER TABLE `GettingStartedOnRedirect_5944134` ENGINE=TokuDB;
ALTER TABLE `GettingStartedRedirectImpression_7188688` ENGINE=TokuDB;
ALTER TABLE `GettingStartedRedirectImpression_7257235` ENGINE=TokuDB;
ALTER TABLE `GettingStartedRedirectImpression_7350115` ENGINE=TokuDB;
ALTER TABLE `GettingStartedRedirectImpression_7355552` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_4993508` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5219269` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5243394` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5285688` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5285747` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5285779` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5319200` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5320430` ENGINE=TokuDB;
ALTER TABLE `GettingStarted_5359086` ENGINE=TokuDB;
ALTER TABLE `GuidedTourButtonClick_13869649` ENGINE=TokuDB;
ALTER TABLE `GuidedTourButtonClick_8221559` ENGINE=TokuDB;
ALTER TABLE `GuidedTourButtonClick_8690550` ENGINE=TokuDB;
ALTER TABLE `GuidedTourExited_8208763` ENGINE=TokuDB;
ALTER TABLE `GuidedTourExited_8690566` ENGINE=TokuDB;
ALTER TABLE `GuidedTourExternalLinkActivation_8208762` ENGINE=TokuDB;
ALTER TABLE `GuidedTourExternalLinkActivation_8690560` ENGINE=TokuDB;
ALTER TABLE `GuidedTourGuiderHidden_8208754` ENGINE=TokuDB;
ALTER TABLE `GuidedTourGuiderHidden_8690549` ENGINE=TokuDB;
ALTER TABLE `GuidedTourGuiderImpression_8208752` ENGINE=TokuDB;
ALTER TABLE `GuidedTourGuiderImpression_8694395` ENGINE=TokuDB;
ALTER TABLE `GuidedTourInternalLinkActivation_8208760` ENGINE=TokuDB;
ALTER TABLE `GuidedTourInternalLinkActivation_8690553` ENGINE=TokuDB;
ALTER TABLE `GuidedTour_4972209` ENGINE=TokuDB;
ALTER TABLE `GuidedTour_5222838` ENGINE=TokuDB;
ALTER TABLE `HttpsSupport_11437897` ENGINE=TokuDB;
ALTER TABLE `HttpsSupport_11518527` ENGINE=TokuDB;
ALTER TABLE `HttpsSupport_5712722` ENGINE=TokuDB;
ALTER TABLE `HttpsSupport_5731023` ENGINE=TokuDB;
ALTER TABLE `ImageMetricsCorsSupport_10884476` ENGINE=TokuDB;
ALTER TABLE `ImageMetricsCorsSupport_11686678` ENGINE=TokuDB;
ALTER TABLE `ImageMetricsLoadingTime_10078363` ENGINE=TokuDB;
ALTER TABLE `JQMigrateUsage_8773447` ENGINE=TokuDB;
ALTER TABLE `MediaViewerPerf_6636500` ENGINE=TokuDB;
ALTER TABLE `MediaViewer_6054199` ENGINE=TokuDB;
ALTER TABLE `MediaViewer_6055641` ENGINE=TokuDB;
ALTER TABLE `MediaViewer_6066908` ENGINE=TokuDB;
ALTER TABLE `MediaViewer_6636420` ENGINE=TokuDB;
ALTER TABLE `MediaViewer_7670440` ENGINE=TokuDB;
ALTER TABLE `MediaWikiException_5286143` ENGINE=TokuDB;
ALTER TABLE `MediaWikiException_5286145` ENGINE=TokuDB;
ALTER TABLE `MobileAppCategorizationAttempts_5359208` ENGINE=TokuDB;
ALTER TABLE `MobileAppLoginAttempts_5254859` ENGINE=TokuDB;
ALTER TABLE `MobileAppLoginAttempts_5257721` ENGINE=TokuDB;
ALTER TABLE `MobileAppShareAttempts_5346170` ENGINE=TokuDB;
ALTER TABLE `MobileAppTrackingChange_5369400` ENGINE=TokuDB;
ALTER TABLE `MobileAppTrackingChange_5412592` ENGINE=TokuDB;
ALTER TABLE `MobileAppUploadAttempts_5241449` ENGINE=TokuDB;
ALTER TABLE `MobileAppUploadAttempts_5257716` ENGINE=TokuDB;
ALTER TABLE `MobileAppUploadAttempts_5334329` ENGINE=TokuDB;
ALTER TABLE `MobileBetaWatchlist_4921083` ENGINE=TokuDB;
ALTER TABLE `MobileBetaWatchlist_4961357` ENGINE=TokuDB;
ALTER TABLE `MobileBetaWatchlist_5235429` ENGINE=TokuDB;
ALTER TABLE `MobileLeftNavbarEditCTA_6792179` ENGINE=TokuDB;
ALTER TABLE `MobileLeftNavbarEditCTA_7074652` ENGINE=TokuDB;
ALTER TABLE `MobileOperatorCode_8365469` ENGINE=TokuDB;
ALTER TABLE `MobileOptionsTracking_14003392` ENGINE=TokuDB;
ALTER TABLE `MobileOptionsTracking_8101982` ENGINE=TokuDB;
ALTER TABLE `MobileWatchlistInteraction_5677898` ENGINE=TokuDB;
ALTER TABLE `MobileWatchlistInteraction_5678021` ENGINE=TokuDB;
ALTER TABLE `MobileWebBrowse_12119641` ENGINE=TokuDB;
ALTER TABLE `MobileWebCentralAuthError_5294684` ENGINE=TokuDB;
ALTER TABLE `MobileWebCta_5972486` ENGINE=TokuDB;
ALTER TABLE `MobileWebCta_5972684` ENGINE=TokuDB;
ALTER TABLE `MobileWebDiffClickTracking_10720373` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_5454549` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_5518026` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_5644223` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_6077315` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_6626343` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_6637866` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_7667035` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_7675117` ENGINE=TokuDB;
ALTER TABLE `MobileWebEditing_8593886` ENGINE=TokuDB;
ALTER TABLE `MobileWebInfobox_6213587` ENGINE=TokuDB;
ALTER TABLE `MobileWebInfobox_6221064` ENGINE=TokuDB;
ALTER TABLE `MobileWebMainMenuClickTracking_11568715` ENGINE=TokuDB;
ALTER TABLE `MobileWebSectionUsage_14321266` ENGINE=TokuDB;
ALTER TABLE `MobileWebUploads_5246000` ENGINE=TokuDB;
ALTER TABLE `MobileWebUploads_5281063` ENGINE=TokuDB;
ALTER TABLE `MobileWebUploads_5383883` ENGINE=TokuDB;
ALTER TABLE `MobileWebUploads_7967082` ENGINE=TokuDB;
ALTER TABLE `MobileWebUploads_8209043` ENGINE=TokuDB;
ALTER TABLE `MobileWebWatching_11761466` ENGINE=TokuDB;
ALTER TABLE `MobileWebWatchlistClickTracking_10720361` ENGINE=TokuDB;
ALTER TABLE `MobileWebWikiGrokError_10352248` ENGINE=TokuDB;
ALTER TABLE `MobileWebWikiGrokError_10353516` ENGINE=TokuDB;
ALTER TABLE `MobileWebWikiGrokError_11517270` ENGINE=TokuDB;
ALTER TABLE `MobileWebWikiGrokResponse_10278938` ENGINE=TokuDB;
ALTER TABLE `MobileWebWikiGrokResponse_10352279` ENGINE=TokuDB;
ALTER TABLE `MobileWebWikiGrok_10735928` ENGINE=TokuDB;
ALTER TABLE `MobileWebWikiGrok_9913839` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppAppearanceSettings_10375462` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppAppearanceSettings_9378399` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppArticleSuggestions_11448426` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppArticleSuggestions_12443791` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppBannerClickThrough_13295306` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppCreateAccount_8240702` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppCreateAccount_9135391` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppDailyStats_12637385` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppEdit_8198182` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppEdit_8993428` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppEdit_8994704` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppEdit_9003125` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppFindInPage_14586774` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppInstallReferrer_12601905` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppLangSelect_12588733` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppLinkPreview_12014128` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppLinkPreview_12143205` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppLinkPreview_14095177` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppLogin_8234533` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppLogin_9135390` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppMediaGallery_10914526` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppMediaGallery_12588701` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppNavMenu_12732211` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppOnboarding_9122680` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppOperatorCode_8983918` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppPageScroll_14591606` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppProtectedEditAttempt_8682497` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppReadingAction_8233801` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppSavedPages_10375480` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppSavedPages_8909354` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppSearch_10593635` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppSearch_10633564` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppSearch_10641988` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppSessions_14031591` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppSessions_9742902` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppShareAFact_10916168` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppShareAFact_11331974` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppStuffHappens_8955468` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppTabs_12453651` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppToCInteraction_11014396` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppToCInteraction_14585319` ENGINE=TokuDB;
ALTER TABLE `MobileWikiAppWidgets_11312870` ENGINE=TokuDB;
ALTER TABLE `ModuleLoadFailure_12407847` ENGINE=TokuDB;
ALTER TABLE `ModuleStorage_6978194` ENGINE=TokuDB;
ALTER TABLE `MultimediaTiming_7193302` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerAttribution_9758179` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerDimensions_10014238` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerDuration_10427980` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerDuration_8318615` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerDuration_8572641` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerNetworkPerformance_10596581` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerNetworkPerformance_11030254` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerNetworkPerformance_12458951` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerNetworkPerformance_7393226` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerNetworkPerformance_7488625` ENGINE=TokuDB;
ALTER TABLE `MultimediaViewerVersusPageFilePerformance_7907636` ENGINE=TokuDB;
ALTER TABLE `NavigationTiming_12405818` ENGINE=TokuDB;
ALTER TABLE `NavigationTiming_13317958` ENGINE=TokuDB;
ALTER TABLE `NavigationTiming_13332008` ENGINE=TokuDB;
ALTER TABLE `NavigationTiming_14899847` ENGINE=TokuDB;
ALTER TABLE `NavigationTiming_5323808` ENGINE=TokuDB;
ALTER TABLE `NavigationTiming_5333197` ENGINE=TokuDB;
ALTER TABLE `NewEditorEdit_6792669` ENGINE=TokuDB;
ALTER TABLE `PageDeletion_7481655` ENGINE=TokuDB;
ALTER TABLE `PageMove_7495717` ENGINE=TokuDB;
ALTER TABLE `PageRestoration_7758372` ENGINE=TokuDB;
ALTER TABLE `Popups_11625443` ENGINE=TokuDB;
ALTER TABLE `Popups_7536956` ENGINE=TokuDB;
ALTER TABLE `PrefUpdate_5563398` ENGINE=TokuDB;
ALTER TABLE `Sandbox_5500713` ENGINE=TokuDB;
ALTER TABLE `Search_11670541` ENGINE=TokuDB;
ALTER TABLE `Search_14361785` ENGINE=TokuDB;
ALTER TABLE `SendBeaconReliability_10676430` ENGINE=TokuDB;
ALTER TABLE `SendBeaconReliability_10735916` ENGINE=TokuDB;
ALTER TABLE `ServerSideAccountCreation_5014296` ENGINE=TokuDB;
ALTER TABLE `ServerSideAccountCreation_5150394` ENGINE=TokuDB;
ALTER TABLE `ServerSideAccountCreation_5233795` ENGINE=TokuDB;
ALTER TABLE `ServerSideAccountCreation_5487345` ENGINE=TokuDB;
ALTER TABLE `SignupExpAccountCreationComplete_8539421` ENGINE=TokuDB;
ALTER TABLE `SignupExpAccountCreationImpression_8539445` ENGINE=TokuDB;
ALTER TABLE `SignupExpCTAButtonClick_8965028` ENGINE=TokuDB;
ALTER TABLE `SignupExpCTAImpression_8965023` ENGINE=TokuDB;
ALTER TABLE `SignupExpPageLinkClick_8101692` ENGINE=TokuDB;
ALTER TABLE `SignupExpPageLinkClick_8965014` ENGINE=TokuDB;
ALTER TABLE `SimpleBanner_11163678` ENGINE=TokuDB;
ALTER TABLE `StatsD_5815068` ENGINE=TokuDB;
ALTER TABLE `TaskRecommendationClick_9266317` ENGINE=TokuDB;
ALTER TABLE `TaskRecommendationImpression_9266226` ENGINE=TokuDB;
ALTER TABLE `TaskRecommendationLightbulbClick_9266338` ENGINE=TokuDB;
ALTER TABLE `TaskRecommendationLightbulbClick_9433256` ENGINE=TokuDB;
ALTER TABLE `TaskRecommendation_9266319` ENGINE=TokuDB;
ALTER TABLE `TestSearchSatisfaction_12423691` ENGINE=TokuDB;
ALTER TABLE `Test_10277037` ENGINE=TokuDB;
ALTER TABLE `Test_8327132` ENGINE=TokuDB;
ALTER TABLE `TrackedPageContentSaveComplete_8535426` ENGINE=TokuDB;
ALTER TABLE `UniversalLanguageSelector_5510627` ENGINE=TokuDB;
ALTER TABLE `UniversalLanguageSelector_5729800` ENGINE=TokuDB;
ALTER TABLE `UploadWizardErrorFlowEvent_11772725` ENGINE=TokuDB;
ALTER TABLE `UploadWizardErrorFlowEvent_9924376` ENGINE=TokuDB;
ALTER TABLE `UploadWizardExceptionFlowEvent_11717009` ENGINE=TokuDB;
ALTER TABLE `UploadWizardExceptionFlowEvent_11772722` ENGINE=TokuDB;
ALTER TABLE `UploadWizardFlowEvent_11562780` ENGINE=TokuDB;
ALTER TABLE `UploadWizardFlowEvent_11772723` ENGINE=TokuDB;
ALTER TABLE `UploadWizardFlowEvent_8851807` ENGINE=TokuDB;
ALTER TABLE `UploadWizardStep_11772724` ENGINE=TokuDB;
ALTER TABLE `UploadWizardStep_8612364` ENGINE=TokuDB;
ALTER TABLE `UploadWizardStep_8851805` ENGINE=TokuDB;
ALTER TABLE `UploadWizardTutorialActions_5803466` ENGINE=TokuDB;
ALTER TABLE `UploadWizardUploadActions_5811620` ENGINE=TokuDB;
ALTER TABLE `UploadWizardUploadFlowEvent_11772717` ENGINE=TokuDB;
ALTER TABLE `UploadWizardUploadFlowEvent_9651951` ENGINE=TokuDB;
ALTER TABLE `VisualEditorDOMRetrieved_5961496` ENGINE=TokuDB;
ALTER TABLE `VisualEditorDOMSaved_6063754` ENGINE=TokuDB;
ALTER TABLE `WikimediaBlogVisit_5308166` ENGINE=TokuDB;
ALTER TABLE `WikipediaPortal_14377354` ENGINE=TokuDB;
Nuria added a comment.Dec 23 2015, 4:39 PM

@jcrespo: I see. What about scheduling an outage on Monday and Tuesday next week (28 and 29)?

jcrespo added a comment.EditedDec 23 2015, 4:40 PM

I'm OK with that. I hope you have taken into account that the 41 hours will require around 3 days, including enabling and disabling the tables on eventlogging.

Thinking about it again, and given how late in the day it already is, I would reschedule it for January 19th.

Nuria added a comment.Dec 28 2015, 4:00 PM

Thinking about it again, and given how late in the day it already is, I would reschedule it for January 19th.

Sounds good, let's do it then.

Nuria moved this task from In Progress to Paused on the Analytics-Kanban board.Jan 13 2016, 4:01 PM

BTW, a small nitpick: we commonly use "outage" when we have had an unexpected loss of service, and "scheduled maintenance" when we degrade the service in a controlled, predictable way. While both can have bad consequences, during scheduled maintenance users are warned well in advance, which tends to reduce the impact somewhat.

Nuria added a comment.Jan 18 2016, 5:58 PM

Sorry, let's do this maintenance after we have taken a look at the replication issues, right?

Nuria added a comment.Jan 19 2016, 3:58 PM

@jcrespo: let us know if Wednesday is a good day to do this scheduled maintenance and we will proceed to announce it.

Let's announce it for Thursday (if you are ok with it), my backlog is larger than I thought :-)

Do you want to do the whole block in one go, or only a smaller block? Even if we start on Wednesday, it will probably not have finished by Friday. We can cancel it every X hours, as most tables will take less than 1-2 hours to convert. So my suggestion would be to use the weekend and enable the process on Monday, as I believe fewer people will be impacted by it or notice it.

Let's adapt to when you on the Analytics team will be available to stop and restart the writing process on the application side. The database changes themselves require little monitoring, so they can be done during the weekend with no problem.

Nuria added a comment.EditedJan 19 2016, 9:09 PM

@jcrespo: Thursday sounds good. We will announce it. cc @Ottomata to confirm that we are not deploying any changes to EL that would prevent the scheduled maintenance.

Nuria added a comment.Jan 20 2016, 6:46 PM

@jcrespo: let's do this hardware update before the conversion, OK? https://phabricator.wikimedia.org/T123546

Ottomata added a subscriber: madhuvishy.EditedJan 20 2016, 8:27 PM

Just worked this out in IRC. The downtime will start at Jan 21 16:00 UTC. @madhuvishy will email the analytics list. @jcrespo will begin the migration after we finish T123546.

Change 265506 had a related patch set uploaded (by Ottomata):
Temporarily disable eventlogging mysql consumers and burrow monitoring for them

https://gerrit.wikimedia.org/r/265506

Change 265506 merged by Ottomata:
Temporarily disable eventlogging mysql consumers and burrow monitoring for them

https://gerrit.wikimedia.org/r/265506

EventLogging MySQL processes are stopped, downtime is scheduled, folks have been notified. Proceed! :)

my log < /srv/tmp/convert_innodb_to_tokudb.sql
sleep(20)
0
sleep(20)
0
sleep(20)
0
sleep(20)
0
[detached from 22307.tokudb]
root@db1046:/etc/mysql/ssl$ mysql -e "SHOW FULL PROCESSLIST"
+---------+----------+--------------------+--------------------+---------+------+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+----------+
| Id      | User     | Host               | db                 | Command | Time | State                                                  | Info                                                                                                       | Progress |
+---------+----------+--------------------+--------------------+---------+------+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+----------+
...
| 5081484 | root     | localhost          | log                | Query   |  282 | Fetched about 2567000 rows, loading data still remains | ALTER TABLE `ContentTranslationCTA_11616099` ENGINE=TokuDB                                                 |    7.516 |
| 5083432 | root     | localhost          | NULL               | Query   |    0 | init                                                   | SHOW FULL PROCESSLIST                                                                                      |    0.000 |
+---------+----------+--------------------+--------------------+---------+------+--------------------------------------------------------+------------------------------------------------------------------------------------------------------------+----------+

If my previous cryptic message was not clear: it means that the maintenance has started successfully. I will try to post another update tomorrow with a more accurate estimate of the remaining time.

I've added a 20-second sleep after each table so that, if we need to cancel early, it can be done between table conversions (cancelling in the middle of an ALTER is a problem).
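As an illustration of the sleep-between-tables idea, here is a minimal sketch of how such a script could be generated. It is not the script that was actually run: the function name `build_conversion_script` and its parameters are hypothetical, and the only parts taken from this task are the `ALTER TABLE ... ENGINE=TokuDB` statements and the `sleep(20)` pause visible in the output above.

```python
def build_conversion_script(tables, pause_seconds=20):
    """Return SQL that converts each table to TokuDB, pausing after each one.

    Hypothetical helper: it only builds text and never talks to a server.
    """
    lines = []
    for table in tables:
        # A backtick inside an identifier is escaped by doubling it.
        quoted = "`" + table.replace("`", "``") + "`"
        lines.append(f"ALTER TABLE {quoted} ENGINE=TokuDB;")
        # The pause leaves a window in which the run can be stopped
        # between tables instead of in the middle of an ALTER.
        lines.append(f"SELECT sleep({pause_seconds});")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(build_conversion_script(["Popups_7536956", "Search_14361785"]), end="")
```

The generated text could then be saved to a file and fed to the mysql client in the same way as the conversion file above.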

I will want to restart mysql after the maintenance to apply some extra updates (ssl support).

It is now executing ALTER 132/261, although that may be a bit misleading because it has already converted Edit_13457736, the largest non-TokuDB table on the server. So I think it will take the 41 hours predicted.
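To illustrate why the statement count can mislead, here is a rough sketch comparing progress by number of tables with progress by data size. The function name `progress` is hypothetical, and so are all the table sizes below except the roughly 121.4 GB reported earlier for Edit_13457736. It also assumes that conversion time scales with table size, which is only an approximation.

```python
def progress(sizes_gb, converted):
    """Return (fraction of tables done, fraction of data done).

    sizes_gb:  dict mapping table name to its size in GB
    converted: set of table names already converted
    """
    done_count = sum(1 for name in sizes_gb if name in converted)
    done_size = sum(size for name, size in sizes_gb.items() if name in converted)
    return done_count / len(sizes_gb), done_size / sum(sizes_gb.values())


if __name__ == "__main__":
    # Only the Edit_13457736 size comes from this task; the rest are made up.
    sizes = {
        "Edit_13457736": 121.4,
        "Popups_7536956": 2.0,
        "Search_14361785": 1.0,
        "StatsD_5815068": 0.6,
    }
    by_count, by_size = progress(sizes, {"Edit_13457736"})
    print(f"by count: {by_count:.0%}, by size: {by_size:.0%}")
```

With these made-up numbers, one table out of four is only a quarter of the work by count but the large majority of it by size, which is the effect described above.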

I want to point out the reduction in disk usage from converting the Edit table: it is a considerable saving of disk space, and consequently of other resources that cost $$$ (plus better performance).

The maintenance for db1046 has finished. We have saved over 600GB of space.

The application can start again. I will see on Monday what is left to do regarding maintenance, but we have done the most important parts already.

Cool, totally missed this on Saturday! Starting them now...

Nuria moved this task from Paused to In Progress on the Analytics-Kanban board.Jan 25 2016, 5:06 PM

We are backfilling events that were not inserted during the scheduled maintenance window.

So, some of the pending tasks to discuss/solve:

  • With the new application configuration (almost all tables on TokuDB), we may need to tune our InnoDB parameters (especially the buffer pool), but that all depends on the schema used by non-log tables (I think there are no relevant ones on db1046, but there may be on the analytics slaves). That may need a reboot.
  • We need to extend the primary key of certain tables to an unsigned long
  • We need to decide on a purging strategy. This is related to T108850 and T124676. Right now, purging is disabled until we catch up with the current events, but something has to be implemented with different properties on masters and slaves, and perhaps with some configurability. Do we continue using the events or do we puppetize a custom script?

Otto has already praised the improved performance of the new setup. Not all of these have to be fixed here; they can be handled on separate tickets, but first we must reach a decision.

Nuria added a comment.Jan 25 2016, 9:45 PM

@jcrespo:

Let's keep the purging conversation on the purging ticket, https://phabricator.wikimedia.org/T108850, but let's address here what remains in order to close this ticket:

With the new application configuration (almost all tables on TokuDB), we may need to tune our InnoDB parameters (especially the buffer pool), but that all depends on the schema used by non-log tables (I think there are no relevant ones on db1046, but there may be on the analytics slaves). That may need a reboot.

We need to extend the primary key of certain tables to an unsigned long

Do we need another scheduled outage to do these two? Please let us know.

Not for the second. Maybe for the first, but it is not yet decided; I am coordinating with Otto.

If we just have to do an easy peasy restart of mysql to apply a config change, I think we do not need to coordinate a downtime. I'll temporarily stop the mysql consumers and we can do a mysql restart.

Backfilling on m4-master done. @jcrespo, feel free to start the slave resync script as soon as you get a chance.

@jcrespo, could you give us a slave resync status update when you get a chance? Danke!

It is still resyncing, but I do not have an ETA for it to finish. I will try to run a script today to approximate one.

All tables except Edit should be synced now. Edit is still pending.

Nuria added a comment.Feb 1 2016, 5:06 PM

@jcrespo: Anything else here? Is the Edit table still syncing?

@Nuria, the Edit table finished syncing during the weekend, but sadly with errors, so I will have to force another resync of just that table. It may only have missed about 1% of the total rows (as a ballpark figure), so fixing that should take much less time.

After that, there are some small tasks pending, but we may want to handle them separately, as the main thing (solving space problems) has already been fixed.

Nuria added a comment.Feb 1 2016, 7:05 PM

Excellent. Let us know when we can close this ticket.

I have just confirmed it is in sync before 24 January (it is still a bit behind in replication, but that is normal). We can close this.

Nuria moved this task from In Progress to Done on the Analytics-Kanban board.Feb 1 2016, 11:52 PM
Nuria closed this task as Resolved.Feb 4 2016, 4:36 PM