Page MenuHomePhabricator

Test InnoDB compression
Closed, ResolvedPublic

Description

  • Check which tables will benefit from compression
  • Check which ones are too small or have non-compressible data
  • Impact on production latency, on SSD and old HDs
  • Check compression metrics (compression failures)
  • Impact on memory/buffer pool
  • Check which block size is the most advantageous

Event Timeline

jcrespo moved this task from Triage to In progress on the DBA board.

Testing on db1073.

db1073, at ~14 UTC, 2016-06-30, before compression:

+--------------------+----------------------------------------------------+--------------------+----------+---------+---------+---------+---------+
| DATABASE           | TABLE                                              | ENGINE             | ROWS     | DATA    | IDX     | TOTAL   | IDXFRAC |
+--------------------+----------------------------------------------------+--------------------+----------+---------+---------+---------+---------+
| enwiki             | revision                                           | InnoDB             | 654.94M  | 112.17G | 132.29G | 244.46G |    1.18 |
| enwiki             | pagelinks                                          | InnoDB             | 1091.45M | 78.50G  | 112.24G | 190.74G |    1.43 |
| enwiki             | templatelinks                                      | InnoDB             | 629.08M  | 48.19G  | 60.22G  | 108.40G |    1.25 |
| enwiki             | externallinks                                      | InnoDB             | 97.47M   | 22.98G  | 37.29G  | 60.26G  |    1.62 |
| enwiki             | text                                               | InnoDB             | 641.19M  | 49.49G  | 0.00G   | 49.49G  |    0.00 |
| enwiki             | categorylinks                                      | InnoDB             | 93.34M   | 18.17G  | 28.40G  | 46.57G  |    1.56 |
| enwiki             | logging                                            | InnoDB             | 69.83M   | 12.44G  | 33.70G  | 46.14G  |    2.71 |
| enwiki             | article_feedback                                   | InnoDB             | 68.60M   | 7.13G   | 26.04G  | 33.17G  |    3.65 |
| enwiki             | watchlist                                          | InnoDB             | 70.68M   | 13.10G  | 19.21G  | 32.31G  |    1.47 |
| enwiki             | archive                                            | InnoDB             | 53.40M   | 11.39G  | 8.12G   | 19.51G  |    0.71 |
| enwiki             | user_properties                                    | InnoDB             | 104.29M  | 7.52G   | 11.70G  | 19.22G  |    1.56 |
| enwiki             | blob_tracking                                      | InnoDB             | 124.22M  | 8.80G   | 7.49G   | 16.29G  |    0.85 |
| enwiki             | imagelinks                                         | InnoDB             | 89.64M   | 6.15G   | 8.88G   | 15.03G  |    1.44 |
| enwiki             | page                                               | InnoDB             | 39.61M   | 7.84G   | 5.09G   | 12.92G  |    0.65 |
| enwiki             | user                                               | InnoDB             | 26.92M   | 7.60G   | 3.22G   | 10.82G  |    0.42 |
| enwiki             | cu_changes                                         | InnoDB             | 15.50M   | 5.79G   | 4.61G   | 10.39G  |    0.80 |
| enwiki             | abuse_filter_log                                   | InnoDB             | 15.55M   | 2.39G   | 4.32G   | 6.70G   |    1.81 |
| enwiki             | updates                                            | InnoDB             | 48.26M   | 2.37G   | 3.94G   | 6.30G   |    1.66 |
| enwiki             | recentchanges                                      | InnoDB             | 8.00M    | 3.03G   | 2.79G   | 5.83G   |    0.92 |
| enwiki             | mathoid                                            | InnoDB             | 0.30M    | 4.98G   | 0.00G   | 4.98G   |    0.00 |
| enwiki             | titlekey                                           | InnoDB             | 36.41M   | 2.52G   | 2.12G   | 4.64G   |    0.84 |
| enwiki             | page_props                                         | InnoDB             | 26.72M   | 2.00G   | 2.53G   | 4.53G   |    1.26 |
| enwiki             | cur                                                | InnoDB             | 3.12M    | 3.55G   | 0.73G   | 4.28G   |    0.20 |
| test               | eltx                                               | InnoDB             | 28.87M   | 2.25G   | 1.77G   | 4.02G   |    0.79 |
| enwiki             | iwlinks                                            | InnoDB             | 16.13M   | 1.54G   | 2.23G   | 3.77G   |    1.45 |
| enwiki             | change_tag                                         | InnoDB             | 19.63M   | 0.95G   | 2.69G   | 3.63G   |    2.84 |
| enwiki             | langlinks                                          | InnoDB             | 25.05M   | 1.78G   | 1.32G   | 3.10G   |    0.74 |
| enwiki             | spoofuser                                          | InnoDB             | 21.69M   | 1.77G   | 1.16G   | 2.93G   |    0.65 |
| enwiki             | click_tracking                                     | InnoDB             | 12.77M   | 1.21G   | 1.58G   | 2.79G   |    1.31 |
| enwiki             | tag_summary                                        | InnoDB             | 13.94M   | 0.75G   | 1.52G   | 2.28G   |    2.02 |
| enwiki             | filearchive                                        | InnoDB             | 2.48M    | 1.50G   | 0.72G   | 2.21G   |    0.48 |
| enwiki             | wbc_entity_usage                                   | InnoDB             | 12.52M   | 0.87G   | 1.01G   | 1.88G   |    1.15 |
| enwiki             | article_feedback_properties                        | InnoDB             | 19.48M   | 1.68G   | 0.00G   | 1.68G   |    0.00 |
| enwiki             | article_feedback_revisions                         | InnoDB             | 33.29M   | 1.29G   | 0.00G   | 1.29G   |    0.00 |
| enwiki             | redirect                                           | InnoDB             | 8.73M    | 0.58G   | 0.51G   | 1.09G   |    0.88 |
| enwiki             | image                                              | InnoDB             | 0.97M    | 0.67G   | 0.34G   | 1.02G   |    0.51 |
| ops                | db32_query_review_history                          | MyISAM             | 0.17M    | 1.00G   | 0.01G   | 1.01G   |    0.01 |
| ops                | db38_query_review_history                          | MyISAM             | 0.22M    | 0.87G   | 0.01G   | 0.88G   |    0.01 |
| enwiki             | math                                               | InnoDB             | 2.79M    | 0.86G   | 0.00G   | 0.86G   |    0.00 |
| enwiki             | _image_new                                         | InnoDB             | 0.91M    | 0.50G   | 0.30G   | 0.80G   |    0.61 |
| enwiki             | user_newtalk                                       | InnoDB             | 6.79M    | 0.41G   | 0.36G   | 0.77G   |    0.87 |
| enwiki             | bv2015_edits                                       | InnoDB             | 22.10M   | 0.70G   | 0.00G   | 0.70G   |    0.00 |
| enwiki             | aft_article_feedback                               | InnoDB             | 0.96M    | 0.18G   | 0.31G   | 0.49G   |    1.70 |
| enwiki             | bv2011_edits                                       | InnoDB             | 14.24M   | 0.45G   | 0.00G   | 0.45G   |    0.00 |
| enwiki             | log_search                                         | InnoDB             | 3.55M    | 0.30G   | 0.13G   | 0.43G   |    0.44 |
| enwiki             | bv2013_edits                                       | MyISAM             | 19.09M   | 0.23G   | 0.18G   | 0.41G   |    0.79 |
| enwiki             | category                                           | InnoDB             | 2.50M    | 0.18G   | 0.22G   | 0.40G   |    1.18 |
| enwiki             | edit_page_tracking                                 | InnoDB             | 2.49M    | 0.18G   | 0.18G   | 0.35G   |    1.01 |
| enwiki             | prefswitch_survey                                  | InnoDB             | 3.54M    | 0.34G   | 0.00G   | 0.34G   |    0.00 |
| enwiki             | bv2009_edits                                       | InnoDB             | 10.17M   | 0.31G   | 0.00G   | 0.31G   |    0.00 |
| enwiki             | ipblocks                                           | InnoDB             | 0.89M    | 0.16G   | 0.15G   | 0.31G   |    0.91 |
| enwiki             | objectcache                                        | InnoDB             | 1.22M    | 0.19G   | 0.11G   | 0.30G   |    0.58 |
| enwiki             | prefstats                                          | InnoDB             | 2.89M    | 0.17G   | 0.12G   | 0.28G   |    0.69 |
| enwiki             | email_capture                                      | InnoDB             | 0.31M    | 0.24G   | 0.03G   | 0.27G   |    0.13 |
| enwiki             | click_tracking_user_properties                     | InnoDB             | 1.39M    | 0.12G   | 0.15G   | 0.26G   |    1.21 |
| enwiki             | securepoll_lists                                   | InnoDB             | 1.90M    | 0.09G   | 0.17G   | 0.26G   |    1.96 |
| enwiki             | article_feedback_pages                             | InnoDB             | 7.48M    | 0.26G   | 0.00G   | 0.26G   |    0.00 |
| enwiki             | filejournal                                        | InnoDB             | 0.83M    | 0.17G   | 0.07G   | 0.23G   |    0.42 |
| enwiki             | pagetriage_page_tags                               | InnoDB             | 1.71M    | 0.15G   | 0.08G   | 0.23G   |    0.56 |
| enwiki             | flaggedrevs                                        | InnoDB             | 0.81M    | 0.09G   | 0.14G   | 0.23G   |    1.58 |
| enwiki             | geo_tags                                           | InnoDB             | 1.75M    | 0.14G   | 0.09G   | 0.23G   |    0.61 |
| enwiki             | blob_orphans                                       | InnoDB             | 6.16M    | 0.21G   | 0.00G   | 0.21G   |    0.00 |
| enwiki             | ep_events                                          | InnoDB             | 0.48M    | 0.15G   | 0.04G   | 0.19G   |    0.27 |
| enwiki             | wikigrok_questions                                 | InnoDB             | 0.30M    | 0.14G   | 0.05G   | 0.18G   |    0.35 |
| enwiki             | flaggedrevs_promote                                | InnoDB             | 0.96M    | 0.14G   | 0.00G   | 0.14G   |    0.00 |
| enwiki             | cu_log                                             | InnoDB             | 0.39M    | 0.05G   | 0.08G   | 0.13G   |    1.48 |
| enwiki             | oldimage                                           | InnoDB             | 0.12M    | 0.09G   | 0.04G   | 0.12G   |    0.42 |
| ops                | db36_query_review_history                          | MyISAM             | 0.06M    | 0.11G   | 0.00G   | 0.11G   |    0.02 |
| enwiki             | aft_article_answer                                 | InnoDB             | 1.58M    | 0.08G   | 0.02G   | 0.11G   |    0.27 |
| enwiki             | msg_resource                                       | InnoDB             | 0.10M    | 0.10G   | 0.00G   | 0.10G   |    0.00 |
| enwiki             | article_feedback_stats                             | InnoDB             | 0.45M    | 0.05G   | 0.03G   | 0.08G   |    0.60 |
| enwiki             | abuse_filter_history                               | InnoDB             | 0.03M    | 0.08G   | 0.00G   | 0.08G   |    0.02 |
| enwiki             | module_deps                                        | InnoDB             | 0.09M    | 0.05G   | 0.00G   | 0.05G   |    0.00 |
| enwiki             | wikilove_log                                       | InnoDB             | 0.09M    | 0.03G   | 0.02G   | 0.05G   |    0.63 |
| enwiki             | flaggedrevs_statistics                             | InnoDB             | 0.27M    | 0.02G   | 0.02G   | 0.04G   |    1.00 |
| enwiki             | aft_article_filter_count                           | InnoDB             | 0.90M    | 0.04G   | 0.00G   | 0.04G   |    0.00 |
| enwiki             | pagetriage_page                                    | InnoDB             | 0.10M    | 0.01G   | 0.02G   | 0.04G   |    2.01 |
| enwiki             | page_restrictions                                  | InnoDB             | 0.11M    | 0.01G   | 0.02G   | 0.03G   |    2.05 |
| enwiki             | moodbar_feedback                                   | InnoDB             | 0.06M    | 0.01G   | 0.02G   | 0.03G   |    1.82 |
| enwiki             | querycachetwo                                      | InnoDB             | 0.10M    | 0.01G   | 0.02G   | 0.03G   |    1.81 |
| enwiki             | pif_edits                                          | InnoDB             | 0.99M    | 0.03G   | 0.00G   | 0.03G   |    0.00 |
| enwiki             | gather_list_item                                   | InnoDB             | 0.10M    | 0.01G   | 0.01G   | 0.02G   |    1.25 |
| enwiki             | securepoll_voters                                  | InnoDB             | 0.04M    | 0.02G   | 0.00G   | 0.02G   |    0.14 |
| enwiki             | querycache                                         | InnoDB             | 0.13M    | 0.01G   | 0.01G   | 0.02G   |    0.65 |
| enwiki             | aft_article_revision_feedback_ratings_rollup       | InnoDB             | 0.42M    | 0.02G   | 0.00G   | 0.02G   |    0.00 |
| enwiki             | aft_article_answer_text                            | InnoDB             | 0.02M    | 0.01G   | 0.00G   | 0.01G   |    0.00 |
| enwiki             | l10n_cache                                         | InnoDB             | 0.07M    | 0.01G   | 0.00G   | 0.01G   |    0.47 |
| enwiki             | protected_titles                                   | InnoDB             | 0.05M    | 0.01G   | 0.00G   | 0.01G   |    0.34 |
| enwiki             | ep_revisions                                       | InnoDB             | 0.00M    | 0.01G   | 0.00G   | 0.01G   |    0.08 |
| enwiki             | pagetriage_log                                     | InnoDB             | 0.07M    | 0.00G   | 0.01G   | 0.01G   |    1.39 |
| enwiki             | securepoll_votes                                   | InnoDB             | 0.01M    | 0.01G   | 0.00G   | 0.01G   |    0.30 |
| enwiki             | hidden                                             | InnoDB             | 0.01M    | 0.00G   | 0.01G   | 0.01G   |    2.12 |
| enwiki             | gather_list                                        | InnoDB             | 0.02M    | 0.00G   | 0.00G   | 0.01G   |    0.66 |
| ops                | db36_query_review                                  | MyISAM             | 0.00M    | 0.01G   | 0.00G   | 0.01G   |    0.00 |
| enwiki             | aft_article_feedback_ratings_rollup                | InnoDB             | 0.17M    | 0.01G   | 0.00G   | 0.01G   |    0.00 |
| enwiki             | moodbar_feedback_response                          | InnoDB             | 0.02M    | 0.00G   | 0.00G   | 0.01G   |    0.28 |
| enwiki             | user_groups                                        | InnoDB             | 0.04M    | 0.00G   | 0.00G   | 0.01G   |    0.75 |
| enwiki             | aft_article_feedback_properties                    | InnoDB             | 0.10M    | 0.01G   | 0.00G   | 0.01G   |    0.00 |
| ops                | db32_query_review                                  | MyISAM             | 0.00M    | 0.00G   | 0.00G   | 0.00G   |    0.00 |
| ops                | db38_query_review                                  | MyISAM             | 0.00M    | 0.00G   | 0.00G   | 0.00G   |    0.00 |
| enwiki             | ep_students                                        | InnoDB             | 0.01M    | 0.00G   | 0.00G   | 0.00G   |    1.48 |
| enwiki             | wikilove_image_log                                 | InnoDB             | 0.02M    | 0.00G   | 0.00G   | 0.00G   |    1.27 |
| enwiki             | ep_users_per_course                                | InnoDB             | 0.01M    | 0.00G   | 0.00G   | 0.00G   |    1.20 |
| enwiki             | ep_articles                                        | InnoDB             | 0.01M    | 0.00G   | 0.00G   | 0.00G   |    0.79 |
| enwiki             | __wmf_checksums                                    | InnoDB             | 0.00M    | 0.00G   | 0.00G   | 0.00G   |    0.23 |
...
+--------------------+----------------------------------------------------+--------------------+----------+---------+---------+---------+---------+
415 rows in set (0.06 sec)

+----------+---------+---------+---------+---------+
| ROWS     | DATA    | IDX     | TOTAL   | IDXFRAC |
+----------+---------+---------+---------+---------+
| 4349.57M | 464.78G | 532.99G | 997.77G |    1.15 |
+----------+---------+---------+---------+---------+
1 row in set (0.06 sec)

Mentioned in SAL [2016-06-30T17:07:58Z] <jynus> stopping slave on db1073 to test InnoDB compression T139055

MariaDB PRODUCTION s1 localhost enwiki > ALTER TABLE revision row_format=compressed key_block_size=8\G
Stage: 1 of 1 'altering table'    100% of stage done
Query OK, 0 rows affected (16 hours 22 min 24.83 sec)
Records: 0  Duplicates: 0  Warnings: 0

Before:

-rw-rw----  1 mysql mysql 267772755968 Jun 30 17:14 revision.ibd

After:

-rw-rw---- 1 mysql mysql 110121451520 Jul  1 11:03 revision.ibd

With key_block=8, space used as been reduced to less than a half (although some of it may be due to defragementation, rather than compression itself- on my tests, compression was a bit worse than a half of the original size).

pagelinks, which is a very denormalized table (but that is another story for another time):

-rw-rw----  1 mysql mysql 207722905600 Jun 30 17:14 pagelinks.ibd
Query OK, 0 rows affected (13 hours 10 min 8.65 sec)
Records: 0  Duplicates: 0  Warnings: 0
-rw-rw---- 1 mysql mysql 77456211968 Jul  2 15:30 pagelinks.ibd

https://grafana.wikimedia.org/dashboard/db/server-board?from=1466812008461&to=1467473855463&var-server=db1073&var-network=eth0&panelId=17&fullscreen

At this rythm, (if performance abides), we would be able to do 3 times more with the same equipment, or triplicate the redundancy, and/or reduce the machine count by 3, reducing also the administration overhead.

@aaron With this achievements, we may be reducing the disk space needed for databases by 50% or more. Do not worry, I will take care of evaluating the performance hit myself (or its boost, if we save IOPS).

However, one of the things that you mentioned on T85266#2283326 worries me, which is the thing you mention about connection reuse and caching.

One of the first applications of compression will be the possibility of having servers with more than 1 shard, specially for roles that have very little QPS and would be otherwise underutilized, such as "recentchanges" (for example, db1051 and db1052 would have 's1', 's2' and 's3' [rc -role] at the same time so usage is efficient). Do you think that shards sharing hosts could lead to issues at the load balancing level or in this use case it would not affect those issues (when the whole shard is not fragmented)?

Please feel free to subscriber from this so I do not spam you for pending work on this.

So the two things to be careful of (both having to do with lag), are:
a) initial lag checks and caching to get new connections
b) accounting for lag when reusing connections via selectDB() to change the default database for queries

This will not work properly with seconds_behind_master.

With pt-heartbeat (assuming multi-source replication can act on a shared heartbeat table), (a) should work fine since the shard is in the SELECT and the cache key uses the master name (which is unique to each shard/LoadBalancer). Also, (b) should work since connection reuse only happens within each LoadBalancer object (which is not even aware that other LoadBalancer objects might share a host). So getting an enwiki (s1) and mediawikiwiki (s3) connection to 'recentchanges' group would just be two separate connection to (possibly) the same DB. This handles lag correctly, though is not super efficient with connection usage though.

pt-heartbeat (assuming multi-source replication can act on a shared heartbeat table)

It does work- I use it for other -non-core- multi source replication monitoring. E.g.: https://tools.wmflabs.org/replag/

Thank you for your comment. I realized before it would not work with regular replication monitoring as it requires special syntax, but I had completely forgotten about it. I need to find some time to read the db code.

I hope I can solve the pt-heartbeat blocker soon, needed for many reasons now.

Please feel free to subscriber from this so I do not spam you for pending work on this.

I meant "remove subscription".

Nemo_bis triaged this task as Medium priority.Jul 12 2016, 10:37 AM
Nemo_bis awarded a token.
Nemo_bis subscribed.
jcrespo raised the priority of this task from Medium to High.Aug 18 2016, 2:22 PM
jcrespo moved this task from In progress to Pending comment on the DBA board.

db2034 was cloned from db1073; it now also uses compression with a reduction to 1/2-1/3 of the original size.

jcrespo mentioned this in Unknown Object (Task).Aug 25 2016, 10:02 AM

db1073 as of now:

+----------+------------------+--------+------------+----------+--------+--------+---------+---------+
| DATABASE | TABLE            | ENGINE | ROW_FORMAT | ROWS     | DATA   | IDX    | TOTAL   | IDXFRAC |
+----------+------------------+--------+------------+----------+--------+--------+---------+---------+
| enwiki   | pagelinks        | InnoDB | Compressed | 1114.02M | 47.97G | 57.99G | 105.96G |    1.21 |
| enwiki   | revision         | InnoDB | Compressed | 679.52M  | 60.32G | 41.44G | 101.76G |    0.69 |
| enwiki   | templatelinks    | InnoDB | Compressed | 600.91M  | 13.07G | 17.73G | 30.80G  |    1.36 |
| enwiki   | externallinks    | InnoDB | Compressed | 94.20M   | 9.84G  | 19.47G | 29.31G  |    1.98 |
| enwiki   | categorylinks    | InnoDB | Compressed | 104.71M  | 12.10G | 14.87G | 26.97G  |    1.23 |
| enwiki   | text             | InnoDB | Compressed | 798.91M  | 26.65G | 0.00G  | 26.65G  |    0.00 |
| enwiki   | watchlist        | InnoDB | Compressed | 80.11M   | 8.22G  | 10.32G | 18.55G  |    1.26 |
| enwiki   | logging          | InnoDB | Compressed | 75.85M   | 5.19G  | 11.22G | 16.41G  |    2.16 |
| enwiki   | archive          | InnoDB | Compressed | 54.42M   | 7.38G  | 2.49G  | 9.87G   |    0.34 |
| enwiki   | page             | InnoDB | Compressed | 47.42M   | 3.69G  | 3.49G  | 7.18G   |    0.95 |
| enwiki   | recentchanges    | InnoDB | Compact    | 9.38M    | 3.67G  | 3.25G  | 6.92G   |    0.88 |
| enwiki   | user_properties  | InnoDB | Compressed | 112.02M  | 2.91G  | 3.88G  | 6.79G   |    1.33 |
| enwiki   | abuse_filter_log | InnoDB | Compact    | 15.93M   | 2.39G  | 4.32G  | 6.70G   |    1.81 |
| enwiki   | user             | InnoDB | Compressed | 29.33M   | 4.15G  | 2.34G  | 6.49G   |    0.56 |
| enwiki   | updates          | InnoDB | Compact    | 48.26M   | 2.37G  | 3.94G  | 6.30G   |    1.66 |
| enwiki   | mathoid          | InnoDB | Compact    | 0.30M    | 4.98G  | 0.00G  | 4.98G   |    0.00 |
| enwiki   | titlekey         | InnoDB | Compact    | 36.41M   | 2.52G  | 2.12G  | 4.64G   |    0.84 |
| enwiki   | page_props       | InnoDB | Compact    | 23.39M   | 2.07G  | 2.53G  | 4.60G   |    1.23 |
| enwiki   | imagelinks       | InnoDB | Compressed | 79.95M   | 1.97G  | 2.59G  | 4.56G   |    1.32 |
| enwiki   | cu_changes       | InnoDB | Compressed | 16.72M   | 2.58G  | 1.79G  | 4.36G   |    0.69 |

As part of T146261 I am now compressing the whole enwiki database in dbstore2001.
I will post back the results here.

dbstore2001 has enwiki now compressed and the total dataset is 490G

root@dbstore2001:/srv/sqldata/enwiki# mysql --skip-ssl information_schema -e "select TABLE_NAME,ROW_FORMAT,ENGINE,TABLE_ROWS,DATA_LENGTH from TABLES where TABLE_SCHEMA='enwiki'"
+--------------------------------+------------+--------+------------+-------------+
| TABLE_NAME                     | ROW_FORMAT | ENGINE | TABLE_ROWS | DATA_LENGTH |
+--------------------------------+------------+--------+------------+-------------+
| __wmf_checksums                | Compressed | InnoDB |       2323 |      794624 |
| abuse_filter                   | Compressed | InnoDB |        718 |      794624 |
| abuse_filter_action            | Compressed | InnoDB |        556 |       32768 |
| abuse_filter_history           | Compressed | InnoDB |      12559 |    45383680 |
| abuse_filter_log               | Compressed | InnoDB |   16820072 |  1104674816 |
| arbcom1_vote                   | Compressed | InnoDB |        220 |      131072 |
| archive                        | Compressed | InnoDB |   55733183 |  8038383616 |
| babel                          | Compressed | InnoDB |       1272 |       40960 |
| betafeatures_user_counts       | Compressed | InnoDB |         19 |        8192 |
| blob_orphans                   | Compressed | InnoDB |    6145085 |   112640000 |
| blob_tracking                  | Compressed | InnoDB |  126633147 |  4569169920 |
| bv2009_edits                   | Compressed | InnoDB |    9888630 |   185556992 |
| bv2011_edits                   | Compressed | InnoDB |   14200214 |   266338304 |
| bv2013_edits                   | Fixed      | MyISAM |          0 |           0 |
| bv2015_edits                   | Compressed | InnoDB |   24819772 |   450887680 |
| category                       | Compressed | InnoDB |    1527095 |    52240384 |
| categorylinks                  | Compressed | InnoDB |  106070311 |  7609966592 |
| change_tag                     | Compact    | InnoDB |   24094063 |  1173356544 |
| click_tracking                 | Compressed | InnoDB |   12893813 |   609222656 |
| click_tracking_events          | Compressed | InnoDB |       2655 |       81920 |
| click_tracking_user_properties | Compressed | InnoDB |    1392725 |    63258624 |
| cu_changes                     | Compressed | InnoDB |   16138980 |  2359558144 |
| cu_log                         | Compressed | InnoDB |     433628 |    27041792 |
| cur                            | Compressed | InnoDB |    1473744 |  1877475328 |
| edit_page_tracking             | Compressed | InnoDB |    2497255 |    97378304 |
| ep_articles                    | Compressed | InnoDB |       7240 |      794624 |
| ep_cas                         | Compressed | InnoDB |         50 |       24576 |
| ep_courses                     | Compressed | InnoDB |        569 |      794624 |
| ep_events                      | Compressed | InnoDB |     482782 |    85286912 |
| ep_instructors                 | Compressed | InnoDB |          0 |        8192 |
| ep_oas                         | Compressed | InnoDB |         60 |       32768 |
| ep_orgs                        | Compressed | InnoDB |        296 |       40960 |
| ep_revisions                   | Compressed | InnoDB |       2570 |     6561792 |
| ep_students                    | Compressed | InnoDB |      12780 |      794624 |
| ep_users_per_course            | Compressed | InnoDB |      14247 |      794624 |
| exarchive                      | Compressed | InnoDB |          8 |        8192 |
| exrevision                     | Compressed | InnoDB |        123 |        8192 |
| externallinks                  | Compressed | InnoDB |   90916558 |  8844738560 |
| filearchive                    | Compressed | InnoDB |    2811250 |   964689920 |
| filejournal                    | Compressed | InnoDB |     837315 |    82681856 |
| flaggedimages                  | Compressed | InnoDB |          0 |        8192 |
| flaggedpage_config             | Compressed | InnoDB |       2683 |       90112 |
| flaggedpage_pending            | Compressed | InnoDB |         10 |        8192 |
| flaggedpages                   | Compressed | InnoDB |       2600 |       73728 |
| flaggedrevs                    | Compressed | InnoDB |     971371 |    42795008 |
| flaggedrevs_promote            | Compressed | InnoDB |    1084051 |    52240384 |
| flaggedrevs_statistics         | Compressed | InnoDB |     319814 |    10797056 |
| flaggedrevs_stats              | Compressed | InnoDB |          2 |        8192 |
| flaggedrevs_stats2             | Compressed | InnoDB |          1 |        8192 |
| flaggedrevs_tracking           | Compressed | InnoDB |       2055 |       65536 |
| flaggedtemplates               | Compressed | InnoDB |          0 |        8192 |
| gather_list                    | Compressed | InnoDB |      25801 |     1843200 |
| gather_list_flag               | Compressed | InnoDB |          0 |        8192 |
| gather_list_item               | Compressed | InnoDB |     113708 |     3432448 |
| geo_tags                       | Compressed | InnoDB |    1782628 |    70090752 |
| global_block_whitelist         | Compressed | InnoDB |          4 |        8192 |
| hidden                         | Compressed | InnoDB |       9509 |     1318912 |
| image                          | Compressed | InnoDB |     802291 |   253755392 |
| imagelinks                     | Compressed | InnoDB |   84624849 |  2506612736 |
| interwiki                      | Compressed | InnoDB |          0 |        8192 |
| ipblocks                       | Compressed | InnoDB |     935644 |    64315392 |
| iwlinks                        | Compressed | InnoDB |   16562654 |   479723520 |
| job                            | Compressed | InnoDB |          0 |        8192 |
| l10n_cache                     | Compressed | InnoDB |      49668 |     4464640 |
| langlinks                      | Compressed | InnoDB |   24122171 |   775421952 |
| linkscc                        | Compressed | InnoDB |          0 |        8192 |
| localisation                   | Compressed | InnoDB |        980 |       90112 |
| localisation_file_hash         | Compressed | InnoDB |        774 |       73728 |
| log_search                     | Compressed | InnoDB |    4795783 |   108494848 |
| logging                        | Compressed | InnoDB |   79763532 |  5774508032 |
| long_run_profiling             | Compressed | InnoDB |       2049 |      163840 |
| mark_as_helpful                | Compressed | InnoDB |       2363 |      122880 |
| math                           | Compressed | InnoDB |    3002757 |   321912832 |
| mathoid                        | Compressed | InnoDB |     227465 |  2281177088 |
| module_deps                    | Compressed | InnoDB |      73661 |    20275200 |
| moodbar_feedback               | Compressed | InnoDB |      63637 |     4988928 |
| moodbar_feedback_response      | Compressed | InnoDB |      15867 |     2367488 |
| msg_resource                   | Compressed | InnoDB |      62106 |    30203904 |
| msg_resource_links             | Compressed | InnoDB |       3524 |      147456 |
| objectcache                    | Compressed | InnoDB |    2048306 |    95944704 |
| old_growth                     | Compressed | InnoDB |          0 |        8192 |
| oldimage                       | Compressed | InnoDB |     113998 |    37011456 |
| ores_classification            | Compressed | InnoDB |    8504772 |   192389120 |
| ores_model                     | Compressed | InnoDB |          8 |        8192 |
| page                           | Compressed | InnoDB |   41512000 |  2999451648 |
| page_assessments               | Compressed | InnoDB |     208114 |     4988928 |
| page_assessments_projects      | Compressed | InnoDB |          9 |        8192 |
| page_props                     | Compressed | InnoDB |   24868477 |   689963008 |
| page_restrictions              | Compressed | InnoDB |     108768 |     3416064 |
| pagelinks                      | Compressed | InnoDB |  948487119 | 32607043584 |
| pagetriage_log                 | Compressed | InnoDB |      51564 |     1843200 |
| pagetriage_page                | Compressed | InnoDB |      98180 |     3940352 |
| pagetriage_page_tags           | Compressed | InnoDB |    1686130 |    38600704 |
| pagetriage_tags                | Compressed | InnoDB |         17 |        8192 |
| pif_edits                      | Compressed | InnoDB |     989448 |    19701760 |
| pr_index                       | Compressed | InnoDB |          0 |        8192 |
| prefstats                      | Compressed | InnoDB |    2864492 |    83828736 |
| prefswitch_survey              | Compressed | InnoDB |    3510115 |   191889408 |
| profiling                      | Fixed      | MEMORY |          0 |           0 |
| protected_titles               | Compressed | InnoDB |      50976 |     3416064 |
| querycache                     | Compressed | InnoDB |     129390 |     4988928 |
| querycache_info                | Compressed | InnoDB |         41 |        8192 |
| querycachetwo                  | Compressed | InnoDB |     119501 |     4464640 |
| recentchanges                  | Compressed | InnoDB |    9785692 |  1773682688 |
| redirect                       | Compressed | InnoDB |    8897134 |   294125568 |
| revision                       | Compressed | InnoDB |  685502459 | 72446640128 |
| securepoll_cookie_match        | Compressed | InnoDB |         73 |        8192 |
| securepoll_elections           | Compressed | InnoDB |         19 |        8192 |
| securepoll_entity              | Compressed | InnoDB |        230 |        8192 |
| securepoll_lists               | Compressed | InnoDB |    2135578 |    48570368 |
| securepoll_msgs                | Compressed | InnoDB |       1049 |      180224 |
| securepoll_options             | Compressed | InnoDB |        196 |        8192 |
| securepoll_properties          | Compressed | InnoDB |        200 |       32768 |
| securepoll_questions           | Compressed | InnoDB |         14 |        8192 |
| securepoll_strike              | Compressed | InnoDB |        280 |       32768 |
| securepoll_voters              | Compressed | InnoDB |      40004 |     9723904 |
| securepoll_votes               | Compressed | InnoDB |       6594 |     3940352 |
| site_identifiers               | Compressed | InnoDB |        590 |       24576 |
| site_stats                     | Compressed | InnoDB |          1 |        8192 |
| sites                          | Compressed | InnoDB |        899 |      122880 |
| spoofuser                      | Compressed | InnoDB |   24326659 |   694681600 |
| tag_summary                    | Compact    | InnoDB |   16185876 |  1037041664 |
| templatelinks                  | Compressed | InnoDB |  627065618 | 16820707328 |
| text                           | Compressed | InnoDB |  773007366 | 29250027520 |
| titlekey                       | Compressed | InnoDB |   34689100 |  1077936128 |
| transcache                     | Compressed | InnoDB |          0 |        8192 |
| transcode                      | Compressed | InnoDB |        816 |       73728 |
| updatelog                      | Compressed | InnoDB |         33 |        8192 |
| updates                        | Compressed | InnoDB |   49281402 |  1327497216 |
| uploadstash                    | Compressed | InnoDB |         15 |        8192 |
| user                           | Compact    | InnoDB |   28244262 |  8720891904 |
| user_former_groups             | Compressed | InnoDB |       5062 |      122880 |
| user_groups                    | Compressed | InnoDB |      49066 |     1318912 |
| user_newtalk                   | Compressed | InnoDB |    7211356 |   196067328 |
| user_properties                | Compact    | InnoDB |  113643343 |  8630796288 |
| valid_tag                      | Compressed | InnoDB |          5 |        8192 |
| vote_log                       | Compressed | InnoDB |        554 |      794624 |
| watchlist                      | Compressed | InnoDB |   76167766 |  5826936832 |
| wbc_entity_usage               | Compressed | InnoDB |   12766856 |   393216000 |
| wikigrok_claims                | Compressed | InnoDB |          0 |        8192 |
| wikigrok_questions             | Compressed | InnoDB |     332432 |    42287104 |
| wikigrok_responses             | Compressed | InnoDB |          0 |        8192 |
| wikilove_image_log             | Compressed | InnoDB |      17734 |      794624 |
| wikilove_log                   | Compressed | InnoDB |      91149 |    15499264 |
+--------------------------------+------------+--------+------------+-------------+

I am compressing S4 now in dbstore2001, without compression it is currently 1.3T
We will see how it decreases.
I am compressing in a asc order, so the smallest tables first.

The compression finished and the dataset of commonswiki went from 1.3T to 467G

root@dbstore2001:/srv# mysql --skip-ssl information_schema -e "select TABLE_NAME,ROW_FORMAT,ENGINE,TABLE_ROWS,DATA_LENGTH from TABLES where TABLE_SCHEMA='commonswiki'"
+--------------------------------+------------+--------+------------+-------------+
| TABLE_NAME                     | ROW_FORMAT | ENGINE | TABLE_ROWS | DATA_LENGTH |
+--------------------------------+------------+--------+------------+-------------+
| __wmf_checksums                | Compressed | InnoDB |      76740 |    14032896 |
| abuse_filter                   | Compressed | InnoDB |        158 |       81920 |
| abuse_filter_action            | Compressed | InnoDB |        190 |        8192 |
| abuse_filter_history           | Compressed | InnoDB |       1303 |      794624 |
| abuse_filter_log               | Compressed | InnoDB |    1604301 |   115752960 |
| archive                        | Compressed | InnoDB |   14298879 |  1859125248 |
| archive_save                   | Compressed | InnoDB |    7840255 |   929038336 |
| babel                          | Compressed | InnoDB |       6276 |      131072 |
| betafeatures_user_counts       | Compressed | InnoDB |         15 |        8192 |
| blob_orphans                   | Compressed | InnoDB |     272004 |     4988928 |
| blob_tracking                  | Compressed | InnoDB |    9598144 |   338690048 |
| blobs                          | Compressed | InnoDB |          0 |        8192 |
| bv2009_edits                   | Compressed | InnoDB |     768662 |    14450688 |
| bv2011_edits                   | Compressed | InnoDB |    1554400 |    28622848 |
| bv2013_edits                   | Compressed | InnoDB |    2902901 |    52764672 |
| bv2015_edits                   | Compressed | InnoDB |    4678042 |    84262912 |
| category                       | Compressed | InnoDB |    5837444 |   250601472 |
| categorylinks                  | Compressed | InnoDB |  216481799 | 20030423040 |
| change_tag                     | Compressed | InnoDB |    3923056 |    97910784 |
| click_tracking                 | Compressed | InnoDB |      12194 |      794624 |
| click_tracking_events          | Compressed | InnoDB |        215 |        8192 |
| click_tracking_user_properties | Compressed | InnoDB |          0 |        8192 |
| cu_changes                     | Compressed | InnoDB |    8954109 |  1539751936 |
| cu_log                         | Compressed | InnoDB |      49737 |     2891776 |
| cur                            | Compressed | InnoDB |     162359 |    56426496 |
| edit_page_tracking             | Compressed | InnoDB |          0 |        8192 |
| externallinks                  | Compressed | InnoDB |  220633086 | 19602604032 |
| filearchive                    | Compressed | InnoDB |    6429340 |  3022520320 |
| filejournal                    | Compressed | InnoDB |   16519427 |  1762656256 |
| geo_tags                       | Compressed | InnoDB |    6560780 |   236445696 |
| global_block_whitelist         | Compressed | InnoDB |          1 |        8192 |
| globalimagelinks               | Compressed | InnoDB |  330100632 | 18202214400 |
| hidden                         | Compressed | InnoDB |        169 |       32768 |
| image                          | Compressed | InnoDB |   12129822 | 38679871488 |
| imagelinks                     | Compressed | InnoDB |  188571970 |  6018301952 |
| interwiki                      | Compressed | InnoDB |          0 |        8192 |
| ipblocks                       | Compressed | InnoDB |      46384 |     3416064 |
| iwlinks                        | Compressed | InnoDB |  153050151 |  3689414656 |
| job                            | Compressed | InnoDB |          0 |        8192 |
| job_explosion_tmp              | Compressed | InnoDB |       9115 |     1843200 |
| l10n_cache                     | Compressed | InnoDB |     136293 |    11821056 |
| langlinks                      | Compressed | InnoDB |   10398570 |   332922880 |
| linkscc                        | Compressed | InnoDB |          0 |        8192 |
| localisation                   | Compressed | InnoDB |          0 |        8192 |
| localisation_file_hash         | Compressed | InnoDB |          0 |        8192 |
| log_search                     | Compressed | InnoDB |    5356475 |   137887744 |
| logging                        | Compressed | InnoDB |  170240345 | 19932905472 |
| logging_pre_1_10               | Compressed | InnoDB |    4110693 |   491782144 |
| mark_as_helpful                | Compressed | InnoDB |          0 |        8192 |
| math                           | Compressed | InnoDB |      26568 |     2891776 |
| mathoid                        | Compressed | InnoDB |        584 |     4464640 |
| module_deps                    | Compressed | InnoDB |      79915 |    28696576 |
| moodbar_feedback               | Compressed | InnoDB |          0 |        8192 |
| moodbar_feedback_response      | Compressed | InnoDB |          0 |        8192 |
| msg_resource                   | Compressed | InnoDB |     140724 |    37019648 |
| msg_resource_links             | Compressed | InnoDB |       2844 |      122880 |
| objectcache                    | Compressed | InnoDB |     455632 |    21848064 |
| oldimage                       | Compressed | InnoDB |    5696147 |  3661103104 |
| page                           | Compressed | InnoDB |   48973904 |  3768049664 |
| page_props                     | Compressed | InnoDB |   10770547 |  1226309632 |
| page_restrictions              | Compressed | InnoDB |      21317 |      794624 |
| pagelinks                      | Compressed | InnoDB |  359876819 |  9688834048 |
| pif_edits                      | Compressed | InnoDB |      88407 |     1843200 |
| pr_index                       | Compressed | InnoDB |          0 |        8192 |
| prefstats                      | Compressed | InnoDB |     188127 |     5529600 |
| prefswitch_survey              | Compressed | InnoDB |      11310 |      794624 |
| profiling                      | Fixed      | MEMORY |          0 |           0 |
| protected_titles               | Compressed | InnoDB |        668 |       65536 |
| querycache                     | Compressed | InnoDB |     157413 |     7610368 |
| querycache_info                | Compressed | InnoDB |         40 |        8192 |
| querycachetwo                  | Compressed | InnoDB |      28830 |     1220608 |
| recentchanges                  | Compressed | InnoDB |   33402702 | 12309241856 |
| redirect                       | Compressed | InnoDB |     803910 |    38068224 |
| revision                       | Compressed | InnoDB |  191186799 | 18990759936 |
| revtag                         | Compressed | InnoDB |     127922 |     3416064 |
| searchindex                    | Dynamic    | MyISAM |          0 |           0 |
| securepoll_cookie_match        | Compressed | InnoDB |          2 |        8192 |
| securepoll_elections           | Compressed | InnoDB |          6 |        8192 |
| securepoll_entity              | Compressed | InnoDB |         60 |        8192 |
| securepoll_lists               | Compressed | InnoDB |     147848 |     3940352 |
| securepoll_msgs                | Compressed | InnoDB |        813 |      139264 |
| securepoll_options             | Compressed | InnoDB |         49 |        8192 |
| securepoll_properties          | Compressed | InnoDB |         58 |        8192 |
| securepoll_questions           | Compressed | InnoDB |          5 |        8192 |
| securepoll_strike              | Compressed | InnoDB |          0 |        8192 |
| securepoll_voters              | Compressed | InnoDB |       1343 |      262144 |
| securepoll_votes               | Compressed | InnoDB |          0 |        8192 |
| site_identifiers               | Compressed | InnoDB |        590 |       24576 |
| site_stats                     | Compressed | InnoDB |          1 |        8192 |
| sites                          | Compressed | InnoDB |        899 |      122880 |
| spoofuser                      | Compressed | InnoDB |    1473862 |    43352064 |
| tag_summary                    | Compact    | InnoDB |    3768980 |   190103552 |
| templatelinks                  | Compressed | InnoDB | 1306034264 | 31287410688 |
| text                           | Compressed | InnoDB |  210226901 |  7079460864 |
| titlekey                       | Compressed | InnoDB |   29836142 |  1171783680 |
| trackbacks                     | Compressed | InnoDB |          0 |        8192 |
| transcache                     | Compressed | InnoDB |          0 |        8192 |
| transcode                      | Compressed | InnoDB |    1584909 |   309280768 |
| translate_groupreviews         | Compressed | InnoDB |       7908 |      794624 |
| translate_groupstats           | Compressed | InnoDB |     224593 |     9224192 |
| translate_messageindex         | Compressed | InnoDB |      10312 |      794624 |
| translate_metadata             | Compressed | InnoDB |        744 |       57344 |
| translate_reviews              | Compressed | InnoDB |      17299 |      794624 |
| translate_sections             | Compressed | InnoDB |       9450 |     1318912 |
| updatelog                      | Compressed | InnoDB |         25 |        8192 |
| updates                        | Compressed | InnoDB |   47211801 |  1234173952 |
| uploadstash                    | Compressed | InnoDB |      10608 |    18440192 |
| user                           | Compressed | InnoDB |    5916465 |   515375104 |
| user_former_groups             | Compressed | InnoDB |       2265 |       65536 |
| user_groups                    | Compressed | InnoDB |       8422 |      180224 |
| user_newtalk                   | Compressed | InnoDB |    3604208 |    81108992 |
| user_properties                | Compact    | InnoDB |   10162702 |   775946240 |
| uw_campaign_conf               | Compressed | InnoDB |       2497 |       90112 |
| uw_campaigns                   | Compressed | InnoDB |        364 |       24576 |
| valid_tag                      | Compressed | InnoDB |          1 |        8192 |
| watchlist                      | Compressed | InnoDB |   59295580 |  2807562240 |
| wbc_entity_usage               | Compressed | InnoDB |   28411409 |   789577728 |
| wikilove_image_log             | Compressed | InnoDB |       1090 |       57344 |
| wikilove_log                   | Compressed | InnoDB |       7795 |     1318912 |
+--------------------------------+------------+--------+------------+-------------+

I am probably going to leave compressing the top3 tables in S3 across all the wikis (revision,pagelinks,templatelinks) in dbstore2002 too.

Right now s1 and s4 (both compressed) + s3 non compressed is 2.5T of total size.

I have tested compression on the top3 tables in S3 (revision, pagelinks, templatelinks) and the dataset has been reduced around 500G in that whole shard.

dbstore2001 now contains S1, S3, S4 (S1 and S4 fully compressed) and it is 2.2T

I am closing this ticket now as the tests were successful.
We have created a parent meta task (T150438) so we can create subtasks for further compression tasks and track it goes across production.