Paste P12222

revision table on wikidata has gone from 244.13 Gigabytes to 140.36 Gigabytes

Authored by jcrespo on Aug 12 2020, 12:31 PM.
root@db1080.eqiad.wmnet[dbbackups]> SELECT file_path, file_name, size, file_date FROM backup_files WHERE backup_id=7238 ORDER BY size DESC LIMIT 20;
+--------------+-----------------------------+--------------+---------------------+
| file_path    | file_name                   | size         | file_date           |
+--------------+-----------------------------+--------------+---------------------+
| wikidatawiki | wbt_item_terms.ibd          | 174357217280 | 2020-08-12 05:32:01 |
| wikidatawiki | revision.ibd                | 150709731328 | 2020-08-12 05:30:16 |
| wikidatawiki | pagelinks.ibd               | 148365115392 | 2020-08-12 05:29:55 |
| wikidatawiki | revision_actor_temp.ibd     | 113984405504 | 2020-08-12 05:25:03 |
|              | ibdata1                     |  71565312000 | 2020-08-12 05:32:26 |
| wikidatawiki | content.ibd                 |  71085064192 | 2020-08-12 05:11:50 |
| wikidatawiki | text.ibd                    |  53397684224 | 2020-08-12 04:49:37 |
| wikidatawiki | slots.ibd                   |  51988398080 | 2020-08-12 04:48:29 |
| wikidatawiki | revision_comment_temp.ibd   |  42869981184 | 2020-08-12 04:51:48 |
| wikidatawiki | comment.ibd                 |  32518438912 | 2020-08-12 04:31:27 |
| wikidatawiki | change_tag.ibd              |  31042043904 | 2020-08-12 04:29:23 |
| wikidatawiki | wbt_text_in_lang.ibd        |  25199378432 | 2020-08-12 04:35:28 |
| wikidatawiki | externallinks.ibd           |  23890755584 | 2020-08-12 04:22:08 |
| wikidatawiki | wbt_term_in_lang.ibd        |  23647485952 | 2020-08-12 04:22:52 |
| wikidatawiki | page_props.ibd              |  23467130880 | 2020-08-12 04:18:12 |
| wikidatawiki | cu_changes.ibd              |  22498246656 | 2020-08-12 04:16:20 |
| wikidatawiki | wbt_text.ibd                |  16420700160 | 2020-08-12 04:07:52 |
| wikidatawiki | page.ibd                    |  11672748032 | 2020-08-12 04:00:39 |
| wikidatawiki | wb_changes_subscription.ibd |  11047796736 | 2020-08-12 04:02:16 |
| wikidatawiki | watchlist.ibd               |   9097445376 | 2020-08-12 04:13:33 |
+--------------+-----------------------------+--------------+---------------------+
20 rows in set (0.002 sec)
root@db1080.eqiad.wmnet[dbbackups]> SELECT file_path, file_name, size, file_date FROM backup_files WHERE backup_id=7180 ORDER BY size DESC LIMIT 20;
+--------------+-----------------------------+--------------+---------------------+
| file_path    | file_name                   | size         | file_date           |
+--------------+-----------------------------+--------------+---------------------+
| wikidatawiki | revision.ibd                | 262127222784 | 2020-08-09 21:04:48 |
| wikidatawiki | wbt_item_terms.ibd          | 174336245760 | 2020-08-09 21:04:48 |
| wikidatawiki | pagelinks.ibd               | 148365115392 | 2020-08-09 21:04:48 |
| wikidatawiki | revision_actor_temp.ibd     | 113929879552 | 2020-08-09 21:04:48 |
|              | ibdata1                     |  71565312000 | 2020-08-09 21:04:58 |
| wikidatawiki | content.ibd                 |  71047315456 | 2020-08-09 21:04:45 |
| wikidatawiki | text.ibd                    |  53397684224 | 2020-08-09 21:04:42 |
| wikidatawiki | slots.ibd                   |  51963232256 | 2020-08-09 21:04:48 |
| wikidatawiki | revision_comment_temp.ibd   |  42849009664 | 2020-08-09 21:04:46 |
| wikidatawiki | comment.ibd                 |  32493273088 | 2020-08-09 21:04:48 |
| wikidatawiki | change_tag.ibd              |  31021072384 | 2020-08-09 21:04:47 |
| wikidatawiki | wbt_text_in_lang.ibd        |  25190989824 | 2020-08-09 21:04:45 |
| wikidatawiki | externallinks.ibd           |  23882366976 | 2020-08-09 21:04:48 |
| wikidatawiki | wbt_term_in_lang.ibd        |  23643291648 | 2020-08-09 21:04:48 |
| wikidatawiki | page_props.ibd              |  23467130880 | 2020-08-09 21:04:48 |
| wikidatawiki | cu_changes.ibd              |  22498246656 | 2020-08-09 21:04:48 |
| wikidatawiki | wbt_text.ibd                |  16412311552 | 2020-08-09 21:04:48 |
| wikidatawiki | page.ibd                    |  11672748032 | 2020-08-09 21:04:48 |
| wikidatawiki | wb_changes_subscription.ibd |  11047796736 | 2020-08-09 21:04:48 |
| wikidatawiki | watchlist.ibd               |   9093251072 | 2020-08-09 21:04:45 |
+--------------+-----------------------------+--------------+---------------------+
20 rows in set (0.002 sec)
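
The figures in the paste title can be checked against the revision.ibd rows of the two listings above. The following is only a minimal sketch in Python, assuming that "Gigabytes" in the title means binary units (2^30 bytes per unit); the names GIB, to_gib, size_before and size_after are illustrative and are not part of the dbbackups tooling.

```python
# Bytes per binary gigabyte (GiB). Assumption: the title uses binary units.
GIB = 2 ** 30

# revision.ibd sizes in bytes, copied from the two query results.
size_before = 262127222784  # backup_id=7180, file_date 2020-08-09
size_after = 150709731328   # backup_id=7238, file_date 2020-08-12


def to_gib(n_bytes):
    """Convert a byte count to binary gigabytes."""
    return n_bytes / GIB


saved_bytes = size_before - size_after
saved_fraction = saved_bytes / size_before

print(f"before: {to_gib(size_before):.3f} GiB")
print(f"after:  {to_gib(size_after):.3f} GiB")
print(f"saved:  {to_gib(saved_bytes):.3f} GiB ({saved_fraction:.1%} of the old file)")
```

Under that assumption the two values match the title after rounding to two decimals. With decimal units (10^9 bytes) they would not, which suggests the title was computed with binary units.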

Event Timeline

What caused this drop? Or (because it was certainly the other way around) what wasted so much space before? Did the entity serialization change so that it no longer serializes certain things? Did you remove millions of labels that can be auto-generated?