Page MenuHomePhabricator

dev-ops - create server and mysql server
Open, Needs TriagePublic3 Estimated Story Points

Description

Event Timeline

notconfusing renamed this task from dev-ops to dev-ops - create server and mysql server.Sep 23 2020, 10:31 PM
notconfusing created this task.
notconfusing set the point value for this task to 3.

@Envlh do you use a dockerize image of mysql 8, or install it from package, or use a general cli-installer?

notconfusing updated the task description. (Show Details)Sep 25 2020, 5:14 PM
Envlh added a comment.Oct 2 2020, 7:10 PM

On Denelezh, I installed MySQL using Oracle's apt repository: https://dev.mysql.com/downloads/repo/apt/

Here is an excerpt of my notes:

wget https://dev.mysql.com/get/mysql-apt-config_0.8.15-1_all.deb
dpkg -i mysql-apt-config_0.8.15-1_all.deb
apt update
apt upgrade
apt install mysql-server

It may conflict with MariaDB if it is already installed.

This is just one way to install MySQL, it is probably perfectly fine to use a Docker image.

Either way, you will have to customize MySQL configuration. The most important is innodb_buffer_pool_size variable for when you will work with a complete dump. To sum up, it's the RAM that MySQL will use to work with index and data in memory. Its usual value is 70-80% of the available RAM on a MySQL dedicated server. It's 10G on Denelezh. You can find MySQL configuration on Denelezh in the file /etc/mysql/my.cnf.

notconfusing updated the task description. (Show Details)Sun, Nov 8, 8:59 PM
notconfusing updated the task description. (Show Details)Sun, Nov 8, 11:41 PM

@Envlh , see T267499 . In order to get more disk space we need to propose a custom size. Do remember the GB sizes of the different tables in denelezh? We're going to need much more since we'll be storing all our dump history as well. Maybe a better way to estimate is from the CSV sizes per week?

Envlh added a comment.EditedSun, Nov 15, 8:42 PM

The copy of Denelezh's MySQL data files is 365G. The denelezh schema is 324G. I'm working on setting-up a local MySQL server to restore it (and carefully, as this is the only backup at the moment). One idea is to export only a subset of it, so it will be easier to transfer it to Humaniki's server and to restore it.

Here are the details for each file on denelezh schema:

12K     country.frm
96K     country.ibd
1.0K    db.opt
12K     dump.frm
96K     dump.ibd
12K     human.frm
640M    human.ibd
12K     human_country.frm
272M    human_country.ibd
12K     human_occupation.frm
2.7G    human_occupation.ibd
12K     human_sitelink.frm
872M    human_sitelink.ibd
96K     kpi#P#dump20010101.ibd
1.9G    kpi#P#dump20170102.ibd
1.9G    kpi#P#dump20170206.ibd
1.9G    kpi#P#dump20170306.ibd
2.0G    kpi#P#dump20170403.ibd
2.0G    kpi#P#dump20170501.ibd
2.1G    kpi#P#dump20170605.ibd
2.2G    kpi#P#dump20170703.ibd
2.3G    kpi#P#dump20170807.ibd
2.3G    kpi#P#dump20170904.ibd
2.4G    kpi#P#dump20171002.ibd
2.4G    kpi#P#dump20171106.ibd
2.2G    kpi#P#dump20171204.ibd
2.4G    kpi#P#dump20180101.ibd
2.4G    kpi#P#dump20180205.ibd
2.4G    kpi#P#dump20180305.ibd
2.5G    kpi#P#dump20180402.ibd
2.6G    kpi#P#dump20180604.ibd
2.7G    kpi#P#dump20180806.ibd
2.8G    kpi#P#dump20180903.ibd
2.8G    kpi#P#dump20181001.ibd
3.0G    kpi#P#dump20181105.ibd
3.0G    kpi#P#dump20181203.ibd
3.0G    kpi#P#dump20190107.ibd
3.0G    kpi#P#dump20190204.ibd
3.1G    kpi#P#dump20190304.ibd
3.2G    kpi#P#dump20190401.ibd
3.2G    kpi#P#dump20190408.ibd
3.3G    kpi#P#dump20190415.ibd
3.3G    kpi#P#dump20190422.ibd
3.2G    kpi#P#dump20190429.ibd
3.3G    kpi#P#dump20190506.ibd
3.3G    kpi#P#dump20190513.ibd
3.3G    kpi#P#dump20190520.ibd
3.3G    kpi#P#dump20190527.ibd
3.3G    kpi#P#dump20190603.ibd
3.3G    kpi#P#dump20190610.ibd
3.4G    kpi#P#dump20190617.ibd
3.5G    kpi#P#dump20190704.ibd
3.8G    kpi#P#dump20190708.ibd
3.5G    kpi#P#dump20190717.ibd
3.5G    kpi#P#dump20190722.ibd
3.5G    kpi#P#dump20190729.ibd
3.5G    kpi#P#dump20190805.ibd
3.5G    kpi#P#dump20190812.ibd
3.6G    kpi#P#dump20190819.ibd
3.6G    kpi#P#dump20190826.ibd
3.6G    kpi#P#dump20190902.ibd
3.5G    kpi#P#dump20190909.ibd
3.6G    kpi#P#dump20190916.ibd
3.6G    kpi#P#dump20190923.ibd
3.6G    kpi#P#dump20190930.ibd
3.6G    kpi#P#dump20191111.ibd
3.7G    kpi#P#dump20191118.ibd
3.7G    kpi#P#dump20191125.ibd
3.7G    kpi#P#dump20191202.ibd
3.7G    kpi#P#dump20191209.ibd
3.7G    kpi#P#dump20191216.ibd
3.7G    kpi#P#dump20200106.ibd
3.7G    kpi#P#dump20200113.ibd
2.2G    kpi#P#dump20200127.ibd
3.8G    kpi#P#dump20200203.ibd
3.9G    kpi#P#dump20200210.ibd
3.9G    kpi#P#dump20200217.ibd
3.9G    kpi#P#dump20200224.ibd
3.9G    kpi#P#dump20200302.ibd
3.9G    kpi#P#dump20200309.ibd
4.0G    kpi#P#dump20200330.ibd
4.0G    kpi#P#dump20200406.ibd
4.0G    kpi#P#dump20200413.ibd
4.0G    kpi#P#dump20200420.ibd
4.0G    kpi#P#dump20200427.ibd
4.0G    kpi#P#dump20200504.ibd
4.2G    kpi#P#dump20200511.ibd
4.1G    kpi#P#dump20200518.ibd
4.1G    kpi#P#dump20200525.ibd
4.2G    kpi#P#dump20200601.ibd
4.2G    kpi#P#dump20200608.ibd
4.2G    kpi#P#dump20200615.ibd
4.2G    kpi#P#dump20200622.ibd
4.2G    kpi#P#dump20200629.ibd
4.3G    kpi#P#dump20200706.ibd
4.3G    kpi#P#dump20200713.ibd
4.3G    kpi#P#dump20200720.ibd
4.3G    kpi#P#dump20200727.ibd
4.4G    kpi#P#dump20200803.ibd
4.4G    kpi#P#dump20200810.ibd
4.4G    kpi#P#dump20200817.ibd
4.4G    kpi#P#dump20200824.ibd
4.4G    kpi#P#dump20200831.ibd
4.4G    kpi#P#dump20200914.ibd
4.5G    kpi#P#dump20200921.ibd
20K     kpi.frm
12K     label.frm
13G     label.ibd
12K     occupation.frm
128K    occupation.ibd
12K     occupation_parent.frm
11M     occupation_parent.ibd
12K     project.frm
176K    project.ibd
12K     year.frm
224K    year.ibd
notconfusing updated the task description. (Show Details)Mon, Nov 16, 1:52 AM
notconfusing updated the task description. (Show Details)

For reference, the WHGI snapshot data.

maximilianklein@whgi:~/snapshot_data$ du -sh .
75G     .
maximilianklein@whgi:~/snapshot_data$ du -sh *
148M    2014-09-17
150M    2014-10-13
167M    2015-07-28
171M    2015-08-03
197M    2015-08-09
197M    2015-08-12
197M    2015-08-16
198M    2015-08-21
197M    2015-08-23
198M    2015-09-06
...
541M    2020-07-28
544M    2020-08-04
549M    2020-08-18
549M    2020-08-25
553M    2020-09-08
553M    2020-09-14
561M    2020-09-22
565M    2020-09-29
574M    2020-10-20
575M    2020-10-27

@Envlh I have made some calculations about how much space we may need in the future here. If you could please check the assumptions, that'd be good.

https://docs.google.com/spreadsheets/d/1jSNl1Ju_94TBBXCi6TojypdjXPdRuwLfBLCITTVS3P8/edit?usp=sharing

The final numbers are:

scenariofuture yearsraw table history keptTB
Maximum5 yearsfull (5 years)3.2
Minimum3 years1 year window1.45
notconfusing updated the task description. (Show Details)Fri, Nov 20, 6:31 PM