Create production databases for mailman3
Closed, Resolved, Public

Description

Needed for deploying the new mailman.

Size: 34GB with 4GB growth per year. Estimated from T278609: Import several public mailing lists archives from mailman2 to lists-next to measure database size

Update: Deployed on m5

Event Timeline

jijiki triaged this task as Medium priority. Mar 29 2021, 9:13 PM
Marostegui changed the task status from Open to Stalled. Apr 5 2021, 6:57 AM
Marostegui moved this task from Triage to Blocked on the DBA board.
Marostegui subscribed.

Let's stall this until we are fully sure we want to deploy this in production. I want to see the results on T256538

Yes. We are currently importing several mailing lists, and some numbers can be found at T278609: Import several public mailing lists archives from mailman2 to lists-next to measure database size. The news is hopeful (for example, mailman2 stores nine copies of most emails, so the database clearly wouldn't be 175GB), and I hope to give you some numbers soon™. But we don't have a choice about mailman3: mailman2 is dropped from bullseye, it's a security nightmare (T181803 is one of several issues), it runs on EOL Python, and the encoding issues are a small subset of the many, many issues it has (look at the "mailman2 limitations" column in Wikimedia-Mailing-lists).

So the mbox files total 45GB, and I assume that when we import them the data will mostly get smaller (a 14MB mbox shrunk to 10MB in the database). For the sake of a pessimistic estimate, 45GB will be the upper bound and 35GB a middle-ground estimate.

It'll grow faster with a better mailing list system in general (Jevons paradox), so it's hard to estimate, but if we assume linear growth, the pessimistic growth rate would be 5.3 GB/year and the optimistic growth rate would be 4.11 GB/year. We can also improve the archives and tidy them up a bit (e.g. deleting the wikidata-bugs archives gives us 1.3GB, and I'm sure we can find some other places to clean up as well).
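
As a rough illustration of the size arithmetic above, here is a minimal sketch that scales the 45GB mbox total by the shrink ratio seen in the single sample import (14MB to 10MB). Treating that one ratio as representative of every archive is an assumption made only for illustration, and the function name is hypothetical.

```python
def scaled_estimate_gb(mbox_total_gb: float, sample_mbox_mb: float, sample_db_mb: float) -> float:
    """Scale the total mbox size by the shrink ratio observed in one sample import."""
    return mbox_total_gb * (sample_db_mb / sample_mbox_mb)


# Figures from the comment above: 45GB of mbox files, one 14MB mbox became 10MB.
upper_bound_gb = 45.0
scaled_gb = scaled_estimate_gb(upper_bound_gb, sample_mbox_mb=14, sample_db_mb=10)

print(f"scaled by sample ratio: {scaled_gb:.1f} GB")
print(f"pessimistic upper bound: {upper_bound_gb:.1f} GB")
```

The scaled figure lands a little below the 35GB middle-ground estimate, which in turn sits below the 45GB upper bound.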

If you give me the green light, I can import some rather large mailing lists into the test database to see how it'll look.

That's good news Amir. Thanks for testing it.
Let's go for the large mailing lists import to see how it looks indeed.

So after importing pywikibot-bugs (more details: T278609#6978553), a better estimate is 27GB in size with 3.2GB/year growth (assuming linear growth).

That's a very doable number, thanks @Ladsgroup!

With the wikitech-l imported my last offer is now: 34GB.

This is a pretty decent size and 4GB per year is also ok with the current HW. Thanks for the research
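
For reference, a minimal sketch of the linear projection behind these figures, assuming growth stays at a constant 4GB per year from a 34GB starting point (the thread itself notes that real growth may not be linear). The function name is hypothetical.

```python
def projected_size_gb(initial_gb: float, growth_gb_per_year: float, years: float) -> float:
    """Project database size under a constant (linear) yearly growth rate."""
    return initial_gb + growth_gb_per_year * years


# 34GB initial size and 4GB/year growth, as quoted in this task.
for years in (1, 3, 5):
    print(f"after {years} year(s): {projected_size_gb(34, 4, years):.0f} GB")
```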
Let me know when we can un-stall this task so we can start looking for a misc section and create them (I would prefer to do this once we've deleted the ones used for testing).

Thanks. We will likely bother you in two or three weeks. Most of the work is done.

Hi! Can it be done? We are planning to deploy early next week.

Hey,

I thought this was meant to be done in 2-3 weeks :) (T278614#6992922)
Also, I thought we wanted to delete the temporary databases before proceeding. Keep in mind that we need to also drop+create the users as the current ones have the test string on them.

Well, it was one week ago :D. But it's still partially my fault. My amazing estimation skills strike again.

Hmm, that's a tough one. We are enabling mailman3 on lists1001.wikimedia.org (lists.wikimedia.org), but we are not upgrading all mailing lists at the same time, so it'll take a while. We want mailing list admins to be able to test it beforehand if they want to (of course, not for all mailing lists). Is it okay if the test one stays around for maybe a couple of weeks more (alongside the production one)? Also keep in mind that the production one will take a really long time to actually reach its estimated 30GB (at least a month).

That's ok, but let's not close this task until we have removed the test databases, otherwise we'll forget.
However, I do want the new users to use the new databases, so once we are done with the testing we can drop the databases and the users.

Can you confirm the database names and the user names? I will clone the grants but change the user/pass.

Marostegui changed the task status from Stalled to Open. Apr 21 2021, 6:53 AM

Sure. I will make sure to remind you to delete it.

Yes. mailman3 with user mailman3, and mailman3web with user mailman3web. They should be bound to a different host though: lists1001.wikimedia.org. I think ferm is fine, but let me know if it's not.

oh and this one needs backups but that can happen later.

Change 681575 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] production-m5.sql.erb: Add new mailman3 final users

https://gerrit.wikimedia.org/r/681575

Databases created on m5 (same IP where the test databases are)
Ferm needs updating to be able to reach m5-master:

root@lists1001:~# telnet db1128.eqiad.wmnet 3306
Trying 10.64.0.98...
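
The hanging telnet above is a plain TCP reachability check. A minimal Python sketch of the same check follows; the function name and the timeout value are illustrative assumptions, not part of any existing tooling.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from lists1001, `can_connect("db1128.eqiad.wmnet", 3306)` would be expected to return False until the firewall rules allow the connection.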

Users:

+------------------------------------------------------------------------------------------------------------------------+
| Grants for mailman3web@208.80.154.31                                                                                   |
+------------------------------------------------------------------------------------------------------------------------+
| GRANT USAGE ON *.* TO `mailman3web`@`208.80.154.31` IDENTIFIED BY PASSWORD '*x' |
| GRANT ALL PRIVILEGES ON `mailman3web`.* TO `mailman3web`@`208.80.154.31`                                               |
+------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.001 sec)

+---------------------------------------------------------------------------------------------------------------------+
| Grants for mailman3@208.80.154.31                                                                                   |
+---------------------------------------------------------------------------------------------------------------------+
| GRANT USAGE ON *.* TO `mailman3`@`208.80.154.31` IDENTIFIED BY PASSWORD '*x' |
| GRANT ALL PRIVILEGES ON `mailman3`.* TO `mailman3`@`208.80.154.31`                                                  |
+---------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.001 sec)

root@cumin1001:/home/marostegui# host 208.80.154.31
31.154.80.208.in-addr.arpa domain name pointer lists1001.wikimedia.org.
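
For readers unfamiliar with the grant output above, here is a small sketch that prints SQL statements which would produce equivalent grants. This is not the content of production-m5.sql.erb; the helper name and the password placeholder are hypothetical, and the real passwords are deliberately not shown.

```python
from typing import List


def grant_statements(user: str, database: str, host: str,
                     password_placeholder: str = "<password>") -> List[str]:
    """Build illustrative MariaDB statements for a user limited to one database and host."""
    return [
        f"CREATE USER '{user}'@'{host}' IDENTIFIED BY '{password_placeholder}';",
        f"GRANT ALL PRIVILEGES ON `{database}`.* TO '{user}'@'{host}';",
    ]


# Each database has a same-named user, bound to the IP of lists1001.wikimedia.org.
for name in ("mailman3", "mailman3web"):
    for statement in grant_statements(name, name, "208.80.154.31"):
        print(statement)
```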

@Legoktm passwords need to be added to the private repo. I have left them at:

root@lists1001:/home/legoktm# ls mailman
mailman

Change 681575 merged by Marostegui:

[operations/puppet@production] production-m5.sql.erb: Add new mailman3 final users

https://gerrit.wikimedia.org/r/681575

oh and this one needs backups but that can happen later.

@jcrespo can you handle this? Thank you!

Which section is this in? I cannot find it on the task.

Change 681584 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] mariadb: Add mailman3 and mailman3web to the list of host to be backed up

https://gerrit.wikimedia.org/r/681584

@Ladsgroup please check the patch above. As database backups were already set up for other m5 databases, these only have to be added to the list.

I saw the dbs currently do not contain any tables. Please consider creating the tables compressed, especially those that may store lots of data, to save db and backup space.

Thanks. Looks good to me. I'll let Kunal take a look too.

mailman3 should be small but mailman3web can get rather big. According to some random documentation found on the internet, compression happens automatically once the data gets bigger than a certain threshold: https://django-mysql.readthedocs.io/en/latest/cache.html#compression but this should be double-checked, of course. (btw this project has the best logo: https://pypi.org/project/django-mysql/)

Oh, I meant database-level compression. If there is already application-level compression, we may want to skip the database layer one, to not increase overhead with little benefit, like we do, for example, with External Storage servers. Let's just keep monitoring the storage footprint.

Change 681753 had a related patch set uploaded (by Legoktm; author: Legoktm):

[operations/puppet@production] mariadb: Allow lists1001.wikimedia.org to talk to m5

https://gerrit.wikimedia.org/r/681753

Change 681584 merged by Jcrespo:

[operations/puppet@production] mariadb: Add mailman3 and mailman3web to the list of hosts to be backed up

https://gerrit.wikimedia.org/r/681584

Mentioned in SAL (#wikimedia-operations) [2021-04-21T17:43:00Z] <jynus> deploy grant changes on m5 backup sources (db1117 and db2078) T278614

Backups have been enabled and access seems correct. I saw the dbs are empty right now, but please ping me at some point in the future so I can compare the list of tables backed up and their sizes with the expected ones.
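
A minimal sketch of what such a comparison could look like, assuming the expected and backed-up table sizes are available as plain name-to-bytes mappings. The function name, the tolerance, and the table names in the example are all hypothetical and not taken from the actual backup tooling.

```python
from typing import Dict, List


def compare_tables(expected: Dict[str, int], backed_up: Dict[str, int],
                   tolerance: float = 0.10) -> Dict[str, List[str]]:
    """Report tables missing from the backup, unexpected ones, and size outliers."""
    common = set(expected) & set(backed_up)
    return {
        "missing": sorted(set(expected) - set(backed_up)),
        "unexpected": sorted(set(backed_up) - set(expected)),
        # Flag tables whose backed-up size differs from the expected size
        # by more than the given relative tolerance.
        "size_mismatch": sorted(
            t for t in common
            if abs(backed_up[t] - expected[t]) > tolerance * max(expected[t], 1)
        ),
    }


# Made-up table names and sizes (bytes), purely for illustration.
expected = {"table_a": 1000, "table_b": 5000, "table_c": 200}
backed_up = {"table_a": 1020, "table_b": 2500, "table_d": 50}
print(compare_tables(expected, backed_up))
```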

It slipped my mind that we need to test the new packages first; I filed T280887: Upgrade lists-next to bullseye mailman versions for that.

If we can get that upgrade done this week (or early next week), then I think we can schedule a time to set up production/real mailman3 a day after.

I don't think we're using django-mysql, and that documentation also seems to apply only to caching, not to actual storage.

I'm not familiar with database-level compression, which one of https://mariadb.com/kb/en/optimization-and-tuning-compression/ are you referring to?

Change 681753 merged by Legoktm:

[operations/puppet@production] mariadb: Allow lists1001.wikimedia.org to talk to m5

https://gerrit.wikimedia.org/r/681753

@jcrespo, the tables have been created and have some content in them now if you could take a look.

I assume the size is for the search index.

@Ladsgroup if we are going to keep track of the testing database deletion on T281548: Delete lists-next.wikimedia.org, we can probably ignore T278614#7022985 and close this if you are ready for it.

Ladsgroup assigned this task to Marostegui.

Let's call it done: "Create production databases for mailman3" is clearly done.