makeMailingList.php creates 30GB of data
Closed, Resolved, Public

Description

While investigating a disk space warning on deploy1003, I found that most of the space was taken by the generated files from makeMailingList.php (30 out of the 65GB used).

image.png (613×2 px, 49 KB)

I can add some more space to the disk, as there is still unallocated space on the VolumeGroup:

cgoubert@deploy1003:~$ sudo vgdisplay | grep Free
  Free  PE / Size       15193 / <59.35 GiB

However, I can't allocate all of that, since we want to keep some leeway for emergency resizes. So I wonder how much will be enough to handle the additional output file from deduplicateMailingList.php.

Do we have history on how big the deduped data is?

Event Timeline

Clement_Goubert triaged this task as High priority.

This is now critical, so I'll add 20GB of space to /.

Can we start a discussion on how to rework these scripts to avoid these kinds of issues?

Mentioned in SAL (#wikimedia-operations) [2025-09-10T09:28:10Z] <claime> cgoubert@deploy1003:/home$ sudo lvextend -L +20G /dev/vg0/root && sudo resize2fs /dev/vg0/root - T404060
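For a rough sense of how much leeway that extension leaves in the VolumeGroup, the Free PE figure from the vgdisplay output in the description can be converted by hand. The sketch below is only illustrative: it assumes 4 MiB physical extents (the LVM default, which matches 15193 PE being just under 59.35 GiB, but the "PE Size" line was not shown in the thread) and treats the `+20G` as 20 GiB. The function names are made up for this example.

```python
# Assumption: 4 MiB physical extents (LVM default). The real value is on the
# "PE Size" line of vgdisplay, which is not shown in this task.
PE_SIZE_MIB = 4


def pe_to_gib(extents: int, pe_size_mib: int = PE_SIZE_MIB) -> float:
    """Convert a number of physical extents to GiB."""
    return extents * pe_size_mib / 1024


def free_after_extend(free_extents: int, extend_gib: int,
                      pe_size_mib: int = PE_SIZE_MIB) -> float:
    """Return the free space (GiB) left in the VG after extending an LV."""
    needed_extents = extend_gib * 1024 // pe_size_mib
    if needed_extents > free_extents:
        raise ValueError("not enough free extents for this extension")
    return pe_to_gib(free_extents - needed_extents, pe_size_mib)


free_pe = 15193  # from "Free  PE / Size" in the description
print(f"free before: {pe_to_gib(free_pe):.2f} GiB")               # free before: 59.35 GiB
print(f"free after +20G: {free_after_extend(free_pe, 20):.2f} GiB")  # free after +20G: 39.35 GiB
```

Under those assumptions, about 39 GiB stays unallocated for emergency resizes after the 20G extension.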

I'll shut the jobs down. I'm working with Tim on what happened here.

I have deleted all the ml-* files, some of which were incredibly big. So hopefully that resolves the issue.
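To get a size history for these files in future runs (as asked in the description), something like the following could be run before any cleanup. It is a minimal sketch, not part of the existing scripts: the `ml-*` pattern comes from this task, while the function name and the directory passed to it are placeholders, since the actual output location is not stated here.

```python
from pathlib import Path


def largest_files(directory, pattern="ml-*", top=5):
    """Return (total_bytes, [(size, name), ...]) for files matching pattern.

    Only the top-level of `directory` is scanned; the list is sorted with the
    biggest files first and truncated to `top` entries.
    """
    sizes = [
        (p.stat().st_size, p.name)
        for p in Path(directory).glob(pattern)
        if p.is_file()
    ]
    sizes.sort(reverse=True)
    total = sum(size for size, _ in sizes)
    return total, sizes[:top]


# Hypothetical usage: "." stands in for wherever the ml-* files are written.
total, biggest = largest_files(".")
print(f"total: {total / 1024**3:.2f} GiB")
for size, name in biggest:
    print(f"{size / 1024**3:8.2f} GiB  {name}")
```

Logging that output on the task each time the scripts run would give a baseline for how much disk the generated and deduplicated lists normally need.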

I'm running into a problem cancelling the jobs:

foks@deploy1003:~$ kube_env mw-script-restricted-deploy eqiad
bash: kube_env: command not found

I've reached out to Reuven about it.

Got to the bottom of it. All jobs should have been terminated now. There was an issue with the script that was compiling lists with waaaay too many users due to a missed step in the procedure. My apologies for that.

The election was missing the needs-central-list property, so it was trying to mail all users, not just qualified users. All files generated so far should be deleted. The files should be smaller when the script is run again with the property correctly set.

Fantastic, thank you both. I'm resolving the task.

Following up on this at T404537