IA-Upload does not take care of disk space when accepting jobs
Open, High, Public, BUG REPORT

Description

List of steps to reproduce and what happens? (step by step, including full links if applicable):

  • Create upload jobs that convert huge books (about five 1 GB books should do the job)
  • The VPS server disk fills up and the website goes down with a 500 error

What should have happened instead?:
When a user attempts to upload a book, IA-Upload should check the available disk space and reject the upload if there is not enough room for it (and/or not enough left over for other users).
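A minimal sketch of such a pre-flight check, written in Python purely for illustration (it is not IA-Upload's actual code). The job queue path is the one mentioned later in this task; the 2 GiB reserve and the function names are assumptions.

```python
import shutil

# Path taken from the discussion in this task; adjust to the real mount point.
JOBQUEUE_DIR = "/var/www/tool/jobqueue"
# Assumed reserve (not from the task): space to keep free for other users' jobs.
RESERVED_BYTES = 2 * 1024**3


def has_room_for(job_bytes, free_bytes, reserved_bytes=RESERVED_BYTES):
    """Return True if a job of job_bytes fits while leaving reserved_bytes free."""
    return free_bytes - job_bytes >= reserved_bytes


def can_accept_job(job_bytes, path=JOBQUEUE_DIR):
    """Check the filesystem holding `path` before accepting a new job."""
    free_bytes = shutil.disk_usage(path).free
    return has_room_for(job_bytes, free_bytes)
```

Because conversion writes intermediate files, `job_bytes` would probably need to be a multiple of the download size rather than the download size itself.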

Event Timeline

Tpt triaged this task as High priority. Oct 3 2021, 12:40 PM
Tpt created this task.

Most of these large books have large numbers of pages, which causes issues other than just running out of disk space.

But anyway, one fix for this could be to move the jobqueue directory to a separate, larger, filesystem. The VPS disk is only 20 GB (and the jobqueue is currently 12 GB); we could ask for a new volume (or rather, ask for more quota in the wikisource VPS project).

I've requested 30 GB extra: T293165

Of course, having more space doesn't mean we shouldn't still fix this problem.

I wonder if we can check the size of the IA item, and reject it early if it's too big?
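A rough sketch of that early check, in Python for illustration only. It assumes the Internet Archive metadata endpoint (`https://archive.org/metadata/<identifier>`) returns JSON with a `files` list whose entries carry a `size` field as a string of bytes; the 1 GiB cap and all function names are hypothetical.

```python
import json
import urllib.request

# Assumed cap (not from the task): reject items larger than 1 GiB in total.
MAX_ITEM_BYTES = 1024**3


def total_size(metadata):
    """Sum the sizes of all files listed in an item's metadata dict."""
    total = 0
    for entry in metadata.get("files", []):
        # Sizes are assumed to arrive as strings; entries without one count as 0.
        total += int(entry.get("size", 0))
    return total


def is_too_big(metadata, limit=MAX_ITEM_BYTES):
    """Return True if the item's listed files exceed the limit."""
    return total_size(metadata) > limit


def fetch_metadata(identifier):
    """Fetch item metadata (assumed endpoint and response shape)."""
    url = "https://archive.org/metadata/" + identifier
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)
```

Summing every file overestimates what the tool would actually download, so a real check would likely filter `files` down to the ones the conversion needs before comparing against the limit.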

Okay, we've got more quota; I'll move the /var/www/tool/jobqueue directory to a new volume now.

New filesystem now in use:

samwilson@ia-upload-prod:/var/www/tool$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G     0  3.9G   0% /dev
tmpfs           799M   81M  718M  11% /run
/dev/sda1        20G  2.4G   17G  13% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs           799M     0  799M   0% /run/user/0
tmpfs           799M     0  799M   0% /run/user/3205
/dev/sdb         30G   12G   17G  40% /ia-upload

And I've tested the job queue operation and all seems fine.

The bug happened again on the new disk with the "exhibitorsherald2*" files. I removed them to bring the tool back up. We should maybe prevent files >1GB from being downloaded and processed.
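One way to enforce such a cap even when the size is not known in advance is to abort the download once it passes the limit. The following is a sketch in Python under that assumption, not IA-Upload's code; `FileTooLarge` and `copy_with_limit` are hypothetical names.

```python
class FileTooLarge(Exception):
    """Raised when a download exceeds the configured size limit."""


def copy_with_limit(src, dst, limit, chunk_size=1024 * 1024):
    """Copy src to dst in chunks, raising FileTooLarge past `limit` bytes.

    Both arguments are binary file-like objects. Returns the bytes copied.
    """
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return copied
        copied += len(chunk)
        if copied > limit:
            # Stop before writing the chunk that crosses the limit.
            raise FileTooLarge("download exceeds %d bytes" % limit)
        dst.write(chunk)
```

The caller would still need to delete the partially written file when `FileTooLarge` is raised, otherwise aborted downloads would keep consuming disk space.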