
Fix "running out of storage" issue on the pixel server
Closed, Resolved (Public)

Event Timeline

I think I fixed one of the issues related to this today

The /srv/clean.sh file on the server had a line break before one of its "&&" operators. The doc had it too:

https://wikitech.wikimedia.org/w/index.php?title=Pixel&diff=2163787&oldid=2159178

This prevented /srv/clean.sh from executing ./pixel.js clean, because the shell fails on a newline followed by "&&"

As a result, ./pixel.js clean never ran, so the space it would have reclaimed was never freed

I suspect that line break was added accidentally at some point
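
The failure mode described above can be reproduced with a small sketch. The two script bodies below are stand-ins, not the real contents of /srv/clean.sh (which are not shown in this task). `sh -n` only parses a script without running it, so it may be a cheap way to catch this kind of error before cron does.

```shell
#!/bin/sh
# Stand-in scripts (hypothetical; not the real /srv/clean.sh contents).
tmp="$(mktemp -d)"

# Broken: the newline ends the first command, so the next line starts with "&&".
printf 'true\n&& echo cleaned\n' > "$tmp/broken.sh"

# Fixed: "&&" ends the first line, so the shell keeps reading the next line.
printf 'true &&\n  echo cleaned\n' > "$tmp/fixed.sh"

# "sh -n" parses without executing and exits non-zero on a syntax error.
if sh -n "$tmp/broken.sh" 2>/dev/null; then broken_ok=yes; else broken_ok=no; fi
if sh -n "$tmp/fixed.sh" 2>/dev/null; then fixed_ok=yes; else fixed_ok=no; fi

echo "broken parses: $broken_ok, fixed parses: $fixed_ok"
```

Running the same check against the real script (for example `sh -n /srv/clean.sh`) after each edit would flag a misplaced line break without executing any cleanup.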

I also have this PR, which losslessly optimizes PNGs as they are created:

https://github.com/wikimedia/pixel/pull/252

I'm also looking into whether it would be safe to add a call to docker system prune -af to either the cron script or the pixel clean function...

Edit - I did it:

https://github.com/wikimedia/pixel/pull/259

I'm watching the Docker folder's size change live during the run with this:

while true; do clear && du -h /var/lib/docker --max-depth=1 | sort -hr; sleep 5; done

It's dancing pretty close to 20G during the run...

Going to experiment with this locally and look more into perhaps disposing of docker image build artifacts sooner rather than waiting for the cron job.

Edit: Disregard. I need to look into this more. I think I don't have a handle on exactly when things get kicked off...

Another issue is that priority 3 runs every single group, which takes a couple of hours. It kicks off at 00:27 and is still running a couple of hours later, so the run after it clobbers it:

00:27 - node pixel.js runAll --priority "3" --directory /var/www/html
01:27 - node pixel.js runAll --priority "1" --directory /var/www/html
02:27 - node pixel.js runAll --priority "1" --directory /var/www/html
03:27 - node pixel.js runAll --priority "1" --directory /var/www/html
04:27 - node pixel.js runAll --priority "1" --directory /var/www/html
05:27 - node pixel.js runAll --priority "1" --directory /var/www/html
06:27 - node pixel.js runAll --priority "2" --directory /var/www/html
07:27 - node pixel.js runAll --priority "1" --directory /var/www/html
08:00 - /srv/clean.sh
08:27 - node pixel.js runAll --priority "1" --directory /var/www/html
09:27 - node pixel.js runAll --priority "1" --directory /var/www/html
10:27 - node pixel.js runAll --priority "1" --directory /var/www/html
11:27 - node pixel.js runAll --priority "1" --directory /var/www/html
12:27 - node pixel.js runAll --priority "1" --directory /var/www/html
13:27 - node pixel.js runAll --priority "1" --directory /var/www/html
14:27 - node pixel.js runAll --priority "1" --directory /var/www/html
15:27 - node pixel.js runAll --priority "1" --directory /var/www/html
16:27 - node pixel.js runAll --priority "1" --directory /var/www/html
17:27 - node pixel.js runAll --priority "1" --directory /var/www/html
18:27 - node pixel.js runAll --priority "1" --directory /var/www/html
19:27 - node pixel.js runAll --priority "1" --directory /var/www/html
20:27 - node pixel.js runAll --priority "1" --directory /var/www/html
21:27 - node pixel.js runAll --priority "1" --directory /var/www/html
22:27 - node pixel.js runAll --priority "1" --directory /var/www/html
23:27 - node pixel.js runAll --priority "1" --directory /var/www/html

I need to figure out why it's taking hours. It probably shouldn't. I should look at the logs

I should also double-check that it's intentional that priority "3" causes every group to be run
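
One possible way to keep an hourly run from clobbering a still-running priority 3 run is to guard each cron entry with a lock. The sketch below is only an illustration under assumptions: `run_exclusive`, `LOCK_DIR`, and the lock path are hypothetical names, not part of pixel, and skipping the later run is just one policy among several. It uses `mkdir` as the lock because directory creation is atomic. On Linux, `flock -n` from util-linux is a common alternative.

```shell
#!/bin/sh
# Hypothetical wrapper: refuse to start a run while another one holds the lock.
LOCK_DIR="${LOCK_DIR:-/tmp/pixel-runall.lock}"  # assumed path, not from the task

run_exclusive() {
  # mkdir is atomic: it fails if the lock directory already exists.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "previous run still active, skipping: $*" >&2
    return 75  # arbitrary non-zero status meaning "skipped"
  fi
  "$@"
  status=$?
  rmdir "$LOCK_DIR"
  return "$status"
}

# Example cron usage (assumes this file is sourced by the cron script):
# run_exclusive node pixel.js runAll --priority "1" --directory /var/www/html
```

A caveat with this approach: if a run is killed before `rmdir` executes, the lock directory stays behind and has to be removed by hand, so `flock` may be the safer choice where it is available.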

This comment was removed by Mhurd.
Mhurd renamed this task from Fix issue "running out of storage" on the pixel server issue to Fix "running out of storage" issue on the pixel server. Mar 29 2024, 6:45 AM

More aggressively remove Docker artifacts on clean
https://github.com/wikimedia/pixel/pull/259

Simplify Dockerfiles
https://github.com/wikimedia/pixel/pull/254

Lossless PNGs optimization upon their appearance in "report" and its subdirectories
https://github.com/wikimedia/pixel/pull/252

Closing for now. I may add more notes as further changes are made that reduce disk usage.

This should help too:

https://phabricator.wikimedia.org/T360440