Related Objects
- Mentioned Here
- T360440: Setup a new server for pixel
Event Timeline
I think I fixed one of the issues related to this today.
The /srv/clean.sh file on the server had a line break before one of its "&&" operators. The doc had it too:
https://wikitech.wikimedia.org/w/index.php?title=Pixel&diff=2163787&oldid=2159178
This prevented /srv/clean.sh from executing ./pixel.js clean, because the shell choked on the line that started with "&&".
As a result, ./pixel.js clean never ran, so the space it would have reclaimed was never freed.
I suspect the line break was added accidentally at some point.
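For reference, here is a minimal sketch of why the position of the line break matters. The two snippets are stand-ins I made up for illustration, not the real contents of /srv/clean.sh, and check_syntax is a hypothetical helper. It uses bash -n, which parses a script without executing it.

```shell
#!/usr/bin/env bash
# Illustration only: these snippets are stand-ins, not the real /srv/clean.sh.

# Broken form: the line break comes BEFORE "&&", so the second line starts
# with an operator and bash rejects it.
broken='echo "step one"
&& echo "step two"'

# Working form: "&&" ends the first line, so bash keeps reading the rest of
# the command list from the next line.
fixed='echo "step one" &&
echo "step two"'

# check_syntax (hypothetical helper): feed a snippet to "bash -n", which
# only parses its input and exits non-zero on a syntax error.
check_syntax() {
  if printf '%s\n' "$1" | bash -n 2>/dev/null; then
    echo "ok"
  else
    echo "syntax error"
  fi
}

broken_result=$(check_syntax "$broken")
fixed_result=$(check_syntax "$fixed")
echo "broken: $broken_result"
echo "fixed:  $fixed_result"
```

Running bash -n against the cron script after any edit would be a cheap way to catch this kind of thing before the next scheduled run.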
I'm also looking into whether it would be safe to add a call to docker system prune -af to either the cron script or the pixel clean func...
Edit - I did it:
Watching the docker folder's size change live as it runs with this:
while true; do clear && du -h /var/lib/docker --max-depth=1 | sort -hr; sleep 5; done
It's dancing pretty close to 20G during the run...
I'm going to experiment with this locally and look into disposing of Docker image build artifacts sooner, rather than waiting for the cron job.
Edit: Disregard. I need to look into this more. I think I don't have a handle on exactly when things get kicked off...
Another issue is that priority 3 runs every single group, and that takes a couple of hours. It kicks off at 00:27 and is still running when the next run starts at 01:27, so the later run clobbers it:
00:27 - node pixel.js runAll --priority "3" --directory /var/www/html
01:27 - node pixel.js runAll --priority "1" --directory /var/www/html
02:27 - node pixel.js runAll --priority "1" --directory /var/www/html
03:27 - node pixel.js runAll --priority "1" --directory /var/www/html
04:27 - node pixel.js runAll --priority "1" --directory /var/www/html
05:27 - node pixel.js runAll --priority "1" --directory /var/www/html
06:27 - node pixel.js runAll --priority "2" --directory /var/www/html
07:27 - node pixel.js runAll --priority "1" --directory /var/www/html
08:00 - /srv/clean.sh
08:27 - node pixel.js runAll --priority "1" --directory /var/www/html
09:27 - node pixel.js runAll --priority "1" --directory /var/www/html
10:27 - node pixel.js runAll --priority "1" --directory /var/www/html
11:27 - node pixel.js runAll --priority "1" --directory /var/www/html
12:27 - node pixel.js runAll --priority "1" --directory /var/www/html
13:27 - node pixel.js runAll --priority "1" --directory /var/www/html
14:27 - node pixel.js runAll --priority "1" --directory /var/www/html
15:27 - node pixel.js runAll --priority "1" --directory /var/www/html
16:27 - node pixel.js runAll --priority "1" --directory /var/www/html
17:27 - node pixel.js runAll --priority "1" --directory /var/www/html
18:27 - node pixel.js runAll --priority "1" --directory /var/www/html
19:27 - node pixel.js runAll --priority "1" --directory /var/www/html
20:27 - node pixel.js runAll --priority "1" --directory /var/www/html
21:27 - node pixel.js runAll --priority "1" --directory /var/www/html
22:27 - node pixel.js runAll --priority "1" --directory /var/www/html
23:27 - node pixel.js runAll --priority "1" --directory /var/www/html
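One possible way to stop a later run from clobbering a long one is to have each run take a lock and skip if the lock is already held. This is only a sketch of that idea, not something pixel does today. It assumes flock from util-linux is installed on the server, and LOCK_FILE and run_exclusive are hypothetical names. The sleep commands stand in for the real runAll invocations.

```shell
#!/usr/bin/env bash
# Sketch: skip a run when the previous one still holds the lock.
# LOCK_FILE and run_exclusive are hypothetical names, not part of pixel.
# A real cron setup would use one fixed path; mktemp keeps this demo isolated.
LOCK_FILE="${LOCK_FILE:-$(mktemp)}"

run_exclusive() {
  # flock -n gives up immediately (non-zero exit) if the lock is taken,
  # instead of queueing up behind the running job.
  flock -n "$LOCK_FILE" "$@"
}

# Simulate the long priority "3" run holding the lock in the background.
run_exclusive sleep 2 &
sleep 1

# Simulate the next hourly run starting while the first is still going.
if run_exclusive echo "second run started"; then
  overlap_result="ran"
else
  overlap_result="skipped"
fi
echo "overlapping run: $overlap_result"

# Once the first run has finished, the lock is free again.
wait
if run_exclusive true; then
  later_result="ran"
else
  later_result="skipped"
fi
echo "later run: $later_result"

rm -f "$LOCK_FILE"
```

In the crontab this would amount to prefixing each node pixel.js runAll entry with flock -n and a shared lock path. The trade-off is that a skipped priority "1" run is simply lost for that hour, which seems acceptable given that they run hourly anyway.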
I need to figure out why it's taking hours; it probably shouldn't be. Look at the logs.
Double-check that it's intentional that "3" causes every group to be run.
More aggressively remove Docker artifacts on clean
https://github.com/wikimedia/pixel/pull/259
Simplify Dockerfiles
https://github.com/wikimedia/pixel/pull/254
Lossless optimization of PNGs as they appear in "report" and its subdirectories
https://github.com/wikimedia/pixel/pull/252
Closing for now. I may add more notes as further changes are made that could reduce disk usage further.
This should help too: