chasemp | Dec 11 2017, 4:34 PM
[Screenshots attached: F12890835: Screen Shot 2018-01-30 at 8.43.30 AM.png (Jan 30 2018, 2:44 PM); F12890822: Screen Shot 2018-01-30 at 8.40.26 AM.png (Jan 30 2018, 2:41 PM); F12890823: Screen Shot 2018-01-30 at 8.40.19 AM.png (Jan 30 2018, 2:41 PM)]
It seems like this is not yet critical but won't last long:
tools-static-10
/dev/mapper/vd-cdnjs--disk ext4 139G 119G 14G 91% /srv
tools-static-11
/dev/mapper/vd-cdnjs--disk ext4 139G 119G 14G 91% /srv
Mentioned in SAL (#wikimedia-cloud) [2017-12-11T17:07:12Z] <bd808> git gc --aggressive on tools-static-11 (T182604)
Mentioned in SAL (#wikimedia-cloud) [2017-12-11T19:32:48Z] <bd808> git gc on tools-static-11; --aggressive was killed by system (T182604)
After a couple rounds of git-gc: /dev/mapper/vd-cdnjs--disk 139G 118G 14G 90% (functionally no change)
The "fix" for this is either to make more disk available or to change how https://tools-static.wmflabs.org/cdnjs/ works. The latter would probably be easy to do by following the pattern used for T110027: Create a fonts CDN for use on Tool Labs.
I overlooked this ticket and tried the gc on tools-static-11, but the process was killed at 20% by the kernel OOM killer.
This was first recorded here on Dec 11 at 91%; after about 50 days it is now at: /dev/mapper/vd-cdnjs--disk ext4 139G 123G 8.3G 94% /srv
More than half of the directories have a modification date of July 3 2017: P6646 (though I can't find anything special going on in the repo on that date...)
The same date is also visible after zooming-in in graphite: https://graphite-labs.wikimedia.org/render/?width=965&height=475&_salt=1517322803.145&target=cactiStyle(tools.tools-static-10.diskspace._srv.byte_free)&target=cactiStyle(tools.tools-static-11.diskspace._srv.byte_free)&lineWidth=5&title=Tools%20Static%20%2Fsrv%20Byte%20Free&from=-213d&until=-210d&yMin=0
I checked the sizes:
root@tools-static-11:/srv/cdnjs/ajax/libs# du -hd 1 | tee ~/srv-cdnjs-ajax-libs-du-hd-1
[...]
root@tools-static-11:/srv/cdnjs/ajax/libs# sort -h ~/srv-cdnjs-ajax-libs-du-hd-1 | tail
2.0G    ./plotly.js
2.2G    ./forerunnerdb
3.1G    ./hola_player
3.8G    ./antd
4.0G    ./blackbaud-skyux
4.1G    ./mathjax
5.3G    ./material-design-icons
11G     ./pdf.js
18G     ./browser-logos
118G    .
root@tools-static-11:/srv/cdnjs/ajax/libs# df -h /srv
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/vd-cdnjs--disk  139G  124G  8.0G  94% /srv
root@tools-static-11:/srv/cdnjs/ajax/libs# du ../../.git -sh
5.5G    ../../.git
The majority of disk usage is accumulated across the many smaller libraries; there isn't a single 'directory-to-nuke' that would gain a lot of disk space (and .git is only 5.5G, so gc may not have much gain either). Unless we can find a way to de-duplicate (replace duplicates with hard links) and compress all the files, I guess T182604#3916214 is the way to go.
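For what it's worth, the de-duplication idea could be sketched as: hash every regular file under the tree and hard-link identical copies together. This is only an illustration of the approach, not something that was run on tools-static (paths and the safety assumption that the cdnjs checkout is read-only static content are mine):

```python
#!/usr/bin/env python3
"""Sketch: replace duplicate regular files under a tree with hard links.

Illustrative only -- try it on a scratch copy first. Hard-linking is only
safe when the files are never modified in place (true for a static CDN tree).
"""
import hashlib
import os
import sys


def file_digest(path, bufsize=1 << 20):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()


def dedupe(root):
    """Hard-link duplicate files under `root`; return bytes saved."""
    seen = {}   # (digest, size) -> canonical path
    saved = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            st = os.lstat(path)
            key = (file_digest(path), st.st_size)
            if key in seen:
                canonical = seen[key]
                if os.lstat(canonical).st_ino != st.st_ino:
                    tmp = path + '.dedupe-tmp'
                    os.link(canonical, tmp)   # new hard link to canonical copy
                    os.replace(tmp, path)     # atomically swap it in
                    saved += st.st_size
            else:
                seen[key] = path
    return saved


if __name__ == '__main__':
    print('bytes saved:', dedupe(sys.argv[1]))
```

The obvious caveats are memory (one dict entry per unique file) and the fact that any later `git pull`/checkout would recreate the duplicates, which is why the proxy approach below ended up being the real fix.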
The SAL entries on that day are relevant:
04:26 <bd808> cdnjs on tools-static-10 is up to date
03:38 <bd808> cdnjs on tools-static-11 is up to date
02:19 <bd808> Cleaning up stuck merges for cdnjs clones on tools-static-10 and tools-static-11
I was working on the proxy-to-upstream method; some backwards incompatibilities were found:
Tested a shallow bare clone:
zhuyifei1999@gfg01 /srv/zhuyifei1999 git clone --bare --depth 1 https://github.com/cdnjs/cdnjs.git
Cloning into bare repository 'cdnjs.git'...
remote: Counting objects: 953009, done.
remote: Compressing objects: 100% (494914/494914), done.
remote: Total 953009 (delta 446410), reused 944437 (delta 445669), pack-reused 0
Receiving objects: 100% (953009/953009), 5.34 GiB | 18.30 MiB/s, done.
Resolving deltas: 100% (446410/446410), done.
Checking connectivity... done.
zhuyifei1999@gfg01 /srv/zhuyifei1999 du -h cdnjs.git
44K     cdnjs.git/hooks
4.0K    cdnjs.git/refs/tags
4.0K    cdnjs.git/refs/heads
12K     cdnjs.git/refs
5.4G    cdnjs.git/objects/pack
4.0K    cdnjs.git/objects/info
5.4G    cdnjs.git/objects
8.0K    cdnjs.git/info
4.0K    cdnjs.git/branches
5.4G    cdnjs.git
Considering that a shallow bare clone is only 5.4G: is it possible, and good performance-wise, to somehow mount the shallow clone?
Apparently, this is a shallow clone: https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/toollabs/manifests/static.pp;88c43843c3d886b90b5756ad7fb6a7ee4d84da7e$35
Just FYI.
Yes, it's a shallow clone, but not a bare clone or a no-checkout clone. All the information needed to reconstruct a shallow clone is only 5.4G, but to actually perform the checkout, git uncompresses all the objects and duplicates many identical files (instead of creating hard links). This creates the 118G madness in /srv/cdnjs/ajax/libs (see T182604#3931024).
I was wondering whether nginx could performantly serve the files without requiring a full checkout, perhaps via a dynamic FUSE mount or a daemon that can access git objects directly. That's probably one way to go if we don't proxy to upstream, considering T182604#3931564.
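The core of the "access git objects directly" idea is that `git cat-file` can read any path straight out of a bare or shallow clone, decompressing only the requested blob rather than the whole tree. A minimal sketch of the lookup a daemon would do (the function name and wiring are mine; only the git invocation itself is the real mechanism):

```python
import subprocess


def read_from_git(repo, path, ref='HEAD'):
    """Return the bytes of `path` at `ref` without a working-tree checkout.

    Runs `git -C <repo> cat-file blob <ref>:<path>`; git decompresses only
    the one object needed, so no 118G checkout is required.
    """
    result = subprocess.run(
        ['git', '-C', repo, 'cat-file', 'blob', f'{ref}:{path}'],
        check=True, capture_output=True)
    return result.stdout
```

A real daemon would keep one long-lived `git cat-file --batch` process instead of forking per request; the per-request fork here is just for clarity.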
This isn't saying that we shouldn't proxy to upstream; it's just not as smooth as I thought it would be.
OK, I'm thinking that a mount is likely to decompress a lot of things into memory if it doesn't do so on disk. We'd likely be trading one problem for another, though investigating the option was fun. I've sorted out some patches to the generator scripts so that we won't need a local checkout to generate the front page, using a modified version of @zhuyifei1999's idea with the cdnjs API. I'm just going to test the behavior of the reverse proxy and so on, then I'll submit patches and see if we can make it work.
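The general shape of generating the front page from the cdnjs API instead of a checkout would look something like the sketch below. This is an illustration, not the actual patch: the `https://api.cdnjs.com/libraries` endpoint and its `results`/`name`/`latest` fields come from the public cdnjs API docs, and the URL-rewriting helper is a hypothetical stand-in for the real generator script.

```python
import json
import urllib.request

API_URL = 'https://api.cdnjs.com/libraries'  # public cdnjs listing API


def fetch_libraries(url=API_URL):
    """Fetch the library listing as parsed JSON from the cdnjs API."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def render_index(listing, base='https://tools-static.wmflabs.org/cdnjs'):
    """Build (name, local_url) pairs for the front page from an API listing.

    The API's `latest` URLs point at cdnjs.cloudflare.com; rewrite them to
    the tools-static reverse proxy so tools keep using local URLs.
    """
    rows = []
    for lib in listing.get('results', []):
        latest = lib.get('latest') or ''
        local = latest.replace(
            'https://cdnjs.cloudflare.com/ajax/libs', base + '/ajax/libs')
        rows.append((lib['name'], local))
    return rows
```

Splitting the fetch from the rendering keeps the generator testable without network access, which matters for a cron-style job on the grid.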
Change 409416 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution
Change 409416 merged by Bstorm:
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution
Change 409448 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution
Change 409448 merged by Bstorm:
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution
The beta project is up here: https://toolsadmin.wikimedia.org/tools/id/cdnjs-beta
Servers are up to host the json file and do the reverse proxying. They still need configuration; a couple of jobs and a webservice later, we can see how well this will work.
That's up as of now for people to take a look at. I admit, it isn't pretty. It is fairly functional, though.
Is there a way to enable quick link copying instead of having users select the URI and Ctrl-C manually? I'd say even adding an <a> element would be an improvement. The fewer mouse moves, the better.
I'm sure there is! Since we have separate files to load the modals, it shouldn't crash a browser either (this did crash browsers until I changed to separate files). I think I ought to prioritize freeing up the disk first, though. For now, triple-click works anyway, right? That's why they are in separate panes. It would certainly benefit from more iterations and maybe a designer's touch in the future.
General plan to fix this at this point:
https://etherpad.wikimedia.org/p/cdnjs-proxy-plan
Change 413009 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[labs/tools/cdnjs-index@master] Merge branch 'beta' into master for deploy on main project
Change 413009 merged by Bstorm:
[labs/tools/cdnjs-index@master] Merge branch 'beta' into master for deploy on main project
The beta branch is now the master branch. The new version of the site is up. Moving on with the next steps.
Change 413197 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] tools-static: Change to reverse proxy of cdnjs
Change 413197 merged by Bstorm:
[operations/puppet@production] tools-static: Change to reverse proxy of cdnjs
tools-static URLs are now using the new server. When we are satisfied with it working well, I'll remove the old ones and the space issue will be fixed.
In my own testing, all functions work with the new server.
Change 413469 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] tools-static: Remove problematic headers from proxy responses
Change 413469 merged by Bstorm:
[operations/puppet@production] tools-static: Remove problematic headers from proxy responses
If it all looks good now, I'll start the cleanup and stand up the second new tools-static.
I am planning to delete tools-static-10 and 11 at 4pm PST today if there are no objections.
Let's wait until Monday just to avoid upsetting the weekend alerting guardian spirits.
Mentioned in SAL (#wikimedia-cloud) [2018-02-26T21:18:40Z] <bstorm_> Deleted tools-static-10 and tools-static-11 now that they are replaced with the much smaller 12 and 13 https://phabricator.wikimedia.org/T182604
The tools-static-10 and 11 servers are now deleted, replaced by a reverse proxy solution on tools-static-12 and 13, so this issue is being closed. If there are any further problems, please open a new task.