tools-static is throwing space warnings due to cdnjs git repo size
Closed, Resolved · Public

Description

It seems like this is not yet critical, but that won't last long:

tools-static-10
/dev/mapper/vd-cdnjs--disk ext4 139G 119G 14G 91% /srv

tools-static-11
/dev/mapper/vd-cdnjs--disk ext4 139G 119G 14G 91% /srv

Event Timeline

chasemp created this task.

Mentioned in SAL (#wikimedia-cloud) [2017-12-11T17:07:12Z] <bd808> git gc --aggressive on tools-static-11 (T182604)

Mentioned in SAL (#wikimedia-cloud) [2017-12-11T19:32:48Z] <bd808> git gc on tools-static-11; --aggressive was killed by system (T182604)

After a couple rounds of git-gc: /dev/mapper/vd-cdnjs--disk 139G 118G 14G 90% (functionally no change)
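For reference, a rough sketch of the commands involved, plus pack settings that could keep an aggressive repack from being OOM-killed on a small instance (the exact values here are illustrative assumptions, not what was actually run):

cd /srv/cdnjs
# Plain gc repacks with defaults; --aggressive recomputes deltas and needs far more RAM.
git gc
# Capping pack threads and window memory is one way to keep --aggressive within available RAM
# (values are illustrative only):
git -c pack.threads=1 -c pack.windowMemory=256m gc --aggressive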

bd808 renamed this task from tools-static is throwing space warnings to tools-static is throwing space warnings due to cdnjs git repo size. Dec 12 2017, 3:23 AM

The "fix" for this is either to make more disk available or to change how https://tools-static.wmflabs.org/cdnjs/ works. The latter would probably be easy to do by following the pattern used for T110027: Create a fonts CDN for use on Tool Labs.

I overlooked this ticket and tried the gc on tools-static-11, but the process was killed at 20% by the kernel OOM killer.

Why did this take such a sharp turn at the end of June?

Screen Shot 2018-01-30 at 8.43.30 AM.png (476×962 px, 42 KB)

I checked the sizes:

root@tools-static-11:/srv/cdnjs/ajax/libs# du -hd 1 | tee ~/srv-cdnjs-ajax-libs-du-hd-1
[...]
root@tools-static-11:/srv/cdnjs/ajax/libs# sort -h ~/srv-cdnjs-ajax-libs-du-hd-1 | tail
2.0G	./plotly.js
2.2G	./forerunnerdb
3.1G	./hola_player
3.8G	./antd
4.0G	./blackbaud-skyux
4.1G	./mathjax
5.3G	./material-design-icons
11G	./pdf.js
18G	./browser-logos
118G	.
root@tools-static-11:/srv/cdnjs/ajax/libs# df -h /srv
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/vd-cdnjs--disk  139G  124G  8.0G  94% /srv
root@tools-static-11:/srv/cdnjs/ajax/libs# du ../../.git -sh
5.5G	../../.git

The majority of the disk usage accumulates from the smaller libraries, and there isn't a single 'directory to nuke' that would recover a lot of space (and .git is only 5.5G, so gc may not gain much either). Unless we can find a way to de-duplicate (change duplicates to hard links) and compress all the files, I guess T182604#3916214 is the way to go.
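For what it's worth, a rough sketch of the de-duplication idea, assuming a tool like rdfind were installed on the instance (untested here, and a later git pull could rewrite files and undo the links):

# Hypothetical: collapse byte-identical files under the checkout into hard links.
apt-get install rdfind          # assumes the Debian package is available on the instance
rdfind -makehardlinks true -makeresultsfile false /srv/cdnjs/ajax/libs
df -h /srv                      # re-check how much space was actually recovered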

More than half of the directories have a modification date of July 3 2017: P6646 (though I don't see anything special going on in the repo around that date...)

The SAL entries on that day are relevant:

04:26	<bd808>	cdnjs on tools-static-10 is up to date
03:38	<bd808>	cdnjs on tools-static-11 is up to date
02:19	<bd808>	Cleaning up stuck merges for cdnjs clones on tools-static-10 and tools-static-11

I was working on the proxy-to-upstream method; some backwards incompatibilities were found:

Tested a shallow bare clone:

 zhuyifei1999@gfg01  /srv/zhuyifei1999  git clone --bare --depth 1 https://github.com/cdnjs/cdnjs.git
Cloning into bare repository 'cdnjs.git'...
remote: Counting objects: 953009, done.
remote: Compressing objects: 100% (494914/494914), done.
remote: Total 953009 (delta 446410), reused 944437 (delta 445669), pack-reused 0
Receiving objects: 100% (953009/953009), 5.34 GiB | 18.30 MiB/s, done.
Resolving deltas: 100% (446410/446410), done.
Checking connectivity... done.
 zhuyifei1999@gfg01  /srv/zhuyifei1999  du -h cdnjs.git
44K	cdnjs.git/hooks
4.0K	cdnjs.git/refs/tags
4.0K	cdnjs.git/refs/heads
12K	cdnjs.git/refs
5.4G	cdnjs.git/objects/pack
4.0K	cdnjs.git/objects/info
5.4G	cdnjs.git/objects
8.0K	cdnjs.git/info
4.0K	cdnjs.git/branches
5.4G	cdnjs.git

Considering that a shallow bare clone is only 5.4G, is it possible, and acceptable performance-wise, to somehow mount the shallow clone?

I thought we were doing a shallow clone already so that's pretty interesting.

Yes, it's a shallow clone, but not a bare clone nor a no-checkout clone. All the information needed to reconstruct a shallow clone is only 5.4G, but to actually perform the checkout, git likely uncompresses all the objects and duplicates many identical files (instead of creating hard links). This creates the 118G madness in /srv/cdnjs/ajax/libs (see T182604#3931024).

I was wondering whether nginx could performantly serve the files without requiring a full checkout, perhaps using a dynamic FUSE mount or some daemon that can access git objects. That's probably one way to go if we don't proxy to upstream, considering T182604#3931564.
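To illustrate what such a daemon or FUSE layer would have to do, a rough sketch (the repo path and library file are made-up examples) of streaming a single file straight out of a shallow bare clone without any checkout:

# HEAD in a depth-1 bare clone still carries the full tree, so individual blobs can be read directly.
git -C /srv/zhuyifei1999/cdnjs.git cat-file blob HEAD:ajax/libs/jquery/3.2.1/jquery.min.js > /tmp/jquery.min.js
# or, equivalently:
git -C /srv/zhuyifei1999/cdnjs.git show HEAD:ajax/libs/jquery/3.2.1/jquery.min.js | head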

This isn't saying that we shouldn't proxy to upstream; it's just not as smooth as I thought it would be.

Ok, I'm thinking that a mount is likely to decompress a lot of things into memory if it doesn't do so on disk. We'd likely be trading one problem for another, though investigating the option was fun. I've sorted out some patches on the generator scripts so that we won't need a local checkout to generate the front page, using a modified version of @zhuyifei1999's idea with the cdnjs API. I'm just going to test the behavior of the reverse proxy and such, then I'll submit patches and see if we can make it work.
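For context, something along these lines is roughly what the generator can pull from the public cdnjs API instead of walking a local checkout (the field names and the jq filter are assumptions, not the actual generator code):

# List library names and versions from the cdnjs API.
curl -s 'https://api.cdnjs.com/libraries?fields=version' \
  | jq -r '.results[] | "\(.name)\t\(.version)"' | head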

Change 409416 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution

https://gerrit.wikimedia.org/r/409416

Change 409416 merged by Bstorm:
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution

https://gerrit.wikimedia.org/r/409416

Change 409448 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution

https://gerrit.wikimedia.org/r/409448

That version should be much better ^^

Change 409448 merged by Bstorm:
[labs/tools/cdnjs-index@beta] tools-cdnjs-beta: switching to a reverse proxy solution

https://gerrit.wikimedia.org/r/409448

The beta project is up here: https://toolsadmin.wikimedia.org/tools/id/cdnjs-beta

Servers are up to host the JSON file and do the reverse proxying. Those need configuration; a couple of jobs and a webservice later, we can see how well this will work.

That's up as of now for people to take a look at. I admit, it isn't pretty. It is fairly functional, though.

Is there a way to enable quick link copying instead of having users select the URI and then Ctrl-C manually? I'd say even adding an <a> would be an improvement. The fewer mouse moves, the better.

I'm sure there is! Since we have separate files to load the modals, it might not crash a browser either (this did crash browsers until I changed to separate files). I think I ought to prioritize getting the disk freed up first, though. For now, triple-click works anyway, right? That's why they are in separate panes. It would certainly benefit from more iterations and maybe a designer's touch in the future.

Change 413009 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[labs/tools/cdnjs-index@master] Merge branch 'beta' into master for deploy on main project

https://gerrit.wikimedia.org/r/413009

Change 413009 merged by Bstorm:
[labs/tools/cdnjs-index@master] Merge branch 'beta' into master for deploy on main project

https://gerrit.wikimedia.org/r/413009

The beta branch is now the master branch. The new version of the site is up. Moving on with the next steps.

Change 413197 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] tools-static: Change to reverse proxy of cdnjs

https://gerrit.wikimedia.org/r/413197

Change 413197 merged by Bstorm:
[operations/puppet@production] tools-static: Change to reverse proxy of cdnjs

https://gerrit.wikimedia.org/r/413197

tools-static URLs are now using the new server. When we are satisfied that it is working well, I'll remove the old ones and the space issue will be fixed.
In my own testing, all functions work with the new server.
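A spot check along these lines (assuming the proxy's upstream is cdnjs.cloudflare.com, and using an arbitrary library path as an example) is enough to confirm the proxied bytes match upstream:

# Compare the asset served via tools-static with upstream cdnjs.
curl -sI 'https://tools-static.wmflabs.org/cdnjs/ajax/libs/jquery/3.2.1/jquery.min.js'
curl -s 'https://tools-static.wmflabs.org/cdnjs/ajax/libs/jquery/3.2.1/jquery.min.js' | sha256sum
curl -s 'https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js' | sha256sum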

Change 413469 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] tools-static: Remove problematic headers from proxy responses

https://gerrit.wikimedia.org/r/413469

Change 413469 merged by Bstorm:
[operations/puppet@production] tools-static: Remove problematic headers from proxy responses

https://gerrit.wikimedia.org/r/413469

If it all looks good now, I'll start the cleanup and stand up the second new tools-static.

I think I will delete tools-static-10 and 11 at 4pm PST today if there are no objections.

Let's wait until Monday just to avoid upsetting the weekend alerting guardian spirits.

Mentioned in SAL (#wikimedia-cloud) [2018-02-26T21:18:40Z] <bstorm_> Deleted tools-static-10 and tools-static-11 now that they are replaced with the much smaller 12 and 13 https://phabricator.wikimedia.org/T182604

The tools-static-10 and 11 servers are now deleted, replaced by the reverse proxy solution on tools-static-12 and 13, so this task is being closed. If there are any further problems, please open a new task.