Our mirror of CDNjs seems to be out of sync. Specifically, TensorFlow.js was added in April (see https://github.com/cdnjs/cdnjs/search?o=asc&q=tensorflow&s=committer-date&type=Commits) and is available on the main mirror (https://cdnjs.com/libraries/tensorflow), but not on ours. Could there be a problem with the syncing?
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
cdnjs-index: force encoding on writing files | labs/tools/cdnjs-index | master | +3 -2
Event Timeline
So the issue is that memory tends to run out when deserializing JSON to Python objects. I'll test just asking for more memory from gridengine, but it is likely that a pretty big rework is needed. The Python streaming deserialization story is not great. I've done this in Go with little difficulty, but here the libraries are a bit weird and often depend on C libs. The tricky thing is sorting by GitHub stars, since that more or less depends on having all the libraries in memory. It's easy to just dump to files as the stream comes in. It may be a matter of constructing the HTML files during streaming while determining the sort order from a much smaller data structure of stubs, or of storing the data temporarily in something like Redis.
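For illustration, the "stubs" idea could look roughly like the sketch below. It is not the cdnjs-index code: the field names (`name`, `stars`, `versions`), the HTML layout, and both function names are hypothetical, and it assumes some streaming parser hands over one library record at a time as an iterable. Each page is written as soon as its record arrives, and only a small `(stars, name)` tuple is kept for the final sort.

```python
import os
import tempfile


def write_pages_and_collect_stubs(records, out_dir):
    """Write one HTML file per record; return (stars, name) stubs sorted by stars.

    `records` can be any iterable, e.g. a generator fed by a streaming JSON
    parser, so only one full record needs to be in memory at a time.
    """
    stubs = []
    for record in records:
        name = record["name"]  # hypothetical field name
        path = os.path.join(out_dir, name + ".html")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("<h1>{}</h1>\n".format(name))
            for version in record.get("versions", []):
                fh.write("<p>{}</p>\n".format(version))
        # Keep only what the sort needs, not the whole record.
        stubs.append((record.get("stars", 0), name))
    # Most stars first; ties broken alphabetically by name.
    stubs.sort(key=lambda stub: (-stub[0], stub[1]))
    return stubs


def demo():
    """Run the sketch on three made-up records and return the sorted names."""
    def fake_stream():
        yield {"name": "jquery", "stars": 50, "versions": ["3.3.1"]}
        yield {"name": "tensorflow", "stars": 90, "versions": ["0.11.6"]}
        yield {"name": "lodash", "stars": 50, "versions": []}

    with tempfile.TemporaryDirectory() as out_dir:
        stubs = write_pages_and_collect_stubs(fake_stream(), out_dir)
    return [name for _stars, name in stubs]
```

The sorted stubs would then be enough to write the index page in star order, linking to the per-library files already on disk. Memory use still grows with the number of libraries, but only by one small tuple each.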
If I can just ask for 3GB of memory, and mess with the ulimits, we might be fine.
Good news! Asking gridengine for more memory worked: no memory error so far, and the run got past the whole JSON deserialization. However, I now get:
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 14224560: ordinal not in range(128)
Somebody introduced a fancy quote character. That's an easy fix, at least.
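A minimal sketch of this class of fix, assuming the failure comes from writing text through a file handle that defaults to ASCII (as it can under a C/POSIX locale): passing `encoding="utf-8"` to `open()` makes the write independent of the environment. The helper name `write_text` and the sample string are made up for illustration, and the actual patch may differ.

```python
import os
import tempfile

# U+2019 is the right single quotation mark from the error message.
sample = "It\u2019s a library"

# Encoding to ASCII fails in the same way as the traceback above.
try:
    sample.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = str(exc)


def write_text(path, text):
    """Write text with an explicit encoding instead of the locale default."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)


def demo_write():
    """Write the sample to a temporary file and return the raw bytes on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.html")
        write_text(path, sample)
        with open(path, "rb") as fh:
            return fh.read()
```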
Yay! Hm, I wonder if the memory limit will be an occasionally recurring issue as more and more libraries (and their versions) are added to CDNjs over time.
Thank you so much for looking into this and fixing it, @Bstorm! :)
Change 439380 had a related patch set uploaded (by Bstorm; owner: Brooke Storm):
[labs/tools/cdnjs-index@master] cdnjs-index: force encoding on writing files
Change 439380 merged by Bstorm:
[labs/tools/cdnjs-index@master] cdnjs-index: force encoding on writing files