The download link for the Google News word vectors binary is https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit and, being a Google Drive link, is not suitable for downloading from the CLI. We should host the 3GB binary somewhere public so that we can easily download it on an ORES server without having to rely on our individual connections.
I found a few existing mirrors:
- https://github.com/mmihaltz/word2vec-GoogleNews-vectors (requires git-lfs)
- https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz (can use wget)
Whichever source we use, it would be safest to calculate the SHA-256 of the downloaded file and verify it against a known value from our Makefile.
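As a rough sketch of what that check could look like, the helper below hashes a file in chunks (so the 3GB binary never has to fit in memory) and compares the result with an expected digest. The function names `sha256_of_file` and `verify_download` are hypothetical, and the expected digest is deliberately left as an argument: it would need to be computed once from a trusted copy of the file, since no published value is assumed here.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file's SHA-256 matches `expected_hex` (case-insensitive)."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A Makefile target could call a small script built on this after the download step and fail the build when `verify_download` returns False. A command-line tool such as `sha256sum -c` would be an equivalent alternative if it is available on the server.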