
Find somewhere else (not NFS) to store Quarry's resultsets
Open, Needs Triage · Public

Description

Right now they are stored as SQLite database files on NFS under /data/project. The total usage is 112G:

zhuyifei1999@quarry-main-01:/data/project/quarry$ du -sh results
112G	results

Event Timeline

Restricted Application added a subscriber: Aklapper. · Oct 18 2017, 5:59 PM

The query runners must store the results somewhere the web server can access. Celery does not support sending large tables as the result of a job; doing so would flood the Redis server permanently, since query results are expected to be kept indefinitely.
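To make the constraint concrete, here is a minimal sketch of a runner task that writes its resultset to a SQLite file on shared storage and returns only the path, keeping the large table out of the Redis result backend. The broker URL, results root, and table schema are assumptions for illustration:

import os
import sqlite3

from celery import Celery

app = Celery("quarry", broker="redis://localhost:6379/0")  # assumed broker

RESULTS_ROOT = "/data/project/quarry/results"  # current NFS layout

@app.task
def run_query(user_id, query_id, run_id, rows):
    # Write the rows to a per-run SQLite file instead of returning them.
    run_dir = os.path.join(RESULTS_ROOT, str(user_id), str(query_id))
    os.makedirs(run_dir, exist_ok=True)
    db_file = os.path.join(run_dir, "%d.sqlite" % run_id)
    conn = sqlite3.connect(db_file)
    try:
        conn.execute("CREATE TABLE resultset (col1, col2)")  # schema is illustrative
        conn.executemany("INSERT INTO resultset VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    # Only this short path string ends up in Redis.
    return db_file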

Halfak added a subscriber: Halfak. · Mar 15 2018, 8:42 PM

We've confirmed that there is a bigdisk instance available with 300GB of direct access disk space.

We'll need a backup strategy. Here are a couple of options that seem sane-ish:

Two big disks: Start up two bigdisk instances and ensure that they are on separate physical machines. Back up one to the other with a regular rsync (see the sketch after this list).

Backup to NFS: Primary writes go to a bigdisk instance, and a script periodically backs up new data to NFS.
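For the two-bigdisk option, the periodic copy could be as simple as the following sketch, run from cron; the host name and paths are assumptions:

import subprocess

SRC = "/srv/quarry/results/"  # assumed path on the primary bigdisk
DEST = "quarry-bigdisk-02:/srv/quarry/results/"  # assumed second instance

# Mirror new and changed resultsets to the second instance over SSH.
subprocess.run(["rsync", "-a", "--partial", SRC, DEST], check=True)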

Any suggestions on how to send results back from the query runners to the bigdisk?

Good Q. Are there basic file services that we could utilize for this?

Maybe something like this:
https://medium.com/@keagileageek/paramiko-how-to-ssh-and-file-transfers-with-python-75766179de73

Seems like SFTP was made for this. What's the basic storage format now? I know at one point yuvipanda was considering using SQLite. I wonder if we could simply host the SQLite database and write to it remotely -- from the Celery workers.

Good Q. Are there basic file services that we could utilize for this?
Maybe something like this:
https://medium.com/@keagileageek/paramiko-how-to-ssh-and-file-transfers-with-python-75766179de73
Seems like SFTP was made for this.

Yes, but SFTP is, well, slow; scp may be better. For either to work, one might have to mess with cloud SSH key authentication, and I'm not sure I'm prepared for that.
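For reference, the paramiko transfer from the linked article would look roughly like this; the host name, user, key path, and file paths are made up for the sketch:

import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # sketch only; verify host keys for real
client.connect("quarry-bigdisk-01", username="quarry",
               key_filename="/etc/quarry/id_rsa")  # assumed key
sftp = client.open_sftp()
sftp.put("/tmp/123456.sqlite",
         "/srv/quarry/results/1/1/123456.sqlite")
sftp.close()
client.close()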

What's the basic storage format now? I know at one point yuvipanda was considering using SQLite.

zhuyifei1999@quarry-runner-01:/data/project/quarry/results$ tree | head -n 20
.
├── 1
│   ├── 1
│   │   └── 3.sqlite
│   ├── 109
│   │   └── 293.sqlite
│   ├── 111
│   │   └── 297.sqlite
│   ├── 113
│   │   └── 308.sqlite
│   ├── 116
│   │   └── 320.sqlite
│   ├── 13300
│   │   └── 119301.sqlite
│   ├── 13587
│   │   ├── 121044.sqlite
│   │   ├── 121050.sqlite
│   │   └── 121052.sqlite
│   ├── 13952
│   │   ├── 124745.sqlite

I wonder if we could simply host the SQLite database and write to it remotely -- from the Celery workers.

I don't know how that would work. Does SQLite have a remote-access protocol?

Another thing: gotta make sure whatever method we use to store the results remotely is faster than NFS or we'll do a lot of work and end up worse...

I think it'll be hard to beat NFS at that game ;)

Talked to @yuvipanda and he suggested we look into https://docs.min.io

It uses the same storage interface as S3.
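Since MinIO speaks the S3 API, the workers could push resultsets with any S3 client. A minimal sketch with boto3; the endpoint, bucket, and credentials are placeholders:

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://quarry-minio-01:9000",  # assumed MinIO endpoint
    aws_access_key_id="quarry",                  # placeholder credentials
    aws_secret_access_key="changeme",
)
# Upload one finished resultset as an object.
s3.upload_file("/tmp/123456.sqlite", "quarry-results", "1/1/123456.sqlite")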

Another alternative is to use Swift; see https://wiki.openstack.org/wiki/Swift. I hear that @godog is working on Swift, so he might be able to advise.

Generally, we want horizontal scalability so that we can add servers as the disk fills up. We also want a good backup scheme. This backup scheme could implement high availability, but that's not critical.

It sounds like MinIO supports horizontal scalability, high availability, and built-in redundancy. It's not clear to me exactly what Swift supports because the docs are ... sparse.

Just talked to @Krenair for a bit about Swift and he asked me a good question: Why are we looking to get off of NFS again? Just how slow is it? Are we sure it is a performance bottleneck? How big of a bottleneck?

We should add this to the task description.

It is a bottleneck only when people attempt to push our limits. T188564#4778741 has some graphs (the Wait I/O is D-sleep, i.e. NFS lag). I'm currently unable to reproduce NFS lag as severe as in those graphs (maybe the situation with NFS has improved?).

Krenair added a comment (edited). · May 20 2019, 12:23 AM

Is the NFS server you're using being shared with other projects e.g. tools?
If you're simply treating the SQLite DB files as objects, downloading them to the local instance and then processing locally, you'll probably be okay for smaller datasets. If you're actually trying to run databases on top of NFS, then things are not likely to be particularly fast.
Bear in mind Swift has a default 5GB object size limit. I don't think any existing Wikimedia deployment of it changes that.

Is the NFS server you're using being shared with other projects e.g. tools?

The mount is nfs-tools-project.svc.eqiad.wmnet:/project/quarry/project; the directory is not shared, but I would assume the filesystem backing it on the NFS server is.

If you're simply treating the SQLite DB files as objects, downloading them to the local instance and then processing locally, you'll probably be okay for smaller datasets. If you're actually trying to run databases on top of NFS, then things are not likely to be particularly fast.

Good point. Might need some fork magic to make sure cleanups are done properly, though.
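A rough sketch of that download-and-clean-up flow; fetch_resultset is a hypothetical stand-in for whatever object download (S3/Swift/SFTP) we settle on:

import os
import sqlite3
import tempfile

def fetch_resultset(run_id, dest_path):
    """Hypothetical placeholder for the object-store download."""
    raise NotImplementedError

def read_rows(run_id):
    # Download the SQLite file as an object, query it locally, and
    # remove the temporary copy even if the query fails.
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    try:
        fetch_resultset(run_id, path)
        conn = sqlite3.connect(path)
        try:
            return conn.execute("SELECT * FROM resultset").fetchall()
        finally:
            conn.close()
    finally:
        os.remove(path)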

Bear in mind Swift has a default 5GB object size limit. I don't think any existing Wikimedia deployment of it changes that.

That is fine. The largest database belongs to https://quarry.wmflabs.org/query/34406, at 1.8 GiB:

zhuyifei1999@quarry-db-01:/data/project/quarry/results$ find . -type f -printf "%s\t%p\n" | sort -nr | head
1882435584	./3159/34406/351222.sqlite
1876454400	./613/14029/133261.sqlite
1405534208	./588/3710/23282.sqlite
1389223936	./1717/14679/133339.sqlite
1257127936	./613/15791/147128.sqlite
1257125888	./613/15791/147059.sqlite
1256070144	./613/15791/146622.sqlite
1251917824	./613/15791/146564.sqlite
1251917824	./613/15791/146516.sqlite
1251917824	./613/15791/146494.sqlite
zhuyifei1999@quarry-db-01:/data/project/quarry/results$ ls -lh ./3159/34406/
total 2.3G
-rw-r--r-- 1 498 498 1.8G Mar 16 12:15 351222.sqlite
-rw-r--r-- 1 498 498 470M Mar 16 12:49 351255.sqlite
-rw-r--r-- 1 498 498 9.6K Mar 16 12:49 351255.sqlite-journal