Blazegraph (the application that serves WDQS) stores all its data in a single JNL file. The WDQS JNL file is very large (~1.2 TB), so moving it on and off the hosts tends to be difficult (see T344732 and this blog post).
We've had to do this more than once, and my general rule is that if you have to do something more than twice, you need to automate it.
Creating this ticket to:
- Document the process of extracting a JNL file from a wdqs host
- Solicit feedback from co-workers/community members, and make a decision on whether to automate this process. Note that this does not mean we'll run this process on a schedule, like we do for the TTL dumps; just that we'll have a ready-made script that starts with a JNL file on a wdqs server and ends with a JNL file in a place where it can be publicly downloaded.
- There is a separate discussion on whether or not to include the JNL files in our regular dumps; see T344905 for more details.
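
To make the automation question more concrete, here is a minimal sketch of what one piece of such a script might look like: copying a large file while verifying its checksum, without loading it into memory. This is not the documented process. The function names (`sha256_of`, `publish_copy`), the chunk size, and the `.sha256` sidecar file are all assumptions for illustration, and the sketch assumes the JNL file is not being written to while it is copied.

```python
import hashlib
import shutil
from pathlib import Path

# Assumed chunk size: read 64 MiB at a time so a ~1.2 TB file never has to fit in memory.
CHUNK_SIZE = 64 * 1024 * 1024


def sha256_of(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def publish_copy(source: Path, dest_dir: Path) -> Path:
    """Copy `source` into `dest_dir`, verify the copy, and write a checksum sidecar.

    Hypothetical helper: a real script would copy between hosts rather than
    between local directories, but the verify-after-copy idea is the same.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / source.name

    expected = sha256_of(source)
    shutil.copyfile(source, dest)
    actual = sha256_of(dest)

    if actual != expected:
        # Remove the bad copy so nobody downloads a corrupt file.
        dest.unlink()
        raise IOError(f"checksum mismatch for {dest}: {actual} != {expected}")

    # Sidecar uses the "<digest>  <filename>" layout that `sha256sum -c` reads.
    sidecar = dest.with_name(dest.name + ".sha256")
    sidecar.write_text(f"{expected}  {dest.name}\n")
    return dest
```

Publishing the checksum next to the file would let downloaders confirm that a transfer of this size completed intact, for example with `sha256sum -c` on the sidecar file.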