WDQS Graph Split Manual Data Load Notes

Authored By
RKemper
Dec 7 2023, 8:12 PM

# Downtime host(s) to reduce noise
ryankemper@cumin1001:~$ sudo -E cookbook sre.hosts.downtime --days 7 -r 'graph split experiments T350106' wdqs102[2-4].eqiad.wmnet
# Set permissions on files if not already sufficient
chmod 555 /srv/T350106/gzips/gzips/gzips/nt_wd_schol/*
# Run from `/srv/T350106/gzips/gzips/gzips/nt_wd_schol/` to change file ext from .txt.gz to .ttl.gz
for FILE in *.txt.gz; do sudo mv -- "$FILE" "${FILE%.txt.gz}.ttl.gz"; done
# Disable puppet, stop blazegraph, clear out jnl file, start blazegraph, restart exporter
sudo disable-puppet "T350106" && sudo systemctl stop wdqs-blazegraph && sleep 5 && sudo rm -fv /srv/wdqs/wikidata.jnl && sleep 5 && sudo systemctl start wdqs-blazegraph && sudo systemctl restart prometheus-blazegraph-exporter-wdqs-blazegraph.service
# Get slightly-modified loadData.sh into place if not present (this copies it to the home directory; the steps below expect it at /srv/T350106/loadData.sh)
scp loadData.sh ryankemper@wdqs1023.eqiad.wmnet:/home/ryankemper/loadData.sh
# Modify further to match file format of /srv/T350106/gzips/gzips/gzips/nt_wd_schol/* (in this case) if necessary
vi /srv/T350106/loadData.sh
# Run on first chunk only (-s and -e set the start and end chunk; both 0 here)
sudo /srv/T350106/loadData.sh -n wdq -d /srv/T350106/gzips/gzips/gzips/nt_wd_schol -s 0 -e 0
# Run on all remaining chunks
sudo /srv/T350106/loadData.sh -n wdq -d /srv/T350106/gzips/gzips/gzips/nt_wd_schol
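The rename step above can be previewed before any file is moved. The sketch below is not part of the original notes: `preview_renames` is a hypothetical helper name, and it assumes the chunk files sit directly in the given directory with a `.txt.gz` suffix. It only prints the planned renames and changes nothing on disk.

```shell
# Hypothetical helper (an assumption, not from the original notes):
# print "old -> new" for every *.txt.gz file in a directory without renaming anything.
# The body runs in a subshell so its variables do not leak into the caller.
preview_renames() (
    dir="$1"
    for file in "$dir"/*.txt.gz; do
        # If nothing matches, the glob stays literal; skip it
        [ -e "$file" ] || continue
        # Strip the .txt.gz suffix and append .ttl.gz
        printf '%s -> %s\n' "$file" "${file%.txt.gz}.ttl.gz"
    done
)
```

For example, `preview_renames /srv/T350106/gzips/gzips/gzips/nt_wd_schol` lists each planned rename, one per line. Empty output means no `.txt.gz` files are left to rename.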
