Some links
Alert link => https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=wdqs1005&service=Disk+space
Graph of triples not updating (due to lack of disk) => https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=7&orgId=1&var-cluster_name=wdqs&from=1608044124030&to=1608105553755
General context
For whatever reason, the wikidata.jnl on wdqs1005 appears to be ~140GB larger than the vast majority of the rest of the fleet:
ryankemper@cumin1001:~$ sudo cumin 'P{wdqs*}' 'du -h /srv/wdqs/wikidata.jnl'
19 hosts will be targeted:
wdqs[2001-2008].codfw.wmnet,wdqs[1003-1013].eqiad.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====
(1) wdqs1005.eqiad.wmnet
----- OUTPUT of 'du -h /srv/wdqs/wikidata.jnl' -----
1021G /srv/wdqs/wikidata.jnl
===== NODE GROUP =====
(1) wdqs1007.eqiad.wmnet
----- OUTPUT of 'du -h /srv/wdqs/wikidata.jnl' -----
948G /srv/wdqs/wikidata.jnl
===== NODE GROUP =====
(17) wdqs[2001-2008].codfw.wmnet,wdqs[1003-1004,1006,1008-1013].eqiad.wmnet
----- OUTPUT of 'du -h /srv/wdqs/wikidata.jnl' -----
886G /srv/wdqs/wikidata.jnl
================
PASS |████████████████████| 100% (19/19) [00:00<00:00, 19.42hosts/s]
FAIL |                    |   0% (0/19) [00:00<?, ?hosts/s]
100.0% (19/19) success ratio (>= 100.0% threshold) for command: 'du -h /srv/wdqs/wikidata.jnl'.
100.0% (19/19) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
This does not seem to be attributable to a higher triple count (see the graph link under "Some links").
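To put numbers on the size gap, here is a minimal sketch that computes how far each outlier sits above the rest of the fleet. The sizes are copied from the `du -h` output above (rounded figures, in the `G` units that `du` reports). The dictionary keys and the `excess_over_baseline` helper are illustrative names, not part of any existing tooling.

```python
# Journal sizes as reported by `du -h /srv/wdqs/wikidata.jnl` above
# (rounded, in the "G" units du prints).
sizes_g = {
    "wdqs1005": 1021,
    "wdqs1007": 948,
    "rest of fleet (17 hosts)": 886,
}

# The 17 hosts that all report the same size serve as the baseline.
baseline_g = sizes_g["rest of fleet (17 hosts)"]


def excess_over_baseline(sizes, baseline):
    """Return {host: size - baseline} for hosts that differ from the baseline."""
    return {host: size - baseline for host, size in sizes.items() if size != baseline}


excess = excess_over_baseline(sizes_g, baseline_g)
for host, delta in excess.items():
    print(f"{host}: +{delta}G over the {baseline_g}G baseline")
```

With these figures, wdqs1005 comes out 135G above the baseline (the "~140GB" quoted above) and wdqs1007 comes out 62G above it, so wdqs1007 is also an outlier, just a smaller one.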