Following the data transfer of the most recent wikidata.jnl, free disk space has dropped low enough to trigger the warning threshold:
DISK WARNING - free space: /srv 45621 MB (4% inode=99%)
Blazegraph needs relatively little free space for compaction compared to other datastores, but the raw amount of space left gives our journal file(s) unacceptably little headroom to keep growing.
We should take short-term action to address the lack of available disk space. We can double our existing space by migrating from raid10 to raid0. This will cost us redundancy, but it's an acceptable tradeoff in the short term. Medium-term, our newer instances will have more storage and in particular will have at least 4 expansion slots free each if we use the same spec we used for WCQS.
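As a rough sanity check on the "double our existing space" claim, here is a back-of-the-envelope sketch in Python. It assumes /srv sits entirely on the array, that the raid10 layout uses two-way mirrors (so usable capacity is half the raw capacity), and that the same disks are reused for raid0. The helper names are made up for illustration. The alert only reports a rounded percentage, so the derived totals are approximate, and filesystem overhead is ignored.

```python
import re

# Alert line copied from the check output above.
ALERT = "DISK WARNING - free space: /srv 45621 MB (4% inode=99%)"

# Fraction of raw disk capacity that is usable per RAID level.
# Assumption: raid10 here means two-way mirrors striped together.
USABLE_FRACTION = {"raid0": 1.0, "raid10": 0.5}


def parse_alert(line):
    """Extract (mount point, free MB, free percent) from the alert text."""
    match = re.search(r"free space: (\S+) (\d+) MB \((\d+)%", line)
    if match is None:
        raise ValueError("unrecognised alert format: %r" % line)
    return match.group(1), int(match.group(2)), int(match.group(3))


def capacity_after_migration(total_mb, old_level, new_level):
    """Scale usable capacity from one RAID level to another on the same disks."""
    return total_mb * USABLE_FRACTION[new_level] / USABLE_FRACTION[old_level]


mount, free_mb, free_pct = parse_alert(ALERT)

# Approximate only: the reported percentage is rounded to a whole number.
total_mb = free_mb * 100 / free_pct
used_mb = total_mb - free_mb

new_total_mb = capacity_after_migration(total_mb, "raid10", "raid0")
new_free_mb = new_total_mb - used_mb
new_free_pct = 100 * new_free_mb / new_total_mb

print("%s today (raid10): ~%.0f MB total, ~%.0f MB used" % (mount, total_mb, used_mb))
print("%s after raid0:    ~%.0f MB total, ~%.0f%% free" % (mount, new_total_mb, new_free_pct))
```

Under those assumptions, the same data that fills roughly 96% of the current volume would fill just under half of the raid0 volume, which is where the short-term headroom comes from.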
- Migrate to raid0
- Switch partman recipe to raid0
- Re-image each server
- Comb through all the current servers to verify which hosts this issue applies to (currently it looks like every server except possibly wdqs101[1-3])
[EQIAD PUBLIC]
wdqs1004.eqiad.wmnet => (FAILED [x3], N/A)
wdqs1005.eqiad.wmnet => (DON'T_REIMAGE_TILL_LATER, NEW_JOURNAL)
wdqs1006.eqiad.wmnet => (SUCCESS, NEW_JOURNAL)
wdqs1007.eqiad.wmnet => (REIMAGING, NEW_JOURNAL)
wdqs1012.eqiad.wmnet => (NOT_REIMAGED, NEW_JOURNAL)
wdqs1013.eqiad.wmnet => (SUCCESS, NEW_JOURNAL)

[EQIAD INTERNAL]
wdqs1003.eqiad.wmnet => (NOT_REIMAGED, NEW_JOURNAL)
wdqs1008.eqiad.wmnet => (NOT_REIMAGED, NEW_JOURNAL)
wdqs1011.eqiad.wmnet => (SUCCESS, NEW_JOURNAL)

[CODFW PUBLIC]
wdqs2001.codfw.wmnet => (NOT_REIMAGED, NEW_JOURNAL)
wdqs2002.codfw.wmnet => (NOT_REIMAGED, NEW_JOURNAL)
wdqs2003.codfw.wmnet => (NOT_REIMAGED, NEW_JOURNAL)
wdqs2004.codfw.wmnet => (REIMAGING, N/A)
wdqs2007.codfw.wmnet => (REIMAGED [but HW FAILURE], NEW_JOURNAL)

[CODFW INTERNAL]
wdqs2005.codfw.wmnet => (NOT_REIMAGED, OLD_JOURNAL)
wdqs2006.codfw.wmnet => (NOT_REIMAGED, OLD_JOURNAL)
wdqs2008.codfw.wmnet => (NOT_REIMAGED, NEW_JOURNAL)

[TEST]
wdqs1009.eqiad.wmnet => (DON'T_REIMAGE_TILL_LATER)
wdqs1010.eqiad.wmnet => (SUCCESS, NEW_JOURNAL)