Blazegraph journal too large on wdqs1012
Closed, Resolved · Public · 1 Estimated Story Points

Description

On wdqs1012, the Blazegraph journal is larger than 2 TB. This indicates that something is wrong: the journal needs to be scrapped and recovered from another node.

No further investigation is needed at this point; this is a known issue and will be solved by moving away from Blazegraph.
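
For reference, a minimal sketch of how the size condition described above could be checked by hand. The helper name `journal_oversized` is hypothetical, the journal path is the one that appears in the timeline for this task, and the limit of 2 TiB is an assumption based on the "2TB" figure; `stat -c %s` assumes GNU coreutils. This is not part of any existing WDQS tooling.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when FILE is strictly larger than LIMIT_BYTES,
# 1 when it is not, and 2 when the file size cannot be read.
journal_oversized() {
    jo_file="$1"
    jo_limit_bytes="$2"
    # stat -c %s prints the apparent file size in bytes (GNU coreutils).
    jo_size_bytes=$(stat -c %s "$jo_file") || return 2
    [ "$jo_size_bytes" -gt "$jo_limit_bytes" ]
}

# Path as used on wdqs1012 in this task.
JOURNAL=/srv/wdqs/wikidata.jnl
# Assumed threshold: 2 TiB = 2 * 1024^4 bytes.
LIMIT_BYTES=2199023255552

# Only warn when the journal exists and is over the assumed limit.
if [ -e "$JOURNAL" ] && journal_oversized "$JOURNAL" "$LIMIT_BYTES"; then
    echo "WARNING: $JOURNAL is larger than $LIMIT_BYTES bytes"
fi
```

The check is read-only and exits quietly on a host where the journal file does not exist.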

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2021-06-08T02:37:40Z] <ryankemper> T284445 after manually stopping blazegraph/wdqs-updater, sudo rm -fv /srv/wdqs/wikidata.jnl on wdqs1012 (clearing old overinflated journal file away before xferring new one)

Mentioned in SAL (#wikimedia-operations) [2021-06-08T02:38:40Z] <ryankemper> T284445 sudo -i cookbook sre.wdqs.data-transfer --source wdqs1011.eqiad.wmnet --dest wdqs1012.eqiad.wmnet --reason "repairing overinflated blazegraph journal" --blazegraph_instance blazegraph on ryankemper@cumin1001 tmux session wdqs

RKemper added a subscriber: RKemper.

This is done. New wikidata.jnl is 975G as expected:

-rw-rw-r-- 1 blazegraph blazegraph 975G Jun 8 19:01 wikidata.jnl

RKemper updated the task description.