Graphite has data for this (Ganglia, once again, doesn't): https://graphite.wikimedia.org/render/?width=586&height=308&_salt=1426245937.7&target=servers.labstore1001.network.bond0.rx_byte.value&target=servers.labstore1001.network.bond0.tx_byte.value. My bet is on some tool maxing out NFS bandwidth.
Turns out it was a tool that had started up a thousand or so jobs that all hit NFS, saturating everything. I've killed all the jobs, and will notify the bot authors soon.
Also, we need to have an alert for network utilization on labstores.
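As a rough sketch of what such a check could look like: the function below takes the `datapoints` list that Graphite's render API returns with `format=json` (pairs of `[value, timestamp]`) and reports saturation when the most recent samples all sit above a fraction of link capacity. The function name, the 90% threshold, the three-sample window, and the 1 Gbit/s capacity are all assumptions for illustration, not the actual labstore values, and it assumes the series is already a per-second byte rate.

```python
def is_saturated(datapoints, capacity_bytes_per_s, threshold=0.9, min_points=3):
    """Return True if the last `min_points` non-null samples are all at or
    above `threshold` * capacity.

    `datapoints` is a list of [value, timestamp] pairs, as found in the
    "datapoints" field of a Graphite render response with format=json.
    Values are assumed to be bytes per second (an assumption, not verified).
    """
    # Graphite reports gaps as null/None; ignore them.
    values = [value for value, _timestamp in datapoints if value is not None]
    recent = values[-min_points:]
    if len(recent) < min_points:
        # Not enough data to decide; do not alert on a single spike.
        return False
    limit = threshold * capacity_bytes_per_s
    return all(value >= limit for value in recent)


# Hypothetical capacity: 1 Gbit/s expressed in bytes per second.
ASSUMED_CAPACITY = 125_000_000

sustained = [[1e6, 1], [None, 2], [120e6, 3], [121e6, 4], [119e6, 5]]
one_spike = [[120e6, 1], [50e6, 2], [120e6, 3]]

print(is_saturated(sustained, ASSUMED_CAPACITY))
print(is_saturated(one_spike, ASSUMED_CAPACITY))
```

Requiring several consecutive samples over the limit keeps a single short burst from paging anyone, while a runaway batch of jobs like this one would still trip it.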