Just now I noticed a non-paging alert about a Puppet failure on tools-sgebastion-10. While investigating, I discovered that NFS was frozen on that host; I then checked tools-bastion-13, and it too could not access project or user NFS.
Running the following on tools-nfs-2.tools.eqiad1.wikimedia.cloud seems to have resolved the issue, but I have no theory about the cause:
$ systemctl restart nfs-server
I first logged in to investigate at 04:44:16, but I see @Anomie on IRC complaining about not being able to log in to sgebastion-10 earlier than that, starting around 04:00.
So we have two mysteries: why NFS stopped responding, and why the NFS outage itself didn't trigger an alert.
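For the alerting half, one possible direction is a client-side probe that treats a slow or hung stat on an NFS path as a failure. The sketch below is only an illustration, not an existing check: the function name check_nfs_mount, the 5 second default, and the example path /data/project are all assumptions, and it relies on GNU coreutils timeout. Note that a stat may be answered from the client's attribute cache, so a real check would probably need to touch uncached data as well.

```shell
#!/bin/sh
# Hypothetical probe (not an existing check): report whether a stat of the
# given path completes within a time limit. A frozen NFS mount typically
# blocks stat indefinitely, so hitting the limit is treated as a failure.
check_nfs_mount() {
    path=$1
    limit=${2:-5}   # seconds; arbitrary default chosen for illustration

    # Use SIGKILL: a process blocked on NFS I/O may not react to SIGTERM.
    if timeout --signal=KILL "$limit" stat "$path" >/dev/null 2>&1; then
        echo "OK: $path responded within ${limit}s"
        return 0
    fi

    # Reached on timeout and also when stat itself fails (e.g. missing path).
    echo "CRITICAL: stat of $path failed or timed out after ${limit}s"
    return 2
}
```

It could be invoked as check_nfs_mount /data/project 5 from whatever monitoring wrapper is in use (path assumed). The return values 0 and 2 follow the common Nagios-style plugin convention for OK and CRITICAL.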