
rm'ing a specific file on NFS hangs on (dev|login).toolforge.org
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

tools.xlinks@tools-sgebastion-11:~$ pwd
/data/project/xlinks
tools.xlinks@tools-sgebastion-11:~$ time rm xlinks
^C

real	1m9.641s
user	0m0.001s
sys	0m0.005s

What happens?:
Never completes.

What should have happened instead?:
Completes almost instantaneously.

This is bad because it is the main binary for my xlinks tool. mv'ing it hangs as well, which prevents me from updating it.
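
A minimal sketch of how the hang could be reproduced without tying up the terminal, assuming GNU coreutils `timeout` is available on the bastion. The helper name `run_bounded` and the 10-second limit are illustrative and not part of the original report. Note that `timeout` can only stop a command that still reacts to signals; a process stuck in uninterruptible sleep may keep `timeout` itself waiting.

```shell
# Run a command with a time limit and report whether it finished.
# Usage: run_bounded SECONDS COMMAND [ARGS...]
# run_bounded is a hypothetical helper name used only for this sketch.
run_bounded() {
    limit="$1"
    shift
    # GNU timeout sends SIGTERM after $limit seconds and exits with 124.
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*"
    else
        echo "exit status $status: $*"
    fi
    return "$status"
}
```

For the case in this report, `run_bounded 10 rm xlinks` from `/data/project/xlinks` would either delete the file or report a timeout instead of blocking until interrupted.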

Event Timeline

$ ssh root@tools-nfs-2.tools.eqiad1.wikimedia.cloud
$ cd /srv/tools/project/xlinks
$ file xlinks
xlinks: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=733a8162031de7779dff6302ebed0ee5dce0fade, for GNU/Linux 3.2.0, with debug_info, not stripped
$ ls -lh xlinks
-rwxr-xr-x 1 tools.xlinks tools.xlinks 246M Jan 30 15:20 xlinks
$ lsattr -l xlinks
xlinks                       Extents
$ rm xlinks
(seems to just hang)
$ mv xlinks xlinks.please.delete
(seems to just hang)

Things seem to hang in the same way as T357098: [tools.meta] can't delete file inside cache/wikimedia-wikis.dat:

root@tools-nfs-2:/srv/tools/project/xlinks# rm xlinks &
[1] 3894371
root@tools-nfs-2:/srv/tools/project/xlinks# strace -p 3894371
strace: Process 3894371 attached
unlinkat(AT_FDCWD, "xlinks", 0
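
The strace output above stops inside `unlinkat`, meaning the syscall has not returned. As a rough follow-up sketch, the process state and the kernel function it is sleeping in can be read from `/proc` on Linux. The helper name `proc_state` is hypothetical, and whether `wchan` shows a useful symbol depends on the kernel configuration.

```shell
# Print the scheduler state and kernel wait channel of a process.
# Usage: proc_state PID
# proc_state is a hypothetical helper name used only for this sketch.
proc_state() {
    pid="$1"
    if [ ! -r "/proc/$pid/status" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # The "State:" line reads e.g. "D (disk sleep)" for uninterruptible sleep.
    awk '/^State:/ { $1 = ""; sub(/^[ \t]+/, ""); print "state: " $0 }' "/proc/$pid/status"
    # wchan names the kernel function the process is sleeping in, if exposed.
    if [ -r "/proc/$pid/wchan" ]; then
        printf 'wchan: %s\n' "$(cat "/proc/$pid/wchan")"
    fi
}
```

Here that would be `proc_state 3894371`, using the PID of the backgrounded `rm`. A state of `D` would suggest the process is blocked inside the kernel rather than spinning in user space.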

Count_Count claimed this task.

Not sure what happened but for some reason the file is gone now.