[ceph-users] strange error on link() for nfs over cephfs

Jens-U. Mozdzen jmozdzen at nde.ag
Wed Nov 29 03:44:58 PST 2017


Hi *,

we recently switched to using CephFS (with Luminous 12.2.1). On
one node, we kernel-mount the CephFS (kernel 4.4.75, openSUSE
version) and export it via kernel nfsd. As we are still in transition,
a number of machines auto-mount users' home directories from
that nfsd.

A strange error has recently surfaced that was not present when the
same nfsd exported local-disk-based file systems. The problem is most
visible to users when they run ssh-keygen to remove old keys from
their "known_hosts", but it seems likely that this error will occur
in other situations, too.

The error report from "ssh-keygen" is:

--- cut here ---
user at host:~> ssh-keygen -R somehost -f /home/user/.ssh/known_hosts
# Host somehost found: line 232
link /home/user/.ssh/known_hosts to /home/user/.ssh/known_hosts.old: Not a directory
user at host:~>
--- cut here ---

This error persists until the user lists the contents of the
directory containing the "known_hosts" file (~/.ssh). Once that is
done (e.g. "ls -l ~/.ssh"), ssh-keygen works as expected.

We've strace'd ssh-keygen and see the following steps (among others):

- the original known_hosts file is opened successfully
- a temp file is created in .ssh (successfully)
- a previous backup copy (known_hosts.old) is unlink()ed (fails with
ENOENT, since no backup is present)
- a link() from known_hosts to known_hosts.old is attempted and fails
with ENOTDIR

--- cut here ---
[...]
unlink("/home/user/.ssh/known_hosts.old") = -1 ENOENT (No such file or directory)
link("/home/user/.ssh/known_hosts", "/home/user/.ssh/known_hosts.old") = -1 ENOTDIR (Not a directory)
--- cut here ---
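
For anyone who wants to try this without ssh-keygen, the sequence from the
strace can be mimicked with a few lines of Python. This is only a minimal
sketch, assuming Python 3 on the NFS client; the function name
rotate_like_ssh_keygen is made up for illustration, and the file handling is
simplified compared to what ssh-keygen really does.

```python
import os
import tempfile


def rotate_like_ssh_keygen(path):
    """Mimic the unlink/link/rename sequence seen in the strace output.

    Hypothetical helper: returns None on success, or the errno of the
    failing link() call.
    """
    directory = os.path.dirname(path) or "."
    backup = path + ".old"

    # Create the temporary file next to the original and copy the content.
    fd, tmp = tempfile.mkstemp(prefix=os.path.basename(path) + ".", dir=directory)
    with os.fdopen(fd, "w") as out, open(path) as src:
        out.write(src.read())

    # Remove a previous backup; ENOENT is expected if there is none.
    try:
        os.unlink(backup)
    except FileNotFoundError:
        pass

    # This is the call that returns ENOTDIR in the strace above.
    try:
        os.link(path, backup)
    except OSError as e:
        os.unlink(tmp)  # unlike ssh-keygen, do not leave the temp file behind
        return e.errno

    os.rename(tmp, path)
    return None
```

Calling rotate_like_ssh_keygen("/home/user/.ssh/known_hosts") on an affected
client and comparing the returned value with errno.ENOTDIR should show whether
the problem is tied to ssh-keygen or to the link() call itself.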

Once the directory was listed, the link() call works nicely:

--- cut here ---
unlink("/home/user/.ssh/known_hosts.old") = -1 ENOENT (No such file or directory)
link("/home/user/.ssh/known_hosts", "/home/user/.ssh/known_hosts.old") = 0
rename("/home/user/.ssh/known_hosts.5trpXBpIgB", "/home/user/.ssh/known_hosts") = 0
--- cut here ---

When link() returns an error, rename() is never called, so every
failed attempt leaves another temporary file behind in .ssh.
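
Since listing the directory makes link() succeed, a script that has to create
hard links on such a mount could list the parent directory and retry once.
The following is a workaround sketch only, assuming Python 3; the helper name
link_with_listing_retry is hypothetical, and whether a single listing is
always enough is an assumption based on the behaviour described above.

```python
import errno
import os


def link_with_listing_retry(src, dst):
    """Call link(); on ENOTDIR, list the parent directory once and retry.

    Hypothetical workaround helper, not a fix for the underlying problem.
    """
    try:
        os.link(src, dst)
    except OSError as e:
        if e.errno != errno.ENOTDIR:
            raise  # any other error is passed on unchanged
        # Listing the directory is what makes link() work in the case above.
        os.listdir(os.path.dirname(dst) or ".")
        os.link(src, dst)
```

Errors other than ENOTDIR are deliberately re-raised, so the helper does not
hide unrelated problems such as an already existing target.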

This sounds like a bug to me. Has anybody else stumbled across
similar symptoms?

Regards,
Jens



