[ceph-users] "failed to open ino"

David C dcsysengineer at gmail.com
Wed Nov 29 12:32:21 PST 2017


On Tue, Nov 28, 2017 at 1:50 PM, Jens-U. Mozdzen <jmozdzen at nde.ag> wrote:

> Hi David,
>
> Zitat von David C <dcsysengineer at gmail.com>:
>
>> On 27 Nov 2017 1:06 p.m., "Jens-U. Mozdzen" <jmozdzen at nde.ag> wrote:
>>
>> Hi David,
>>
>> Zitat von David C <dcsysengineer at gmail.com>:
>>
>> Hi Jens
>>
>>>
>>> We also see these messages quite frequently, mainly the "replicating
>>> dir...". Only seen "failed to open ino" a few times so didn't do any real
>>> investigation. Our set up is very similar to yours, 12.2.1,
>>> active/standby
>>> MDS and exporting cephfs through KNFS (hoping to replace with Ganesha
>>> soon).
>>>
>>>
>> been there, done that - using Ganesha more than doubled the run-time of
>> our
>> jobs, while with knfsd, the run-time is about the same for CephFS-based
>> and
>> "local disk"-based files. But YMMV, so if you see speeds with Ganesha that
>> are similar to knfsd, please report back with details...
>>
>>
>> I'd be interested to know if you tested Ganesha over a cephfs kernel mount
>> (ie using the VFS fsal) or if you used the Ceph fsal. Also the server and
>> client versions you tested.
>>
>
> I had tested Ganesha only via the Ceph FSAL. Our Ceph nodes (including the
> one used as a Ganesha server) are running ceph-12.2.1+git.1507910930.aea79b8b7a
> on OpenSUSE 42.3, SUSE's kernel 4.4.76-1-default (which has a number of
> back-ports in it), Ganesha is at version
> nfs-ganesha-2.5.2.0+git.1504275777.a9d23b98f.
>
> The NFS clients are a broad mix of current and older systems.
>
> Prior to Luminous, Ganesha writes were terrible due to a bug with fsync
>> calls in the mds code. The fix went into the mds and client code. If
>> you're
>> doing Ganesha over the top of the kernel mount you'll need a pretty recent
>> kernel to see the write improvements.
>>
>
> As we were testing the Ceph FSAL, this should not be the cause.
>
> From my limited Ganesha testing so far, reads are better when exporting the
>> kernel mount, writes are much better with the Ceph fsal. But that's
>> expected for me as I'm using the CentOS kernel. I was hoping the
>> aforementioned fix would make it into the rhel 7.4 kernel but doesn't look
>> like it has.
>>
>
> When exporting the kernel-mounted CephFS via kernel nfsd, we see similar
> speeds to serving the same set of files from a local bcache'd RAID1 array
> on SAS disks. This is for a mix of reads and writes, mostly small files
> (compile jobs, some packaging).
>

I'm surprised your knfsd writes are that good on a 4.4 kernel (assuming your
exports aren't async). At least when I tested with the mainline 4.4 kernel
it was still super slow for me; the writes only improve in 4.12 or 4.13. It
sounds like SUSE has backported some good stuff!
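For anyone following along, a minimal Ganesha export block using the Ceph
FSAL (as opposed to exporting a kernel mount via the VFS FSAL) looks roughly
like the sketch below. The Export_Id, Pseudo path, and cephx user name are
placeholders, not taken from either of the setups discussed in this thread:

```
EXPORT
{
    Export_Id = 100;           # arbitrary unique id (placeholder)
    Path = /;                  # path within CephFS to export
    Pseudo = /cephfs;          # NFSv4 pseudo-root path (placeholder)
    Access_Type = RW;
    Squash = No_Root_Squash;

    FSAL {
        Name = CEPH;           # talk to CephFS via libcephfs directly,
        User_Id = ganesha;     # cephx user (placeholder), instead of
    }                          # re-exporting a kernel mount (VFS FSAL)
}
```

With the VFS FSAL you would instead set Name = VFS and point Path at a
kernel-mounted CephFS, which is where the kernel-side fsync fix matters.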



>
> From what I can see, it would have to be A/A/P, since MDS demands at least
>> one stand-by.
>>
>>
>> That's news to me.
>>
>
> From http://docs.ceph.com/docs/master/cephfs/multimds/ :
>
> "Each CephFS filesystem has a max_mds setting, which controls how many
> ranks will be created. The actual number of ranks in the filesystem will
> only be increased if a spare daemon is available to take on the new rank.
> For example, if there is only one MDS daemon running, and max_mds is set to
> two, no second rank will be created."
>
> Might well be I was mis-reading this... I had first read it to mean that a
> spare daemon needs to be available *while running* A/A, but the example
> sounds like the spare is required when *switching to* A/A.
>

Yep I think you're right. Further down that page it states: "Even with
multiple active MDS daemons, a highly available system still requires
standby daemons to take over if any of the servers running an active daemon
fail."

I assumed if an active MDS failed, the surviving MDS(s) would just pick up
the workload. The question is, would losing an MDS in a cluster with no
standbys stop all metadata IO or would it just be a health warning? I need
to do some playing around with this at some point.
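For reference, switching to multiple active daemons in Luminous is a matter
of raising max_mds; the filesystem name "cephfs" and the counts below are
placeholders:

```
# show current MDS states (which daemons are active, which are standby)
ceph fs status

# allow two active ranks on filesystem "cephfs"; per the docs quoted
# above, rank 1 is only created once a spare daemon is available
ceph fs set cephfs max_mds 2

# raise a health warning if fewer than one standby is available
ceph fs set cephfs standby_count_wanted 1
```

That last setting should at least surface the "no standby left" situation
as a health warning rather than leaving it silent.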



> Is it possible you still had standby config in your ceph.conf?
>>
>
> Not sure what you're asking for, is this related to active/active or to
> our Ganesha tests? We have not yet tried to switch to A/A, so our config
> actually contains standby parameters.
>

It was in relation to A/A, but the query is answered above.

>
> Regards,
> Jens
>
> --
> Jens-U. Mozdzen                         voice   : +49-40-559 51 75
> NDE Netzdesign und -entwicklung AG      fax     : +49-40-559 51 77
> Postfach 61 03 15                       mobile  : +49-179-4 98 21 98
> D-22423 Hamburg                         e-mail  : jmozdzen at nde.ag
>
>         Vorsitzende des Aufsichtsrates: Angelika Torlée-Mozdzen
>           Sitz und Registergericht: Hamburg, HRB 90934
>                   Vorstand: Jens-U. Mozdzen
>                    USt-IdNr. DE 814 013 983
>
>