[ceph-users] Ceph or Gluster for implementing big NAS

Premysl Kouril premysl.kouril at gmail.com
Mon Nov 12 06:51:40 PST 2018


Some kind of single point will always be there, I guess. Even if we go
with the distributed filesystem, it will be mounted on the access VM, and
this access VM will be providing the NFS/CIFS protocol access. So this
machine is a single point of failure (though we would be running two of
them in an active-passive HA setup). With the distributed-filesystem
approach, a failure of the access VM would mean re-mounting the filesystem
on the passive access VM. With the "monster VM" approach, a VM failure
would mean reattaching all block volumes to a new VM.
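
For illustration, failover onto the passive/standby access VM would roughly
look like this in the two approaches (host, volume, and path names are just
placeholders):

  # distributed-filesystem approach: re-mount CephFS and re-export
  mount -t ceph mon1:6789:/ /srv/nas -o name=nas,secretfile=/etc/ceph/nas.secret
  exportfs -ra

  # "monster VM" approach: reattach every block volume to the standby VM
  openstack server add volume nas-vm-standby nas-vol-01
  openstack server add volume nas-vm-standby nas-vol-02
  # ...then import/mount the filesystem on the standby VM and re-export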

On Mon, Nov 12, 2018 at 3:40 PM Ashley Merrick <singapore at amerrick.co.uk>
wrote:

> My 2 cents: it depends on how much HA you need.
>
> Going with the monster VM you have a single point of failure and a single
> point of network congestion.
>
> If you go the CephFS route and mount on the clients directly, you remove
> that single point of failure, and you can also remove that single point of
> network congestion.
>
> I guess it depends on the performance and uptime required, as I'd say that
> should factor into your decision.
>
> On Mon, 12 Nov 2018 at 10:36 PM, Premysl Kouril <premysl.kouril at gmail.com>
> wrote:
>
>> Hi Kevin,
>>
>> I should have also said that we are internally inclined towards the
>> "monster VM" approach due to its seemingly simpler architecture (data
>> distribution at the block layer rather than at the filesystem layer). So
>> my original question is more about comparing the two approaches
>> (distribution at the block layer vs. distribution at the filesystem
>> layer). The "monster VM" approach is the one where we just keep mounting
>> block volumes to a single VM with a normal, non-distributed filesystem and
>> then export via NFS/CIFS.
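>>
>> To make it concrete, the export side of that single VM would be nothing
>> special - a local filesystem plus standard NFS/Samba exports, e.g.
>> (illustrative paths and networks):
>>
>>   # /etc/exports on the access VM
>>   /srv/nas  10.0.0.0/24(rw,sync,no_subtree_check)
>>
>>   # smb.conf share for the CIFS side
>>   [nas]
>>       path = /srv/nas
>>       read only = no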
>>
>> Regards,
>> Prema
>>
>> On Mon, Nov 12, 2018 at 3:17 PM Kevin Olbrich <ko at sv01.de> wrote:
>>
>>> Hi Dan,
>>>
>>> ZFS without sync would be pretty much equivalent to ext2/ext4 without
>>> journals or XFS with barriers disabled.
>>> The ARC cache in ZFS is awesome, but disabling sync on ZFS is a very high
>>> risk (using ext4 with KVM's cache=unsafe mode would be similar, I think).
>>>
>>> Also, ZFS only works as expected with the I/O scheduler set to noop, as
>>> it is optimized to consume whole, non-shared devices.
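>>>
>>> For example (the device name is just an example; on newer multi-queue
>>> kernels the equivalent scheduler is "none"):
>>>
>>>   cat /sys/block/vdb/queue/scheduler
>>>   echo noop > /sys/block/vdb/queue/scheduler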
>>>
>>> Just my 2 cents ;-)
>>>
>>> Kevin
>>>
>>>
>>> Am Mo., 12. Nov. 2018 um 15:08 Uhr schrieb Dan van der Ster <
>>> dan at vanderster.com>:
>>>
>>>> We've done ZFS on RBD in a VM, exported via NFS, for a couple of years.
>>>> It's very stable, and if your use case permits, you can set zfs
>>>> sync=disabled to get very fast write performance that's tough to beat.
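>>>>
>>>> Concretely, assuming a pool named "tank" (make sure you understand what
>>>> you can lose on a crash before doing this):
>>>>
>>>>   zfs set sync=disabled tank
>>>>   zfs get sync tank            # verify
>>>>   zfs set sync=standard tank   # revert to the default behaviour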
>>>>
>>>> But if you're building something new today and have *only* the NAS
>>>> use case, then it would make more sense to try CephFS first and see
>>>> if it works for you.
>>>>
>>>> -- Dan
>>>>
>>>> On Mon, Nov 12, 2018 at 3:01 PM Kevin Olbrich <ko at sv01.de> wrote:
>>>> >
>>>> > Hi!
>>>> >
>>>> > ZFS won't play nicely on top of Ceph. Best would be to mount CephFS
>>>> directly with the ceph-fuse driver on the endpoint.
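>>>> >
>>>> > For example, a minimal ceph-fuse mount on a client would be something
>>>> > like (monitor address, client name, and path are placeholders):
>>>> >
>>>> >   ceph-fuse -m mon1:6789 --id nas /mnt/cephfs
>>>> >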
>>>> > If you definitely want to put a storage gateway between the data and
>>>> the compute nodes, then go with nfs-ganesha which can export CephFS
>>>> directly without local ("proxy") mount.
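>>>> >
>>>> > The nfs-ganesha side is roughly an EXPORT block like this in
>>>> > ganesha.conf (trimmed; exact options depend on the ganesha version):
>>>> >
>>>> >   EXPORT {
>>>> >       Export_ID = 1;
>>>> >       Path = "/";
>>>> >       Pseudo = "/cephfs";
>>>> >       Access_Type = RW;
>>>> >       FSAL {
>>>> >           Name = CEPH;
>>>> >       }
>>>> >   }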
>>>> >
>>>> > I had such a setup with NFS and switched to mounting CephFS directly.
>>>> If you serve the same data over NFS, you must make sure your HA works
>>>> well to avoid data corruption.
>>>> > With ceph-fuse you connect directly to the cluster, so there is one
>>>> less component that can break.
>>>> >
>>>> > Kevin
>>>> >
>>>> > Am Mo., 12. Nov. 2018 um 12:44 Uhr schrieb Premysl Kouril <
>>>> premysl.kouril at gmail.com>:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >>
>>>> >> We are planning to build a NAS solution which will be primarily used
>>>> via NFS and CIFS, with workloads ranging from various archival
>>>> applications to more “real-time” processing. The NAS will not be used
>>>> as block storage for virtual machines, so access really will always be
>>>> file-oriented.
>>>> >>
>>>> >>
>>>> >> We are primarily considering two designs, and I’d like to kindly ask
>>>> for any thoughts, views, insights, and experiences.
>>>> >>
>>>> >>
>>>> >> Both designs utilize distributed storage software at some level.
>>>> Both designs would be built from commodity servers and should scale as we
>>>> grow. Both designs involve virtualization for instantiating "access
>>>> virtual machines" which will be serving the NFS and CIFS protocols - so
>>>> in this sense the access layer is decoupled from the data layer itself.
>>>> >>
>>>> >>
>>>> >> The first design is based on a distributed filesystem like Gluster or
>>>> CephFS. We would deploy this software on those commodity servers, mount
>>>> the resulting filesystem on the “access virtual machines”, and those VMs
>>>> would serve the mounted filesystem via NFS/CIFS.
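>>>> >>
>>>> >> For illustration, on the access VMs that would just be a native mount
>>>> >> of the chosen filesystem, with the hostnames and paths below being
>>>> >> placeholders:
>>>> >>
>>>> >>   mount -t glusterfs server1:/gv0 /mnt/nas
>>>> >>   mount -t ceph mon1:6789:/ /mnt/nas -o name=nas,secretfile=/etc/ceph/nas.secret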
>>>> >>
>>>> >>
>>>> >> The second design is based on distributed block storage using Ceph.
>>>> We would build distributed block storage on those commodity servers and
>>>> then, via virtualization (e.g. OpenStack Cinder), attach the block
>>>> storage to the access VM. Inside the access VM we would deploy ZFS to
>>>> aggregate the block volumes into a single filesystem, and this
>>>> filesystem would be served via NFS/CIFS from the very same VM.
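>>>> >>
>>>> >> For illustration, with the Cinder volumes showing up as virtio disks
>>>> >> inside the access VM, the ZFS aggregation would be roughly (device
>>>> >> names are examples):
>>>> >>
>>>> >>   zpool create tank /dev/vdb /dev/vdc /dev/vdd
>>>> >>   zfs create -o mountpoint=/srv/nas tank/nas
>>>> >>   zpool add tank /dev/vde    # grow later by attaching more volumes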
>>>> >>
>>>> >>
>>>> >> Any advice and insights are highly appreciated.
>>>> >>
>>>> >>
>>>> >> Cheers,
>>>> >>
>>>> >> Prema
>>>> >>