[ceph-users] CephFS desync

Andrey Klimentyev andrey.klimentyev at flant.com
Thu Nov 23 01:49:27 PST 2017


The workload is... really common. It's just a bunch of PHP scripts executed
via php-fpm that sometimes write a couple of files (some e-commerce
reports). There were concerns that mmap(2) might be in use, but that's not
the case; I've checked with strace.
I am using the kernel client with a relatively fresh kernel, 4.10.0-28.

To be honest, I think the simplest thing to do would be to upgrade to
Luminous. The problem is elusive and a PITA to resolve once it occurs:
"touch" does not help, and I have to change the contents of a file to
force it to synchronize on every CephFS client.
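
For anyone trying to confirm the same symptom (timestamps agree across
clients, contents do not), here is a minimal sketch of how the comparison
could be scripted. It is not something from this thread: the helper names
fingerprint() and differing_fields() are made up for illustration, and it
only uses the Python standard library. Run it against the same path on each
client and compare the JSON lines.

```python
import hashlib
import json
import os
import socket
import sys


def fingerprint(path, chunk_size=1 << 20):
    """Return size, timestamps and SHA-256 of a file as this client sees it.

    Hypothetical helper: the file is read through the normal page cache on
    purpose, because a stale cache is exactly what we want to observe.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    info = os.stat(path)
    return {
        "host": socket.gethostname(),
        "path": path,
        "size": info.st_size,
        "mtime": info.st_mtime,
        "ctime": info.st_ctime,
        "sha256": digest.hexdigest(),
    }


def differing_fields(a, b, fields=("size", "mtime", "ctime", "sha256")):
    """List the fields on which two fingerprints of the same path disagree."""
    return [name for name in fields if a[name] != b[name]]


if __name__ == "__main__":
    # One JSON line per file, so output from several clients is easy to diff.
    for file_path in sys.argv[1:]:
        print(json.dumps(fingerprint(file_path), sort_keys=True))
```

If differing_fields() on the output of two clients reports "sha256" (and
possibly "size") but not "ctime", that matches the "stuck file" behaviour
described in this thread.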

On 20 Nov 2017 at 6:26, "Yan, Zheng" <ukernel at gmail.com> wrote:

> ceph-fuse or kernel client? Which version of ceph-fuse/kernel? This
> issue can happen on ceph-fuse if the fuse_disable_pagecache config
> option is false. Old kernel versions have a bug that can cause this
> issue. The bug is in the splice_{read,write} and readahead code.
>
> On Sun, Nov 19, 2017 at 5:52 PM, Gregory Farnum <gfarnum at redhat.com>
> wrote:
> > Hmm, are you mounting the filesystem using ceph-fuse? Can you describe
> > your workload?
> > -Greg
> >
> > On Fri, Nov 3, 2017 at 6:42 PM Andrey Klimentyev
> > <andrey.klimentyev at flant.com> wrote:
> >>
> >> I am absolutely incorrect, my apologies.
> >>
> >> caps: [mds] allow rw
> >> caps: [mon] allow r
> >> caps: [osd] allow rwx pool=cephfs_metadata, allow rwx pool=cephfs_data
> >>
> >> On 3 November 2017 at 10:40, Henrik Korkuc <lists at kirneh.eu> wrote:
> >>>
> >>> On 17-11-03 09:29, Andrey Klimentyev wrote:
> >>>
> >>> Thanks for a swift response.
> >>>
> >>> We are using 10.2.10.
> >>>
> >>> They all share the same set of permissions (and one key, too). Haven't
> >>> found anything incriminating in logs, too.
> >>>
> >>> caps: [mon] allow r
> >>> caps: [osd] allow class-read object_prefix rbd_children, allow rwx
> >>> pool=rbd
> >>>
> >>> Are you sure you pasted correct user permissions? It looks like you are
> >>> using RBD permissions for CephFS and this seems to be the problem.
> >>>
> >>> On 3 November 2017 at 00:56, Gregory Farnum <gfarnum at redhat.com> wrote:
> >>>>
> >>>> On Thu, Nov 2, 2017 at 9:05 AM Andrey Klimentyev
> >>>> <andrey.klimentyev at flant.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> we've recently hit a problem in a production cluster. The gist of it
> >>>>> is that sometimes a file will be changed on one machine, but only
> >>>>> the "change time" propagates to the others. The checksum is
> >>>>> different. Contents, obviously, differ as well. How can I debug this?
> >>>>>
> >>>>> In other words, how would I approach such a problem with "stuck
> >>>>> files"? Haven't found anything on Google or in the troubleshooting
> >>>>> docs.
> >>>>
> >>>>
> >>>> What versions are you running?
> >>>> The only way I can think of this happening is if one of the clients
> >>>> had permission to access the CephFS namespace on the MDS, but not to
> >>>> write to the OSDs which store the file data. Have you checked that
> >>>> the clients all have the same caps? ("ceph auth list" or one of the
> >>>> related more-specific commands will let you compare.)
> >>>> -Greg
> >>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Andrey Klimentyev,
> >>>>> DevOps engineer @ JSC «Flant»
> >>>>> http://flant.com/
> >>>>> _______________________________________________
> >>>>> ceph-users mailing list
> >>>>> ceph-users at lists.ceph.com
> >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>
> >>
> >>
> >>
> >> --
> >> Andrey Klimentyev,
> >> DevOps engineer @ JSC «Flant»
> >> http://flant.com/
> >> +7 (495) 721-10-27, ext. 487
> >> +7 (960) 180-38-98
> >
> >
>