[ceph-users] objects degraded higher than 100%

Gregory Farnum gfarnum at redhat.com
Thu Oct 12 10:56:50 PDT 2017


On Thu, Oct 12, 2017 at 10:52 AM Florian Haas <florian at hastexo.com> wrote:

> On Thu, Oct 12, 2017 at 7:22 PM, Gregory Farnum <gfarnum at redhat.com>
> wrote:
> >
> >
> > On Thu, Oct 12, 2017 at 3:50 AM Florian Haas <florian at hastexo.com> wrote:
> >>
> >> On Mon, Sep 11, 2017 at 8:13 PM, Andreas Herrmann <andreas at mx20.org>
> >> wrote:
> >> > Hi,
> >> >
> >> > how could this happen:
> >> >
> >> >         pgs: 197528/1524 objects degraded (12961.155%)
> >> >
> >> > I did some heavy failover tests, but a value higher than 100% looks
> >> > strange (ceph version 12.2.0). Recovery is quite slow.
> >> >
> >> >   cluster:
> >> >     health: HEALTH_WARN
> >> >             3/1524 objects misplaced (0.197%)
> >> >             Degraded data redundancy: 197528/1524 objects degraded
> >> > (12961.155%), 1057 pgs unclean, 1055 pgs degraded, 3 pgs undersized
> >> >
> >> >   data:
> >> >     pools:   1 pools, 2048 pgs
> >> >     objects: 508 objects, 1467 MB
> >> >     usage:   127 GB used, 35639 GB / 35766 GB avail
> >> >     pgs:     197528/1524 objects degraded (12961.155%)
> >> >              3/1524 objects misplaced (0.197%)
> >> >              1042 active+recovery_wait+degraded
> >> >              991  active+clean
> >> >              8    active+recovering+degraded
> >> >              3    active+undersized+degraded+remapped+backfill_wait
> >> >              2    active+recovery_wait+degraded+remapped
> >> >              2    active+remapped+backfill_wait
> >> >
> >> >   io:
> >> >     recovery: 340 kB/s, 80 objects/s
> >>
> >> Did you ever get to the bottom of this? I'm seeing something very
> >> similar on a 12.2.1 reference system:
> >>
> >> https://gist.github.com/fghaas/f547243b0f7ebb78ce2b8e80b936e42c
> >>
> >> I'm also seeing an unusual MISSING_ON_PRIMARY count in "rados df":
> >> https://gist.github.com/fghaas/59cd2c234d529db236c14fb7d46dfc85
> >>
> >> The odd thing in there is that the "bench" pool was empty when the
> >> recovery started (that pool had been wiped with "rados cleanup"), so
> >> the number of objects deemed to be missing from the primary really
> >> ought to be zero.
> >>
> >> It seems like it's considering these deleted objects to still require
> >> replication, but that sounds rather far-fetched to be honest.
> >
> >
> > Actually, that makes some sense. This cluster had an OSD down while (some
> > of) the deletes were happening?
>
> I thought of exactly that too, but no, it didn't. That's the problem.
>

Okay, in that case I've no idea. What was the timeline for the recovery
versus the rados bench and cleanup versus the degraded object counts, then?
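
For reference when lining those counts up: the percentage in the status
output appears to be just the degraded counter divided by the total number
of object copies. A minimal sketch of that arithmetic in Python, assuming
the 1524 denominator is 508 objects times a replica count of 3 (the pool
size is not stated in this thread) and using a made-up helper name:

```python
def percent(count, total_copies):
    """Return count as a percentage of total_copies, rounded to 3 places."""
    return round(100.0 * count / total_copies, 3)

objects = 508           # "objects: 508 objects" from the status output
assumed_replicas = 3    # assumption: pool size is not given in the thread
total_copies = objects * assumed_replicas

degraded = percent(197528, total_copies)   # degraded counter from the report
misplaced = percent(3, total_copies)       # misplaced counter from the report
print(total_copies, degraded, misplaced)
```

Nothing in that ratio caps the numerator at the denominator, so a degraded
count that still includes objects no longer in the pool would come out
above 100%.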


>
> > I haven't dug through the code but I bet it is considering those as
> > degraded objects because the out-of-date OSD knows it doesn't have the
> > latest versions on them! :)
>
> Yeah I bet against that. :)
>
> Another tidbit: these objects were not deleted with rados rm, they
> were cleaned up after rados bench. In the case quoted above, this was
> an explicit "rados cleanup" after "rados bench --no-cleanup"; in
> another, I saw the same behavior after a regular "rados bench" that
> included the automatic cleanup.
>
> So there are two hypotheses here:
> (1) The deletion in rados bench is neglecting to do something that a
> regular object deletion does do. Given the fact that at least one
> other thing is fishy in rados bench
> (http://tracker.ceph.com/issues/21375), this may be due to some simple
> oversight in the Luminous cycle, and thus would constitute a fairly
> minor (if irritating) issue.
> (2) Regular object deletion is buggy in some previously unknown
> fashion. That would be a rather major problem.
>

These both seem exceedingly unlikely. *shrug*


>
> By the way, *deleting the pool* altogether makes the degraded object
> count drop to expected levels immediately. Probably no surprise there,
> though.
>
> Cheers,
> Florian
>