[ceph-users] PG auto repair with BlueStore

Paul Emmerich paul.emmerich at croit.io
Sat Nov 17 12:05:57 PST 2018


While I also believe it to be perfectly safe on a BlueStore cluster
(especially since osd_scrub_auto_repair_num_errors stops the auto repair
if there's more wrong than your usual bit rot), we also don't run any cluster
with this option at the moment. We had it enabled for some time before
we backported the OOM-read-error stuff on some clusters.
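
For readers unfamiliar with the threshold mentioned above: as documented,
auto repair does not occur if a scrub finds more errors than
osd_scrub_auto_repair_num_errors (default 5). A minimal Python sketch of
that gate follows. The function name and its arguments are hypothetical and
only model the documented behaviour; this is not Ceph's actual code.

```python
def should_auto_repair(auto_repair_enabled: bool,
                       num_errors: int,
                       max_errors: int = 5) -> bool:
    """Model of the documented gate (hypothetical helper, not Ceph code).

    auto_repair_enabled stands in for osd_scrub_auto_repair,
    max_errors for osd_scrub_auto_repair_num_errors.
    """
    if not auto_repair_enabled or num_errors == 0:
        # Nothing to do: feature disabled or the scrub found no errors.
        return False
    # Repair only when the error count does not exceed the threshold.
    return num_errors <= max_errors
```

With the defaults, a scrub that finds 1 error would be repaired
automatically, while one that finds 6 would be left for an operator.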

But there's a small operational issue with auto repair at the moment:
with this option enabled, a PG without any scrub errors will occasionally
get the repair flag set during scrubbing for some reason, which triggers a
health error.

We've had a quick look at the code and couldn't figure out how the
repair flag gets set in some cases on perfectly healthy PGs. Does it
maybe only get set for a very short time while finishing up the scrub
and that's not always picked up in time?
Anyway, a potential work-around for this might be to remove the
repair state from the conditions for the PG_DAMAGED warning?
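
To illustrate the suggested work-around, here is a minimal Python sketch of
the condition. It assumes, per the Ceph health-check documentation, that
PG_DAMAGED is raised for PGs whose state contains inconsistent,
snaptrim_error or repair. The function and the count_repair switch are
hypothetical; the real check lives in Ceph's C++ code and may differ.

```python
# States that contribute to PG_DAMAGED according to the health-check docs.
DAMAGED_STATES = {"inconsistent", "snaptrim_error", "repair"}


def is_damaged(pg_state: str, count_repair: bool = True) -> bool:
    """Return True if a PG state string would count towards PG_DAMAGED.

    pg_state is a '+'-joined state string such as
    'active+clean+scrubbing+deep+repair'. count_repair=False models the
    proposed work-around of ignoring the repair flag (hypothetical switch).
    """
    flags = set(pg_state.split("+"))
    relevant = DAMAGED_STATES if count_repair else DAMAGED_STATES - {"repair"}
    return bool(flags & relevant)
```

With count_repair=False, a healthy PG that briefly shows
active+clean+scrubbing+deep+repair would no longer count as damaged, while
a PG in active+clean+inconsistent still would.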

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Fri, 16 Nov 2018 at 08:49, Mark Schouten <mark at tuxis.nl> wrote:
>
>
> Which, as a user, is very surprising to me too..
> --
>
> Mark Schouten  | Tuxis Internet Engineering
> KvK: 61527076  | http://www.tuxis.nl/
> T: 0318 200208 | info at tuxis.nl
>
>
>
>
> ----- Original Message -----
>
>
> From: Wido den Hollander (wido at 42on.com)
> Date: 16-11-2018 08:25
> To: Mark Schouten (mark at tuxis.nl)
> Cc: Ceph Users (ceph-users at ceph.com)
> Subject: Re: [ceph-users] PG auto repair with BlueStore
>
>
> On 11/15/18 7:45 PM, Mark Schouten wrote:
> > As a user, I’m very surprised that this isn’t a default setting.
> >
>
> That is because you can also have FileStore OSDs in a cluster on which
> such an auto-repair is not safe.
>
> Wido
>
> > Mark Schouten
> >
> >> On 15 Nov 2018, at 18:40, Wido den Hollander <wido at 42on.com> wrote:
> >>
> >> Hi,
> >>
> >> This question is actually still outstanding. Is there any good reason to
> >> keep auto repair for scrub errors disabled with BlueStore?
> >>
> >> I couldn't think of a reason when using size=3 and min_size=2, so just
> >> wondering.
> >>
> >> Thanks!
> >>
> >> Wido
> >>
> >>> On 8/24/18 8:55 AM, Wido den Hollander wrote:
> >>> Hi,
> >>>
> >>> osd_scrub_auto_repair still defaults to false and I was wondering how we
> >>> think about enabling this feature by default.
> >>>
> >>> Would we say it's safe to enable this with BlueStore?
> >>>
> >>> Wido
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> ceph-users at lists.ceph.com
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>

