[ceph-users] How to repair active+clean+inconsistent?

Brad Hubbard bhubbard at redhat.com
Sun Nov 11 22:58:33 PST 2018


On Mon, Nov 12, 2018 at 4:21 PM Ashley Merrick <singapore at amerrick.co.uk> wrote:
>
> You'll need to run "ceph pg deep-scrub 1.65" first

Right, thanks Ashley. That's what the "Note that you may have to do a
deep scrub to populate the output" part of my answer was getting at,
but perhaps I should have spelled it out.

The cluster has a record of a scrub error from a previous scan, but
subsequent activity has invalidated the specifics. You need to run
another deep scrub to regenerate the detailed report for this pg as it
stands right now (the report does not remain valid indefinitely and
may need to be refreshed depending on circumstances). There is also a
note on the repair step itself at the bottom of this mail.
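
Something like this should do it (a rough sketch only; substitute your
own pg id and poll however you prefer):

  # kick off a fresh deep scrub of the pg
  ceph pg deep-scrub 1.65

  # wait for it to finish; the deep scrub timestamp in the pg stats
  # should advance once it has
  ceph pg 1.65 query | grep last_deep_scrub_stamp

  # with a fresh deep scrub recorded, the detailed reports should be
  # available again
  rados list-inconsistent-obj 1.65 --format=json-pretty
  rados list-inconsistent-snapset 1.65 --format=json-pretty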

>
> On Mon, Nov 12, 2018 at 2:20 PM K.C. Wong <kcwong at verseon.com> wrote:
>>
>> Hi Brad,
>>
>> I got the following:
>>
>> [root at mgmt01 ~]# ceph health detail
>> HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>> pg 1.65 is active+clean+inconsistent, acting [62,67,47]
>> 1 scrub errors
>> [root at mgmt01 ~]# rados list-inconsistent-obj 1.65
>> No scrub information available for pg 1.65
>> error 2: (2) No such file or directory
>> [root at mgmt01 ~]# rados list-inconsistent-snapset 1.65
>> No scrub information available for pg 1.65
>> error 2: (2) No such file or directory
>>
>> Rather odd output, I’d say; not that I understand what
>> it means. I also tried rados list-inconsistent-pg:
>>
>> [root at mgmt01 ~]# rados lspools
>> rbd
>> cephfs_data
>> cephfs_metadata
>> .rgw.root
>> default.rgw.control
>> default.rgw.data.root
>> default.rgw.gc
>> default.rgw.log
>> ctrl-p
>> prod
>> corp
>> camp
>> dev
>> default.rgw.users.uid
>> default.rgw.users.keys
>> default.rgw.buckets.index
>> default.rgw.buckets.data
>> default.rgw.buckets.non-ec
>> [root at mgmt01 ~]# for i in $(rados lspools); do rados list-inconsistent-pg $i; done
>> []
>> ["1.65"]
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>>
>> So, that’d put the inconsistency in the cephfs_data pool.
>>
>> Thank you for your help,
>>
>> -kc
>>
>> K.C. Wong
>> kcwong at verseon.com
>> M: +1 (408) 769-8235
>>
>>
>> On Nov 11, 2018, at 5:43 PM, Brad Hubbard <bhubbard at redhat.com> wrote:
>>
>> What does "rados list-inconsistent-obj <pg>" say?
>>
>> Note that you may have to do a deep scrub to populate the output.
>> On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong <kcwong at verseon.com> wrote:
>>
>>
>> Hi folks,
>>
>> I would appreciate any pointers on how to resolve a PG stuck
>> in the “active+clean+inconsistent” state. This has left the
>> cluster in HEALTH_ERR for the last 5 days with no end in
>> sight. The state was triggered when one of the drives backing
>> the PG returned an I/O error. I’ve since replaced the failed
>> drive.
>>
>> I’m running Jewel (out of centos-release-ceph-jewel) on
>> CentOS 7. I’ve tried “ceph pg repair <pg>” and it didn’t seem
>> to do anything. I’ve also tried more drastic measures, such as
>> comparing all the files (we’re on filestore) under that PG’s
>> _head directory on all 3 copies and then nuking the outlier.
>> Nothing worked.
>>
>> Many thanks,
>>
>> -kc
>>
>> K.C. Wong
>> kcwong at verseon.com
>> M: +1 (408) 769-8235
>>
>>
>>
>>
>>
>>
>> --
>> Cheers,
>> Brad
>>
>>
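
A couple of other notes. The pool is encoded in the pg id, so pg 1.65
lives in pool 1 and "ceph osd lspools" will map that id to a name; the
loop over every pool gets you there too, but it isn't necessary.

On the repair itself: once the fresh deep scrub has populated the
report, the usual sequence is roughly the one below (a sketch only;
look at what the report says first, because on Jewel a repair can
prefer the primary's copy, and you don't want it "repairing" from a
bad primary).

  # inspect the report and work out which copy is the bad one
  rados list-inconsistent-obj 1.65 --format=json-pretty

  # then ask the pg to repair itself and watch the cluster log for
  # the result
  ceph pg repair 1.65
  ceph -w | grep 1.65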



-- 
Cheers,
Brad

