[ceph-users] PGs inconsistent, do I fear data loss?

Denes Dolhay denke at denkesys.com
Thu Nov 2 16:05:18 PDT 2017


Hi Greg,

Accepting that an osd with outdated data can never accept writes, 
or IO of any kind, how is it possible for the system to end up in 
this state?

-All osds are Bluestore, with checksums, mtime etc.

-All osds are up and in

-No hw failures, lost disks, damaged journals or databases etc.

-The data became inconsistent


Thanks,

Denke.


On 11/02/2017 11:51 PM, Gregory Farnum wrote:
>
> On Thu, Nov 2, 2017 at 1:21 AM koukou73gr <koukou73gr at yahoo.com> wrote:
>
>     The scenario is actually a bit different, see:
>
>     Let's assume size=2, min_size=1
>     -We are looking at pg "A" acting [1, 2]
>     -osd 1 goes down
>     -osd 2 accepts a write for pg "A"
>     -osd 2 goes down
>     -osd 1 comes back up, while osd 2 still down
>     -osd 1 has no way to know osd 2 accepted a write in pg "A"
>     -osd 1 accepts a new write to pg "A"
>     -osd 2 comes back up.
>
>     bang! osd 1 and 2 now have different views of pg "A" but both claim to
>     have current data.
>
>
> In this case, OSD 1 will not accept IO precisely because it cannot 
> prove it has the current data. That is the basic purpose of OSD 
> peering and holds in all cases.
> -Greg
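
To make the point above concrete, here is a minimal Python sketch of the check Greg describes. It is a toy model, not Ceph's actual peering code: the names Interval, maybe_went_rw and can_go_active are hypothetical, and the only assumption is that a PG records which OSDs were up in each past interval and refuses to go active unless it can reach at least one OSD from every interval that might have accepted writes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One past interval of a PG: the acting-set OSDs that were up (toy model)."""
    up_osds: frozenset

    def maybe_went_rw(self, min_size: int) -> bool:
        # Writes could only have been accepted if enough replicas were up.
        return len(self.up_osds) >= min_size


def can_go_active(past_intervals, currently_up, min_size):
    """Return True if the OSDs that are up now can prove they hold current data.

    For every past interval that may have accepted writes, at least one OSD
    from that interval must be reachable now. Otherwise a write could exist
    that no currently-up OSD knows about, so IO has to be refused.
    """
    if len(currently_up) < min_size:
        return False
    for interval in past_intervals:
        if interval.maybe_went_rw(min_size) and not (interval.up_osds & currently_up):
            return False
    return True


# koukou73gr's scenario: size=2, min_size=1, pg "A" acting [1, 2]
history = [
    Interval(frozenset({1, 2})),  # both osds up
    Interval(frozenset({2})),     # osd 1 down, osd 2 may have accepted a write
    Interval(frozenset()),        # osd 2 down as well, nothing up
]

osd1_alone = can_go_active(history, frozenset({1}), min_size=1)
both_back = can_go_active(history, frozenset({1, 2}), min_size=1)
print(osd1_alone, both_back)
```

In this model osd 1 coming back alone is not enough to serve IO, because the interval in which only osd 2 was up might contain a write that osd 1 never saw. Once osd 2 returns, every such interval has a reachable member and the PG can go active.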
>
>
>
>     -K.
>
>     On 2017-11-01 20:27, Denes Dolhay wrote:
>     > Hello,
>     >
>     > I have a trick question for Mr. Turner's scenario:
>     > Let's assume size=2, min_size=1
>     > -We are looking at pg "A" acting [1, 2]
>     > -osd 1 goes down, OK
>     > -osd 1 comes back up, backfill of pg "A" commences from osd 2 to osd 1, OK
>     > -osd 2 goes down (and therefore pg "A"'s backfill to osd 1 is
>     > incomplete and stopped) not OK, but this is the case...
>     > --> In this event, why does osd 1 accept IO to pg "A" knowing full well
>     > that its data is outdated and will cause an inconsistent state?
>     > Wouldn't it be prudent to deny IO to pg "A" until either
>     > -osd 2 comes back (therefore we have a clean osd in the acting group)
>     > ... backfill would continue to osd 1 of course
>     > -or data in pg "A" is manually marked as lost, and then continues
>     > operation from osd 1's (outdated) copy?
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users at lists.ceph.com
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
