[ceph-users] PGs inconsistent, do I fear data loss?

David Turner drakonstein at gmail.com
Wed Nov 1 11:51:58 PDT 2017


In that thread, I really like how Wido puts it.  He sets aside any discussion
of code paths, bugs, etc...  In reference to size=3 min_size=1 he says,
"Losing two disks at the same time is something which doesn't happen that
much, but if it happens you don't want to modify any data on the only copy
which you still have left.  Setting min_size to 1 should be a manual action
imho when size = 3 and you lose two copies. In that case YOU decide at
that moment if it is the right course of action."

On Wed, Nov 1, 2017 at 2:40 PM Denes Dolhay <denke at denkesys.com> wrote:

> Thanks!
>
> On 11/01/2017 07:30 PM, Gregory Farnum wrote:
>
> On Wed, Nov 1, 2017 at 11:27 AM Denes Dolhay <denke at denkesys.com> wrote:
>
>> Hello,
>> I have a trick question for Mr. Turner's scenario:
>> Let's assume size=2, min_size=1
>> -We are looking at pg "A" acting [1, 2]
>> -osd 1 goes down, OK
>> -osd 1 comes back up, backfill of pg "A" commences from osd 2 to osd 1, OK
>> -osd 2 goes down (and therefore pg "A" 's backfill to osd 1 is incomplete
>> and stopped) not OK, but this is the case...
>> --> In this event, why does osd 1 accept IO to pg "A" knowing full well
>> that its data is outdated and will cause an inconsistent state?
>> Wouldn't it be prudent to deny IO to pg "A" until either
>> -osd 2 comes back (so that we have a clean osd in the acting group) ...
>> backfill would continue to osd 1 of course
>> -or the data in pg "A" is manually marked as lost, and operation then
>> continues from osd 1's (outdated) copy?
>>
>
> It does deny IO in that case. I think David was pointing out that if OSD 2
> is actually dead and gone, you've got data loss despite having only lost
> one OSD.
> -Greg
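Greg's answer can be illustrated with a toy Python model (a sketch of the peering rule as described in this thread, not actual Ceph code; the class and method names are invented): a PG only serves I/O while some up OSD holds a complete, authoritative copy, and an OSD that is still being backfilled does not qualify.

```python
# Toy model of Denes's sequence. A PG stays active only if an up OSD
# holds a complete copy; a partially backfilled OSD does not count.

class PG:
    def __init__(self, osds):
        self.up = {o: True for o in osds}
        self.complete = {o: True for o in osds}  # which OSDs hold full data

    def fail(self, osd):
        self.up[osd] = False

    def come_back(self, osd):
        self.up[osd] = True
        self.complete[osd] = False  # must backfill before it is authoritative

    def finish_backfill(self, osd):
        self.complete[osd] = True

    def active(self):
        # I/O is served only if some up OSD has a complete copy
        return any(self.up[o] and self.complete[o] for o in self.up)

pg = PG([1, 2])
pg.fail(1)               # osd 1 down; osd 2 still complete -> I/O continues
assert pg.active()
pg.come_back(1)          # osd 1 returns, backfilling from osd 2
pg.fail(2)               # osd 2 dies mid-backfill
assert not pg.active()   # only an outdated copy is up: I/O is denied
```

If osd 2 is truly gone, the only ways forward are the ones Denes lists: bring osd 2 back, or manually accept the stale copy on osd 1 as the data.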
>
>
>>
>> Thanks in advance, I'm really curious!
>>
>> Denes.
>>
>>
>>
>> On 11/01/2017 06:33 PM, Mario Giammarco wrote:
>>
>> I read your post, then read the thread you suggested; very
>> interesting.
>> Then I read your post again and understood it better.
>> The most important thing is that even with min_size=1, writes are
>> acknowledged only after Ceph has written all size=2 copies.
>> In the thread above there is:
>>
>> As David already said, when all OSDs are up and in for a PG Ceph will wait for ALL OSDs to Ack the write. Writes in RADOS are always synchronous.
>>
>> Only when OSDs go down you need at least min_size OSDs up before writes or reads are accepted.
>>
>> So if min_size = 2 and size = 3 you need at least 2 OSDs online for I/O to take place.
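The quoted rule can be condensed into two hypothetical helper functions (a sketch of the described behavior, not Ceph source):

```python
# Sketch of the replicated-pool I/O rule quoted above: I/O is accepted
# only while at least min_size replicas are up, and every write is
# synchronously acked by every replica that is currently up.

def io_allowed(size: int, min_size: int, up_osds: int) -> bool:
    """Can the PG accept reads/writes with `up_osds` of `size` replicas up?"""
    return up_osds >= min_size

def acks_required(up_osds: int) -> int:
    """Writes are synchronous: every currently-up replica must ack."""
    return up_osds

# size=3, min_size=2: I/O needs at least 2 OSDs online
assert io_allowed(3, 2, 3) and io_allowed(3, 2, 2)
assert not io_allowed(3, 2, 1)
# with all 3 up, a write is acked only after all 3 replicas have it
assert acks_required(3) == 3
```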
>>
>>
>> You then showed me a sequence of events that may happen in some use cases.
>> My use case is quite different. We use Ceph under
>> Proxmox. The servers have their disks on RAID 5 (I agree that it would be
>> better to expose single disks to Ceph, but it is too late for that now).
>> So it is unlikely that a Ceph disk fails, thanks to the RAID. If a disk
>> fails, it is probably because the entire server has failed (and we need to
>> provide business availability in this case), so it will never come up
>> again; in my situation your sequence of events will never happen.
>> What shocked me is that I did not expect to see so many inconsistencies.
>> Thanks,
>> Mario
>>
>>
>> 2017-11-01 16:45 GMT+01:00 David Turner <drakonstein at gmail.com>:
>>
>>> It looks like you're running with size = 2 and min_size = 1 (the
>>> min_size is a guess; the size is based on how many osds belong to your
>>> problem PGs).  Here's some good reading for you.
>>> https://www.spinics.net/lists/ceph-users/msg32895.html
>>>
>>> Basically the gist is that when running with size = 2 you should assume
>>> that data loss is an eventuality and decide that it is ok for your use
>>> case.  This can be mitigated by using min_size = 2, but then your pool will
>>> block while an OSD is down and you'll have to manually go in and change the
>>> min_size temporarily to perform maintenance.
>>>
>>> All it takes for data loss is that an osd on server 1 is marked down and
>>> a write happens to an osd on server 2.  Now the osd on server 2 goes down
>>> before the osd on server 1 has finished backfilling and the first osd
>>> receives a request to modify data in the object that it doesn't know the
>>> current state of.  Tada, you have data loss.
>>>
>>> How likely is this to happen... eventually it will.  PG subfolder
>>> splitting (if you're using filestore) will occasionally take long enough to
>>> perform the task that the osd is marked down while it's still running, and
>>> this usually happens for some time all over the cluster when it does.
>>> Another option is something that causes segfaults in the osds; another is
>>> restarting a node before all pgs are done backfilling/recovering; OOM
>>> killer; power outages; etc; etc.
>>>
>>> Why does min_size = 2 prevent this?  Because for a write to be
>>> acknowledged by the cluster, it has to be written to every OSD that is up
>>> as long as there are at least min_size available.  This means that every
>>> write is acknowledged by at least 2 osds every time.  If you're running
>>> with size = 2, then both copies of the data need to be online for a write
>>> to happen, and thus neither copy can ever hold a write that the other does
>>> not.  If you're running with size = 3, then you always have a majority of
>>> the OSDs online receiving each write, and they can agree on the correct
>>> data to give to the third when it comes back up.
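David's data-loss timeline can be sketched with a toy replay in Python (hypothetical model code, not Ceph internals; it deliberately omits backfill, which is exactly the gap that causes the loss):

```python
# Replay a timeline over 2 replicas with a given min_size. A write is
# acked to the client only if at least min_size replicas are up; a
# returning OSD is NOT backfilled in this model, mirroring the scenario
# where the peer dies before backfill completes.

def replay(min_size, events):
    up = {1: True, 2: True}
    data = {1: set(), 2: set()}
    acked = set()
    for ev, arg in events:
        if ev == "down":
            up[arg] = False
        elif ev == "up":
            up[arg] = True               # simplification: no backfill happens
        elif ev == "write":
            live = [o for o in up if up[o]]
            if len(live) >= min_size:    # the min_size I/O gate
                for o in live:
                    data[o].add(arg)
                acked.add(arg)           # client is told the write succeeded
    return acked, set().union(*(data[o] for o in up if up[o]))

events = [("down", 1), ("write", "W"), ("down", 2), ("up", 1)]
acked, readable = replay(1, events)
assert "W" in acked and "W" not in readable  # acked write silently lost
acked2, _ = replay(2, events)
assert "W" not in acked2                     # write was blocked: nothing lost
```

With min_size = 1 the lone surviving replica acks "W" and then dies, so an acknowledged write is gone; with min_size = 2 the same write is refused up front, which blocks the client but never loses data.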
>>>
>>> On Wed, Nov 1, 2017 at 3:31 AM Mario Giammarco <mgiammarco at gmail.com>
>>> wrote:
>>>
>>>> Sure here it is ceph -s:
>>>>
>>>> cluster:
>>>>    id:     8bc45d9a-ef50-4038-8e1b-1f25ac46c945
>>>>    health: HEALTH_ERR
>>>>            100 scrub errors
>>>>            Possible data damage: 56 pgs inconsistent
>>>>
>>>>  services:
>>>>    mon: 3 daemons, quorum 0,1,pve3
>>>>    mgr: pve3(active)
>>>>    osd: 3 osds: 3 up, 3 in
>>>>
>>>>  data:
>>>>    pools:   1 pools, 256 pgs
>>>>    objects: 269k objects, 1007 GB
>>>>    usage:   2050 GB used, 1386 GB / 3436 GB avail
>>>>    pgs:     200 active+clean
>>>>             56  active+clean+inconsistent
>>>>
>>>> ---
>>>>
>>>> ceph health detail :
>>>>
>>>> PG_DAMAGED Possible data damage: 56 pgs inconsistent
>>>>    pg 2.6 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.19 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.1e is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.1f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.24 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.25 is active+clean+inconsistent, acting [2,0]
>>>>    pg 2.36 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.3d is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.4b is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.4c is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.4d is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.4f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.50 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.52 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.56 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.5b is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.5c is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.5d is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.5f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.71 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.75 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.77 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.79 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.7e is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.83 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.8a is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.92 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.98 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.9a is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.9e is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.9f is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.c6 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.c7 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.c8 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.cb is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.cd is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.ce is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.d2 is active+clean+inconsistent, acting [2,1]
>>>>    pg 2.da is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.de is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.e1 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.e4 is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.e6 is active+clean+inconsistent, acting [0,2]
>>>>    pg 2.e8 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.ee is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.f9 is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.fa is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.fb is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.fc is active+clean+inconsistent, acting [1,2]
>>>>    pg 2.fe is active+clean+inconsistent, acting [1,0]
>>>>    pg 2.ff is active+clean+inconsistent, acting [1,0]
>>>>
>>>>
>>>> and ceph pg 2.6 query:
>>>>
>>>> {
>>>>    "state": "active+clean+inconsistent",
>>>>    "snap_trimq": "[]",
>>>>    "epoch": 1513,
>>>>    "up": [
>>>>        1,
>>>>        0
>>>>    ],
>>>>    "acting": [
>>>>        1,
>>>>        0
>>>>    ],
>>>>    "actingbackfill": [
>>>>        "0",
>>>>        "1"
>>>>    ],
>>>>    "info": {
>>>>        "pgid": "2.6",
>>>>        "last_update": "1513'89145",
>>>>        "last_complete": "1513'89145",
>>>>        "log_tail": "1503'87586",
>>>>        "last_user_version": 330583,
>>>>        "last_backfill": "MAX",
>>>>        "last_backfill_bitwise": 0,
>>>>        "purged_snaps": [
>>>>            {
>>>>                "start": "1",
>>>>                "length": "178"
>>>>            },
>>>>            {
>>>>                "start": "17a",
>>>>                "length": "3d"
>>>>            },
>>>>            {
>>>>                "start": "1b8",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "1ba",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "1bc",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "1be",
>>>>                "length": "44"
>>>>            },
>>>>            {
>>>>                "start": "205",
>>>>                "length": "12c"
>>>>            },
>>>>            {
>>>>                "start": "332",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "334",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "336",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "338",
>>>>                "length": "1"
>>>>            },
>>>>            {
>>>>                "start": "33a",
>>>>                "length": "1"
>>>>            }
>>>>        ],
>>>>        "history": {
>>>>            "epoch_created": 90,
>>>>            "epoch_pool_created": 90,
>>>>            "last_epoch_started": 1339,
>>>>            "last_interval_started": 1338,
>>>>            "last_epoch_clean": 1339,
>>>>            "last_interval_clean": 1338,
>>>>            "last_epoch_split": 0,
>>>>            "last_epoch_marked_full": 0,
>>>>            "same_up_since": 1338,
>>>>            "same_interval_since": 1338,
>>>>            "same_primary_since": 1338,
>>>>            "last_scrub": "1513'89112",
>>>>            "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_deep_scrub": "1513'89112",
>>>>            "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840"
>>>>        },
>>>>        "stats": {
>>>>            "version": "1513'89145",
>>>>            "reported_seq": "422820",
>>>>            "reported_epoch": "1513",
>>>>            "state": "active+clean+inconsistent",
>>>>            "last_fresh": "2017-11-01 08:11:38.411784",
>>>>            "last_change": "2017-11-01 05:52:21.259789",
>>>>            "last_active": "2017-11-01 08:11:38.411784",
>>>>            "last_peered": "2017-11-01 08:11:38.411784",
>>>>            "last_clean": "2017-11-01 08:11:38.411784",
>>>>            "last_became_active": "2017-10-15 20:36:33.644567",
>>>>            "last_became_peered": "2017-10-15 20:36:33.644567",
>>>>            "last_unstale": "2017-11-01 08:11:38.411784",
>>>>            "last_undegraded": "2017-11-01 08:11:38.411784",
>>>>            "last_fullsized": "2017-11-01 08:11:38.411784",
>>>>            "mapping_epoch": 1338,
>>>>            "log_start": "1503'87586",
>>>>            "ondisk_log_start": "1503'87586",
>>>>            "created": 90,
>>>>            "last_epoch_clean": 1339,
>>>>            "parent": "0.0",
>>>>            "parent_split_bits": 0,
>>>>            "last_scrub": "1513'89112",
>>>>            "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_deep_scrub": "1513'89112",
>>>>            "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>            "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840",
>>>>            "log_size": 1559,
>>>>            "ondisk_log_size": 1559,
>>>>            "stats_invalid": false,
>>>>            "dirty_stats_invalid": false,
>>>>            "omap_stats_invalid": false,
>>>>            "hitset_stats_invalid": false,
>>>>            "hitset_bytes_stats_invalid": false,
>>>>            "pin_stats_invalid": false,
>>>>            "stat_sum": {
>>>>                "num_bytes": 3747886080,
>>>>                "num_objects": 958,
>>>>                "num_object_clones": 295,
>>>>                "num_object_copies": 1916,
>>>>                "num_objects_missing_on_primary": 0,
>>>>                "num_objects_missing": 0,
>>>>                "num_objects_degraded": 0,
>>>>                "num_objects_misplaced": 0,
>>>>                "num_objects_unfound": 0,
>>>>                "num_objects_dirty": 958,
>>>>                "num_whiteouts": 0,
>>>>                "num_read": 333428,
>>>>                "num_read_kb": 135550185,
>>>>                "num_write": 79221,
>>>>                "num_write_kb": 13441239,
>>>>                "num_scrub_errors": 1,
>>>>                "num_shallow_scrub_errors": 0,
>>>>                "num_deep_scrub_errors": 1,
>>>>                "num_objects_recovered": 245,
>>>>                "num_bytes_recovered": 1012833792,
>>>>                "num_keys_recovered": 6,
>>>>                "num_objects_omap": 0,
>>>>                "num_objects_hit_set_archive": 0,
>>>>                "num_bytes_hit_set_archive": 0,
>>>>                "num_flush": 0,
>>>>                "num_flush_kb": 0,
>>>>                "num_evict": 0,
>>>>                "num_evict_kb": 0,
>>>>                "num_promote": 0,
>>>>                "num_flush_mode_high": 0,
>>>>                "num_flush_mode_low": 0,
>>>>                "num_evict_mode_some": 0,
>>>>                "num_evict_mode_full": 0,
>>>>                "num_objects_pinned": 0,
>>>>                "num_legacy_snapsets": 0
>>>>            },
>>>>            "up": [
>>>>                1,
>>>>                0
>>>>            ],
>>>>            "acting": [
>>>>                1,
>>>>                0
>>>>            ],
>>>>            "blocked_by": [],
>>>>            "up_primary": 1,
>>>>            "acting_primary": 1
>>>>        },
>>>>        "empty": 0,
>>>>        "dne": 0,
>>>>        "incomplete": 0,
>>>>        "last_epoch_started": 1339,
>>>>        "hit_set_history": {
>>>>            "current_last_update": "0'0",
>>>>            "history": []
>>>>        }
>>>>    },
>>>>    "peer_info": [
>>>>        {
>>>>            "peer": "0",
>>>>            "pgid": "2.6",
>>>>            "last_update": "1513'89145",
>>>>            "last_complete": "1513'89145",
>>>>            "log_tail": "1274'68440",
>>>>            "last_user_version": 315687,
>>>>            "last_backfill": "MAX",
>>>>            "last_backfill_bitwise": 0,
>>>>            "purged_snaps": [
>>>>                {
>>>>                    "start": "1",
>>>>                    "length": "178"
>>>>                },
>>>>                {
>>>>                    "start": "17a",
>>>>                    "length": "3d"
>>>>                },
>>>>                {
>>>>                    "start": "1b8",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "1ba",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "1bc",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "1be",
>>>>                    "length": "44"
>>>>                },
>>>>                {
>>>>                    "start": "205",
>>>>                    "length": "82"
>>>>                },
>>>>                {
>>>>                    "start": "288",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "28a",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "28c",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "28e",
>>>>                    "length": "1"
>>>>                },
>>>>                {
>>>>                    "start": "290",
>>>>                    "length": "1"
>>>>                }
>>>>            ],
>>>>            "history": {
>>>>                "epoch_created": 90,
>>>>                "epoch_pool_created": 90,
>>>>                "last_epoch_started": 1339,
>>>>                "last_interval_started": 1338,
>>>>                "last_epoch_clean": 1339,
>>>>                "last_interval_clean": 1338,
>>>>                "last_epoch_split": 0,
>>>>                "last_epoch_marked_full": 0,
>>>>                "same_up_since": 1338,
>>>>                "same_interval_since": 1338,
>>>>                "same_primary_since": 1338,
>>>>                "last_scrub": "1513'89112",
>>>>                "last_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>                "last_deep_scrub": "1513'89112",
>>>>                "last_deep_scrub_stamp": "2017-11-01 05:52:21.259654",
>>>>                "last_clean_scrub_stamp": "2017-10-25 04:25:09.830840"
>>>>            },
>>>>            "stats": {
>>>>                "version": "1337'71465",
>>>>                "reported_seq": "347015",
>>>>                "reported_epoch": "1338",
>>>>                "state": "active+undersized+degraded",
>>>>                "last_fresh": "2017-10-15 20:35:36.930611",
>>>>                "last_change": "2017-10-15 20:30:35.752042",
>>>>                "last_active": "2017-10-15 20:35:36.930611",
>>>>                "last_peered": "2017-10-15 20:35:36.930611",
>>>>                "last_clean": "2017-10-15 20:30:01.443288",
>>>>                "last_became_active": "2017-10-15 20:30:35.752042",
>>>>                "last_became_peered": "2017-10-15 20:30:35.752042",
>>>>                "last_unstale": "2017-10-15 20:35:36.930611",
>>>>                "last_undegraded": "2017-10-15 20:30:35.749043",
>>>>                "last_fullsized": "2017-10-15 20:30:35.749043",
>>>>                "mapping_epoch": 1338,
>>>>                "log_start": "1274'68440",
>>>>                "ondisk_log_start": "1274'68440",
>>>>                "created": 90,
>>>>                "last_epoch_clean": 1331,
>>>>                "parent": "0.0",
>>>>                "parent_split_bits": 0,
>>>>                "last_scrub": "1294'71370",
>>>>                "last_scrub_stamp": "2017-10-15 09:27:31.756027",
>>>>                "last_deep_scrub": "1284'70813",
>>>>                "last_deep_scrub_stamp": "2017-10-14 06:35:57.556773",
>>>>                "last_clean_scrub_stamp": "2017-10-15 09:27:31.756027",
>>>>                "log_size": 3025,
>>>>                "ondisk_log_size": 3025,
>>>>                "stats_invalid": false,
>>>>                "dirty_stats_invalid": false,
>>>>                "omap_stats_invalid": false,
>>>>                "hitset_stats_invalid": false,
>>>>                "hitset_bytes_stats_invalid": false,
>>>>                "pin_stats_invalid": false,
>>>>                "stat_sum": {
>>>>                    "num_bytes": 3555027456,
>>>>                    "num_objects": 917,
>>>>                    "num_object_clones": 255,
>>>>                    "num_object_copies": 1834,
>>>>                    "num_objects_missing_on_primary": 0,
>>>>                    "num_objects_missing": 0,
>>>>                    "num_objects_degraded": 917,
>>>>                    "num_objects_misplaced": 0,
>>>>                    "num_objects_unfound": 0,
>>>>                    "num_objects_dirty": 917,
>>>>                    "num_whiteouts": 0,
>>>>                    "num_read": 275095,
>>>>                    "num_read_kb": 111713846,
>>>>                    "num_write": 64324,
>>>>                    "num_write_kb": 11365374,
>>>>                    "num_scrub_errors": 0,
>>>>                    "num_shallow_scrub_errors": 0,
>>>>                    "num_deep_scrub_errors": 0,
>>>>                    "num_objects_recovered": 243,
>>>>                    "num_bytes_recovered": 1008594432,
>>>>                    "num_keys_recovered": 6,
>>>>                    "num_objects_omap": 0,
>>>>                    "num_objects_hit_set_archive": 0,
>>>>                    "num_bytes_hit_set_archive": 0,
>>>>                    "num_flush": 0,
>>>>                    "num_flush_kb": 0,
>>>>                    "num_evict": 0,
>>>>                    "num_evict_kb": 0,
>>>>                    "num_promote": 0,
>>>>                    "num_flush_mode_high": 0,
>>>>                    "num_flush_mode_low": 0,
>>>>                    "num_evict_mode_some": 0,
>>>>                    "num_evict_mode_full": 0,
>>>>                    "num_objects_pinned": 0,
>>>>                    "num_legacy_snapsets": 0
>>>>                },
>>>>                "up": [
>>>>                    1,
>>>>                    0
>>>>                ],
>>>>                "acting": [
>>>>                    1,
>>>>                    0
>>>>                ],
>>>>                "blocked_by": [],
>>>>                "up_primary": 1,
>>>>                "acting_primary": 1
>>>>            },
>>>>            "empty": 0,
>>>>            "dne": 0,
>>>>            "incomplete": 0,
>>>>            "last_epoch_started": 1339,
>>>>            "hit_set_history": {
>>>>                "current_last_update": "0'0",
>>>>                "history": []
>>>>            }
>>>>        }
>>>>    ],
>>>>    "recovery_state": [
>>>>        {
>>>>            "name": "Started/Primary/Active",
>>>>            "enter_time": "2017-10-15 20:36:33.574915",
>>>>            "might_have_unfound": [
>>>>                {
>>>>                    "osd": "0",
>>>>                    "status": "already probed"
>>>>                }
>>>>            ],
>>>>            "recovery_progress": {
>>>>                "backfill_targets": [],
>>>>                "waiting_on_backfill": [],
>>>>                "last_backfill_started": "MIN",
>>>>                "backfill_info": {
>>>>                    "begin": "MIN",
>>>>                    "end": "MIN",
>>>>                    "objects": []
>>>>                },
>>>>                "peer_backfill_info": [],
>>>>                "backfills_in_flight": [],
>>>>                "recovering": [],
>>>>                "pg_backend": {
>>>>                    "pull_from_peer": [],
>>>>                    "pushing": []
>>>>                }
>>>>            },
>>>>            "scrub": {
>>>>                "scrubber.epoch_start": "1338",
>>>>                "scrubber.active": false,
>>>>                "scrubber.state": "INACTIVE",
>>>>                "scrubber.start": "MIN",
>>>>                "scrubber.end": "MIN",
>>>>                "scrubber.subset_last_update": "0'0",
>>>>                "scrubber.deep": false,
>>>>                "scrubber.seed": 0,
>>>>                "scrubber.waiting_on": 0,
>>>>                "scrubber.waiting_on_whom": []
>>>>            }
>>>>        },
>>>>        {
>>>>            "name": "Started",
>>>>            "enter_time": "2017-10-15 20:36:32.592892"
>>>>        }
>>>>    ],
>>>>    "agent_state": {}
>>>> }
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2017-10-30 23:30 GMT+01:00 Gregory Farnum <gfarnum at redhat.com>:
>>>>
>>>>> You'll need to tell us exactly what error messages you're seeing, what
>>>>> the output of ceph -s is, and the output of pg query for the relevant PGs.
>>>>> There's not a lot of documentation because much of this tooling is
>>>>> new, it's changing quickly, and most people don't have the kinds of
>>>>> problems that turn out to be unrepairable. We should do better about that,
>>>>> though.
>>>>> -Greg
>>>>>
>>>>> On Mon, Oct 30, 2017, 11:40 AM Mario Giammarco <mgiammarco at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>  >[Questions to the list]
>>>>>>  >How is it possible that the cluster cannot repair itself with ceph
>>>>>> pg
>>>>>> repair?
>>>>>>  >No good copies are remaining?
>>>>>>  >Cannot decide which copy is valid or up-to date?
>>>>>>  >If so, why not, when there is checksum, mtime for everything?
>>>>>>  >In this inconsistent state which object does the cluster serve when
>>>>>> it
>>>>>> doesn't know which one is the valid?
>>>>>>
>>>>>>
>>>>>> I am asking the same questions too, it seems strange to me that in a
>>>>>> fault tolerant clustered file storage like Ceph there is no
>>>>>> documentation about this.
>>>>>>
>>>>>> I know that I am pedantic but please note that saying "to be sure use
>>>>>> three copies" is not enough because I am not sure what Ceph really
>>>>>> does
>>>>>> when three copies are not matching.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> ceph-users at lists.ceph.com
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>>
>>
>
>