[ceph-users] fixing another remapped+incomplete EC 4+2 pg

Graham Allan gta at umn.edu
Tue Oct 9 16:55:19 PDT 2018



On 10/09/2018 01:14 PM, Graham Allan wrote:
> 
> 
> On 10/9/2018 12:19 PM, Gregory Farnum wrote:
>>
>> I think unfortunately the easiest thing for you to fix this will be to 
>> set the min_size back to 4 until the PG is recovered (or at least has 
>> 5 shards done). This will be fixed in a later version of Ceph and 
>> probably backported, but sadly it's not done yet.
>> -Greg

> Thanks Greg, though sadly I've tried that; whatever I do, one of the 4
> osds involved will simply crash (not just the ones I previously tried to
> re-import via ceph-objectstore-tool). I just spent time chasing them
> around, never succeeding in getting a complete set to run long enough to
> make progress. They seem to crash when starting backfill on the next
> object. There has to be something in the current set of shards which
> the OSDs can't handle.
> 
> Since then I've been focusing on trying to get the pg to revert to an 
> earlier interval using osd_find_best_info_ignore_history_les, though the 
> information I find around it is minimal.
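
For the record, the min_size change Greg suggested is applied per pool,
and I had done it roughly as below (placeholder for the pool name):

    ceph osd pool set <ec-pool> min_size 4

Setting it is not the problem; keeping all the acting OSDs alive long
enough afterwards is.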

Since my experiments with osd_find_best_info_ignore_history_les have not
produced any results, I'm looking again at the OSD crashes that occur
whenever I get enough of the OSDs running for backfill to start.
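
(For anyone searching later: as far as I can tell,
osd_find_best_info_ignore_history_les is only consulted during peering,
so I had been setting it for the acting primary and restarting that OSD,
along these lines:

    # in ceph.conf on the primary's host, [osd] section:
    #   osd_find_best_info_ignore_history_les = true
    systemctl restart ceph-osd@190

The documentation around it is minimal, hence the experimentation.)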

They all crash in the same way; with "debug osd=10", the very last bit is:

>     -2> 2018-10-09 16:27:25.425004 7faa866bd700 10 osd.190 pg_epoch: 710808 pg[70.82ds2( v 710799'704745 (586066'698574,710799'704745] local-lis/les=710807/710808 n=102929 ec=21494/21494 lis/c 710807/588565 les/c/f 710808/588566/0 710711/710807/710807) [820,761,105,789,562,485]/[2147483647,2147483647,190,448,61,315]p190(2) r=2 lpr=710807 pi=[588565,710807)/39 rops=1 bft=105(2),485(5),562(4),761(1),789(3),820(0) crt=710799'704745 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+backfilling] continue_recovery_op: continuing RecoveryOp(hoid=70:b415ca12:::default.10341087.1__shadow_Subjects%2fsub-NDARINV6BVVAY29%2fses-baselineYear1Arm1%2ffunc%2frun-04_AROMA%2fdenoised_func_data_nonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16_3:head v=579167'695462 missing_on=105(2),485(5),562(4),761(1),789(3),820(0) missing_on_shards=0,1,2,3,4,5 recovery_info=ObjectRecoveryInfo(70:b415ca12:::default.10341087.1__shadow_Subjects%2fsub-NDARINV6BVVAY29%2fses-baselineYear1Arm1%2ffunc%2frun-04_AROMA%2fdenoised_func_data_nonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16_3:head@579167'695462, size: 0, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]) recovery_progress=ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:true, error:false) obc refcount=3 state=READING waiting_on_pushes= extent_requested=0,8388608)
>     -1> 2018-10-09 16:27:25.425105 7faa866bd700 10 osd.190 pg_epoch: 710808 pg[70.82ds2( v 710799'704745 (586066'698574,710799'704745] local-lis/les=710807/710808 n=102929 ec=21494/21494 lis/c 710807/588565 les/c/f 710808/588566/0 710711/710807/710807) [820,761,105,789,562,485]/[2147483647,2147483647,190,448,61,315]p190(2) r=2 lpr=710807 pi=[588565,710807)/39 rops=1 bft=105(2),485(5),562(4),761(1),789(3),820(0) crt=710799'704745 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+backfilling] continue_recovery_op: before_progress=ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:true, error:false), after_progress=ObjectRecoveryProgress(!first, data_recovered_to:0, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), pop.data.length()=786432, size=0
>      0> 2018-10-09 16:27:25.430239 7faa866bd700 -1 /build/ceph-12.2.8/src/osd/ECBackend.cc: In function 'void ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)' thread 7faa866bd700 time 2018-10-09 16:27:25.425135
> /build/ceph-12.2.8/src/osd/ECBackend.cc: 619: FAILED assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))
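
(For reference, the extra logging came from bumping the OSD debug level,
done in the usual way, something like:

    ceph tell osd.190 injectargs '--debug-osd 10'

or equivalently "debug osd = 10" in ceph.conf.)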

It sounds like we're getting an unexpected data chunk size, presumably
due to some kind of corruption. Note the detail in the log above: the
recovery_info reports size: 0, and after_progress.data_recovered_to is
also 0, so the expected chunk length works out to 0, yet the read
returned pop.data.length()=786432, which is exactly the mismatch the
assert at ECBackend.cc:619 trips on.

Though if I search all acting OSDs (up or down) for the object on which
backfill crashed, I do find all six shards, and they do at least have a
consistent size: 786432 bytes each, which for a 4+2 pool would
correspond to a logical object of around 3 MiB. So perhaps it's the
recorded object size of 0 that is wrong, rather than the shards
themselves.
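
(A loop along these lines collects the same check in one pass; the
hostnames match the per-OSD listings below, and errors from hosts that
don't hold a shard of this pg are discarded:

    for h in ceph01 ceph03 ceph04 ceph05 ceph07 ceph09; do
        ssh "$h" "find /var/lib/ceph/osd/ceph-*/current/70.82ds*_head \
            -name '*NDARINV6BVVAY29*' -type f -ls" 2>/dev/null
    done

The per-OSD results follow.)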

osd 233:
> root@ceph01:/var/lib/ceph/osd/ceph-233/current# find 70.82ds1_head -name '*NDARINV6BVVAY29*' -type f -ls
> 23624833062  772 -rw-r--r--   1 root     root       786432 Sep 26 15:32 70.82ds1_head/DIR_D/DIR_2/DIR_8/DIR_A/DIR_3/default.10341087.1\\u\\ushadow\\uSubjects\\ssub-NDARINV6BVVAY29\\sses-baselineYear1Arm1\\sfunc\\srun-04\\uAROMA\\sdenoised\\ufunc\\udata\\unonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16\\u3__head_4853A82D__46_ffffffffffffffff_1
osd 190:
> root@ceph07:/var/lib/ceph/osd/ceph-190/current# find 70.82ds2_head -name '*NDARINV6BVVAY29*' -type f -ls
> 23710021684  772 -rw-r--r--   1 root     root       786432 Jun 27 07:13 70.82ds2_head/DIR_D/DIR_2/DIR_8/DIR_A/DIR_3/DIR_5/default.10341087.1\\u\\ushadow\\uSubjects\\ssub-NDARINV6BVVAY29\\sses-baselineYear1Arm1\\sfunc\\srun-04\\uAROMA\\sdenoised\\ufunc\\udata\\unonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16\\u3__head_4853A82D__46_ffffffffffffffff_2
osd 448:
> root@ceph09:/var/lib/ceph/osd/ceph-448/current# find 70.82ds3_head -name '*NDARINV6BVVAY29*' -type f -ls
> 3259920428  772 -rw-r--r--   1 root     root       786432 Jun 27 07:13 70.82ds3_head/DIR_D/DIR_2/DIR_8/DIR_A/DIR_3/DIR_5/default.10341087.1\\u\\ushadow\\uSubjects\\ssub-NDARINV6BVVAY29\\sses-baselineYear1Arm1\\sfunc\\srun-04\\uAROMA\\sdenoised\\ufunc\\udata\\unonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16\\u3__head_4853A82D__46_ffffffffffffffff_3
osd 315:
> root@ceph03:/var/lib/ceph/osd/ceph-315/current# find 70.82ds5_head -name '*NDARINV6BVVAY29*' -type f -ls
> 64439371813  772 -rw-r--r--   1 root     root       786432 Jun 27 07:13 70.82ds5_head/DIR_D/DIR_2/DIR_8/DIR_A/DIR_3/DIR_5/default.10341087.1\\u\\ushadow\\uSubjects\\ssub-NDARINV6BVVAY29\\sses-baselineYear1Arm1\\sfunc\\srun-04\\uAROMA\\sdenoised\\ufunc\\udata\\unonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16\\u3__head_4853A82D__46_ffffffffffffffff_5
osd 98:
> root@ceph05:/var/lib/ceph/osd/ceph-98/current# find 70.82ds0_head -name '*NDARINV6BVVAY29*' -type f -ls
> 2428799806  768 -rw-r--r--   1 root     root       786432 Jun 27 07:13 70.82ds0_head/DIR_D/DIR_2/DIR_8/DIR_A/DIR_3/DIR_5/default.10341087.1\\u\\ushadow\\uSubjects\\ssub-NDARINV6BVVAY29\\sses-baselineYear1Arm1\\sfunc\\srun-04\\uAROMA\\sdenoised\\ufunc\\udata\\unonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16\\u3__head_4853A82D__46_ffffffffffffffff_0
osd 61:
> root@ceph04:~# ls -l /var/lib/ceph/osd/ceph-61/current/70.82ds4_head/DIR_D/DIR_2/DIR_8/DIR_A/DIR_3/DIR_5/*NDARINV6BVVAY29*
> -rw-r--r-- 1 root root 786432 Jun 27 07:13 /var/lib/ceph/osd/ceph-61/current/70.82ds4_head/DIR_D/DIR_2/DIR_8/DIR_A/DIR_3/DIR_5/default.10341087.1\u\ushadow\uSubjects\ssub-NDARINV6BVVAY29\sses-baselineYear1Arm1\sfunc\srun-04\uAROMA\sdenoised\ufunc\udata\unonaggr.nii.2~VfI6rAYnU4XtzUcFAoecJDU5TpVR8AP.16\u3__head_4853A82D__46_ffffffffffffffff_4
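
For comparing the shards offline, ceph-objectstore-tool (the same tool I
used for the earlier re-import attempts) can dump an object's bytes from
a stopped OSD; a sketch, using osd.190 as the example and assuming the
usual FileStore paths:

    systemctl stop ceph-osd@190
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-190 \
        --journal-path /var/lib/ceph/osd/ceph-190/journal \
        --pgid 70.82ds2 --op list | grep NDARINV6BVVAY29
    # feed the JSON object spec from the list output back in:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-190 \
        --journal-path /var/lib/ceph/osd/ceph-190/journal \
        --pgid 70.82ds2 '<object-json-from-list>' get-bytes shard2.bin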


Highly motivated to keep looking for a solution... around 1PB of data in 
this pool!

Graham

