[ceph-users] 1 osd Segmentation fault in test cluster

Gregory Farnum gfarnum at redhat.com
Mon Oct 2 08:15:56 PDT 2017


Please file a tracker ticket with all the info you have for stuff like
this. They’re a lot harder to lose than emails are. ;)
On Sat, Sep 30, 2017 at 8:31 AM Marc Roos <M.Roos at f1-outsourcing.eu> wrote:

> Is this useful for someone?
>
>
>
> [Sat Sep 30 15:51:11 2017] libceph: osd5 192.168.10.113:6809 socket
> closed (con state OPEN)
> [Sat Sep 30 15:51:11 2017] libceph: osd5 192.168.10.113:6809 socket
> closed (con state CONNECTING)
> [Sat Sep 30 15:51:11 2017] libceph: osd5 down
> [Sat Sep 30 15:51:11 2017] libceph: osd5 down
> [Sat Sep 30 15:52:52 2017] libceph: osd5 up
> [Sat Sep 30 15:52:52 2017] libceph: osd5 up
>
>
>
> 2017-09-30 15:48:08.542202 7f7623ce9700  0 log_channel(cluster) log
> [WRN] : slow request 31.456482 seconds old, received at 2017-09-30
> 15:47:37.085589: osd_op(mds.0.9227:1289186 20.2b 20.9af42b6b (undecoded)
> ondisk+write+known_if_redirected+full_force e15675) currently
> queued_for_pg
> 2017-09-30 15:48:08.542207 7f7623ce9700  0 log_channel(cluster) log
> [WRN] : slow request 31.456086 seconds old, received at 2017-09-30
> 15:47:37.085984: osd_op(mds.0.9227:1289190 20.13 20.e44f3f53 (undecoded)
> ondisk+write+known_if_redirected+full_force e15675) currently
> queued_for_pg
> 2017-09-30 15:48:08.542212 7f7623ce9700  0 log_channel(cluster) log
> [WRN] : slow request 31.456005 seconds old, received at 2017-09-30
> 15:47:37.086065: osd_op(mds.0.9227:1289194 20.2b 20.6733bdeb (undecoded)
> ondisk+write+known_if_redirected+full_force e15675) currently
> queued_for_pg
> 2017-09-30 15:51:12.592490 7f7611cc5700  0 log_channel(cluster) log
> [DBG] : 20.3f scrub starts
> 2017-09-30 15:51:24.514602 7f76214e4700 -1 *** Caught signal
> (Segmentation fault) **
>  in thread 7f76214e4700 thread_name:bstore_mempool
>
>  ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous
> (stable)
>  1: (()+0xa29511) [0x7f762e5b2511]
>  2: (()+0xf370) [0x7f762afa5370]
>  3: (BlueStore::TwoQCache::_trim(unsigned long, unsigned long)+0x2df)
> [0x7f762e481a2f]
>  4: (BlueStore::Cache::trim(unsigned long, float, float, float)+0x1d1)
> [0x7f762e4543e1]
>  5: (BlueStore::MempoolThread::entry()+0x14d) [0x7f762e45a71d]
>  6: (()+0x7dc5) [0x7f762af9ddc5]
>  7: (clone()+0x6d) [0x7f762a09176d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- begin dump of recent events ---
> -10000> 2017-09-30 15:51:05.105915 7f76284ac700  5 --
> 192.168.10.113:0/27661 >> 192.168.10.111:6810/6617 conn(0x7f766b736000
> :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=19 cs=1 l=1). rx
> osd.0 seq 19546 0x7f76a2daf000 osd_ping(ping_reply e15675 stamp
> 2017-09-30 15:51:05.105439) v4
>  -9999> 2017-09-30 15:51:05.105963 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.11:6805/6491 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f7683e98a00 con 0
>  -9998> 2017-09-30 15:51:05.105960 7f76284ac700  1 --
> 192.168.10.113:0/27661 <== osd.0 192.168.10.111:6810/6617 19546 ====
> osd_ping(ping_reply e15675 stamp 2017-09-30 15:51:05.105439) v4 ====
> 2004+0+0 (1212154800 0 0) 0x7f76a2daf000 con 0x7f766b736000
>  -9997> 2017-09-30 15:51:05.105961 7f76274aa700  5 -- 10.0.0.13:0/27661
> >> 10.0.0.11:6808/6646 conn(0x7f766b745800 :-1
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=24 cs=1 l=1). rx osd.3
> seq 19546 0x7f769b95f200 osd_ping(ping_reply e15675 stamp 2017-09-30
> 15:51:05.105439) v4
>  -9996> 2017-09-30 15:51:05.105983 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.111:6805/6491 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f7683e97600 con 0
>  -9995> 2017-09-30 15:51:05.106001 7f76274aa700  1 -- 10.0.0.13:0/27661
> <== osd.3 10.0.0.11:6808/6646 19546 ==== osd_ping(ping_reply e15675
> stamp 2017-09-30 15:51:05.105439) v4 ==== 2004+0+0 (1212154800 0 0)
> 0x7f769b95f200 con 0x7f766b745800
>  -9994> 2017-09-30 15:51:05.106015 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.11:6807/6470 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f7683e99800 con 0
>  -9993> 2017-09-30 15:51:05.106035 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.111:6808/6470 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f763b72a200 con 0
>  -9992> 2017-09-30 15:51:05.106072 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.11:6809/6710 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f768633dc00 con 0
>  -9991> 2017-09-30 15:51:05.106093 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.111:6804/6710 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f76667d3600 con 0
>  -9990> 2017-09-30 15:51:05.106114 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.12:6805/1949 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f768fcd6200 con 0
>  -9989> 2017-09-30 15:51:05.106134 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.112:6805/1949 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f765f27a800 con 0
>
>
> ...
>
>
>   -29> 2017-09-30 15:51:24.469620 7f7611cc5700  1 --
> 10.0.0.13:6808/27661 --> 10.0.0.12:6800/1947 -- replica scrub(pg:
> 20.3f,from:0'0,to:0'0,epoch:15675/15644,start:20:fc9e27a6:::100008d7d4e.
> 000000cd:0,end:20:fc9e74a2:::100006a91fc.00000000:0,chunky:1,deep:0,seed
> :4294967295,version:7) v7 -- 0x7f76514c1600 con 0
>    -28> 2017-09-30 15:51:24.474230 7f76284ac700  5 --
> 10.0.0.13:6808/27661 >> 10.0.0.12:6800/1947 conn(0x7f76831ac000 :6808
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=4 cs=1 l=0). rx osd.10
> seq 28454 0x7f77072a1480 rep_scrubmap(20.3f e15675 from shard 10) v1
>    -27> 2017-09-30 15:51:24.474277 7f76284ac700  1 --
> 10.0.0.13:6808/27661 <== osd.10 10.0.0.12:6800/1947 28454 ====
> rep_scrubmap(20.3f e15675 from shard 10) v1 ==== 39+0+18285 (2345442504
> 0 4245554900) 0x7f77072a1480 con 0x7f76831ac000
>    -26> 2017-09-30 15:51:24.476534 7f7627cab700  5 --
> 10.0.0.13:6808/27661 >> 10.0.0.11:6800/6617 conn(0x7f766bbbe000 :-1
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=23 cs=1 l=0). rx osd.0
> seq 153542 0x7f76b1689180 rep_scrubmap(20.3f e15675 from shard 0) v1
>    -25> 2017-09-30 15:51:24.476573 7f7627cab700  1 --
> 10.0.0.13:6808/27661 <== osd.0 10.0.0.11:6800/6617 153542 ====
> rep_scrubmap(20.3f e15675 from shard 0) v1 ==== 39+0+18285 (1049184257 0
> 4245554900) 0x7f76b1689180 con 0x7f766bbbe000
>    -24> 2017-09-30 15:51:24.477065 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps   osd.5 has 25 items
>    -23> 2017-09-30 15:51:24.477097 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps replica 0 has 25 items
>    -22> 2017-09-30 15:51:24.477113 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps replica 10 has 25 items
>    -21> 2017-09-30 15:51:24.478038 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing]
>    -20> 2017-09-30 15:51:24.483045 7f7611cc5700  1 --
> 10.0.0.13:6808/27661 --> 10.0.0.11:6800/6617 -- replica scrub(pg:
> 20.3f,from:0'0,to:15675'635985,epoch:15675/15644,start:20:fc9e74a2:::100
> 006a91fc.00000000:0,end:20:fc9ea723:::100000f77c1.00000000:0,chunky:1,de
> ep:0,seed:4294967295,version:7) v7 -- 0x7f76514c18c0 con 0
>    -19> 2017-09-30 15:51:24.483103 7f7611cc5700  1 --
> 10.0.0.13:6808/27661 --> 10.0.0.12:6800/1947 -- replica scrub(pg:
> 20.3f,from:0'0,to:15675'635985,epoch:15675/15644,start:20:fc9e74a2:::100
> 006a91fc.00000000:0,end:20:fc9ea723:::100000f77c1.00000000:0,chunky:1,de
> ep:0,seed:4294967295,version:7) v7 -- 0x7f76514c1b80 con 0
>    -18> 2017-09-30 15:51:24.487785 7f7627cab700  5 --
> 10.0.0.13:6808/27661 >> 10.0.0.11:6800/6617 conn(0x7f766bbbe000 :-1
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=23 cs=1 l=0). rx osd.0
> seq 153543 0x7f76b1689900 rep_scrubmap(20.3f e15675 from shard 0) v1
>    -17> 2017-09-30 15:51:24.487816 7f7627cab700  1 --
> 10.0.0.13:6808/27661 <== osd.0 10.0.0.11:6800/6617 153543 ====
> rep_scrubmap(20.3f e15675 from shard 0) v1 ==== 39+0+17701 (1049184257 0
> 1206823045) 0x7f76b1689900 con 0x7f766bbbe000
>    -16> 2017-09-30 15:51:24.487832 7f76284ac700  5 --
> 10.0.0.13:6808/27661 >> 10.0.0.12:6800/1947 conn(0x7f76831ac000 :6808
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=4 cs=1 l=0). rx osd.10
> seq 28455 0x7f77072a1200 rep_scrubmap(20.3f e15675 from shard 10) v1
>    -15> 2017-09-30 15:51:24.487869 7f76284ac700  1 --
> 10.0.0.13:6808/27661 <== osd.10 10.0.0.12:6800/1947 28455 ====
> rep_scrubmap(20.3f e15675 from shard 10) v1 ==== 39+0+17701 (2345442504
> 0 1206823045) 0x7f77072a1200 con 0x7f76831ac000
>    -14> 2017-09-30 15:51:24.488423 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps   osd.5 has 25 items
>    -13> 2017-09-30 15:51:24.488451 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps replica 0 has 25 items
>    -12> 2017-09-30 15:51:24.488466 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps replica 10 has 25 items
>    -11> 2017-09-30 15:51:24.489370 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing]
>    -10> 2017-09-30 15:51:24.493346 7f7611cc5700  1 --
> 10.0.0.13:6808/27661 --> 10.0.0.11:6800/6617 -- replica scrub(pg:
> 20.3f,from:0'0,to:15675'635865,epoch:15675/15644,start:20:fc9ea723:::100
> 000f77c1.00000000:0,end:20:fc9edf2c:::100008c400a.000005af:0,chunky:1,de
> ep:0,seed:4294967295,version:7) v7 -- 0x7f76514c1e40 con 0
>     -9> 2017-09-30 15:51:24.493400 7f7611cc5700  1 --
> 10.0.0.13:6808/27661 --> 10.0.0.12:6800/1947 -- replica scrub(pg:
> 20.3f,from:0'0,to:15675'635865,epoch:15675/15644,start:20:fc9ea723:::100
> 000f77c1.00000000:0,end:20:fc9edf2c:::100008c400a.000005af:0,chunky:1,de
> ep:0,seed:4294967295,version:7) v7 -- 0x7f76514c2100 con 0
>     -8> 2017-09-30 15:51:24.497410 7f76284ac700  5 --
> 10.0.0.13:6808/27661 >> 10.0.0.12:6800/1947 conn(0x7f76831ac000 :6808
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=4 cs=1 l=0). rx osd.10
> seq 28456 0x7f77072a0f80 rep_scrubmap(20.3f e15675 from shard 10) v1
>     -7> 2017-09-30 15:51:24.497456 7f76284ac700  1 --
> 10.0.0.13:6808/27661 <== osd.10 10.0.0.12:6800/1947 28456 ====
> rep_scrubmap(20.3f e15675 from shard 10) v1 ==== 39+0+20357 (2345442504
> 0 574503752) 0x7f77072a0f80 con 0x7f76831ac000
>     -6> 2017-09-30 15:51:24.499268 7f7627cab700  5 --
> 10.0.0.13:6808/27661 >> 10.0.0.11:6800/6617 conn(0x7f766bbbe000 :-1
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=23 cs=1 l=0). rx osd.0
> seq 153544 0x7f767d915700 rep_scrubmap(20.3f e15675 from shard 0) v1
>     -5> 2017-09-30 15:51:24.499306 7f7627cab700  1 --
> 10.0.0.13:6808/27661 <== osd.0 10.0.0.11:6800/6617 153544 ====
> rep_scrubmap(20.3f e15675 from shard 0) v1 ==== 39+0+20357 (1049184257 0
> 574503752) 0x7f767d915700 con 0x7f766bbbe000
>     -4> 2017-09-30 15:51:24.499581 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps   osd.5 has 25 items
>     -3> 2017-09-30 15:51:24.499613 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps replica 0 has 25 items
>     -2> 2017-09-30 15:51:24.499629 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing] scrub_compare_maps replica 10 has 25 items
>     -1> 2017-09-30 15:51:24.500587 7f7611cc5700  2 osd.5 pg_epoch: 15675
> pg[20.3f( v 15675'636579 (15675'635005,15675'636579]
> local-lis/les=15644/15645 n=120537 ec=3902/3902 lis/c 15644/15644
> les/c/f 15645/15669/0 15644/15644/14120) [5,0,10] r=0 lpr=15644
> crt=15675'636579 lcod 15675'636578 mlcod 15675'636578
> active+clean+scrubbing]
>      0> 2017-09-30 15:51:24.514602 7f76214e4700 -1 *** Caught signal
> (Segmentation fault) **
>  in thread 7f76214e4700 thread_name:bstore_mempool
>
>  ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous
> (stable)
>  1: (()+0xa29511) [0x7f762e5b2511]
>  2: (()+0xf370) [0x7f762afa5370]
>  3: (BlueStore::TwoQCache::_trim(unsigned long, unsigned long)+0x2df)
> [0x7f762e481a2f]
>  4: (BlueStore::Cache::trim(unsigned long, float, float, float)+0x1d1)
> [0x7f762e4543e1]
>  5: (BlueStore::MempoolThread::entry()+0x14d) [0x7f762e45a71d]
>  6: (()+0x7dc5) [0x7f762af9ddc5]
>  7: (clone()+0x6d) [0x7f762a09176d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
>
> --- logging levels ---
>    0/ 5 rados
>    0/ 5 rbd
>    0/ 5 rbd_mirror
>    0/ 5 rbd_replay
>    0/ 5 journaler
>    0/ 5 objectcacher
>
>    4/ 5 leveldb
>    4/ 5 memdb
>    1/ 5 kinetic
>    1/ 5 fuse
>    1/ 5 mgr
>    1/ 5 mgrc
>    1/ 5 dpdk
>    1/ 5 eventtrace
>   -2/-2 (syslog threshold)
>   -1/-1 (stderr threshold)
>   max_recent     10000
>   max_new         1000
>   log_file /var/log/ceph/ceph-osd.5.log
> --- end dump of recent events ---
> 2017-09-30 15:51:44.957962 7f045fa43d00 -1 WARNING: the following
> dangerous and experimental features are enabled: bluestore
> 2017-09-30 15:51:44.957982 7f045fa43d00  0 set uid:gid to 167:167
> (ceph:ceph)
> 2017-09-30 15:51:44.958002 7f045fa43d00  0 ceph version 12.2.1
> (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process
> (unknown), pid 11398
> 2017-09-30 15:51:44.962184 7f045fa43d00  0 pidfile_write: ignore empty
> --pid-file
> 2017-09-30 15:51:44.997185 7f045fa43d00 -1 WARNING: the following
> dangerous and experimental features are enabled: bluestore
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20171002/7551254c/attachment.html>


More information about the ceph-users mailing list