[ceph-users] Error in MDS (laggy or crashed)

Yan, Zheng ukernel at gmail.com
Sun Oct 7 20:40:03 PDT 2018


On Mon, Oct 8, 2018 at 10:32 AM Alfredo Daniel Rezinovsky
<alfrenovsky at gmail.com> wrote:
>
> I tried to downgrade and the mds still fails.
>
> I removed the mds and created them again.
>
> I already messed with the purge_queue. Can I reset the queue (with
> 13.2.1)?

Have you run 'ceph mds repaired ...'?
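
If not, a rough sequence to try once every MDS is back on 13.2.1 might look
like the following. This is only a sketch: the filesystem name "cephfs" is a
placeholder, and rank 0 is assumed to be the damaged rank, as in my earlier
message.

```shell
# Assumption: the filesystem is named "cephfs"; substitute your own name.
FS_NAME=cephfs

# Confirm that every MDS daemon really reports 13.2.1 after the downgrade.
ceph versions

# Clear the "damaged" flag on rank 0 so that an MDS can take the rank again.
ceph mds repaired "${FS_NAME}:0"

# Check whether the rank goes through replay and becomes active.
ceph fs status "${FS_NAME}"
```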

>
> Thanks for your help
>
> On 07/10/18 23:18, Yan, Zheng wrote:
> > Sorry, there is a bug in 13.2.2 that breaks compatibility of the purge
> > queue disk format. Please downgrade the mds to 13.2.1, then run 'ceph mds
> > repaired cephfs_name:0'.
> >
> > Regards
> > Yan, Zheng
> > On Mon, Oct 8, 2018 at 9:20 AM Alfredo Daniel Rezinovsky
> > <alfrenovsky at gmail.com> wrote:
> >> Cluster with 4 nodes
> >>
> >> node 1: 2 HDDs
> >> node 2: 3 HDDs
> >> node 3: 3 HDDs
> >> node 4: 2 HDDs
> >>
> > >> After a problem with the upgrade from 13.2.1 to 13.2.2 (I restarted
> > >> the nodes one at a time; I think that was the problem)
> > >>
> > >> I upgraded with Ubuntu apt-get upgrade. I had 1 active mds at a time
> > >> when I did the upgrade.
> >>
> >> All MDSs stopped working
> >>
> > >> Status shows 1 crashed and none in standby.
> >>
> > >> If I restart an MDS, status shows replay and then it crashes with this log output:
> >>
> >>    ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic
> >> (stable)
> >> 1: (()+0x3f5480) [0x555de8a51480]
> >> 2: (()+0x12890) [0x7f6e4cb41890]
> >> 3: (gsignal()+0xc7) [0x7f6e4bc39e97]
> >> 4: (abort()+0x141) [0x7f6e4bc3b801]
> > >> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x250) [0x7f6e4d22a710]
> >> 6: (()+0x26c787) [0x7f6e4d22a787]
> > >> 7: (EMetaBlob::replay(MDSRank*, LogSegment*, MDSlaveUpdate*)+0x5f4b) [0x555de8a3c83b]
> >> 8: (EUpdate::replay(MDSRank*)+0x39) [0x555de8a3dd79]
> >> 9: (MDLog::_replay_thread()+0x864) [0x555de89e6e04]
> >> 10: (MDLog::ReplayThread::entry()+0xd) [0x555de8784ebd]
> >> 11: (()+0x76db) [0x7f6e4cb366db]
> >> 12: (clone()+0x3f) [0x7f6e4bd1c88f]
> > >> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this
> >>
> > >> The journal reports OK.
> > >>
> > >> Now I'm trying:
> >>
> >>    cephfs-data-scan scan_extents cephfs_data
> >>
> >>
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users at lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


More information about the ceph-users mailing list