[ceph-users] mimic: 3/4 OSDs crashed on "bluefs enospc"

Igor Fedotov ifedotov at suse.de
Mon Oct 1 05:01:47 PDT 2018


Hi Sergey,

Could you please provide more details on your OSDs?

What are the sizes of the DB and block devices?

Have you modified any BlueStore config settings?

Can you share the stats you're referring to?
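
To make the sizes in the quoted log easier to read, here is a minimal Python sketch that decodes the hex values from the two bluefs lines. The helper name decode_hex_sizes is purely illustrative and not part of any Ceph tooling. As far as I know, bdev 1 is the DB device in BlueFS's numbering, so "free 0x0" there would match the 100% usage you saw, but please treat that reading as an assumption until we have the device sizes.

```python
import re

# Two lines copied verbatim from the quoted OSD log.
LOG = """\
2018-10-01 12:18:06.744 7f1d6a04d240  1 bluefs _allocate failed to allocate 0x100000 on bdev 1, free 0x0; fallback to bdev 2
2018-10-01 12:18:06.744 7f1d6a04d240 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0xa8700
"""

# Matches 0x-prefixed hex tokens only; the thread id has no 0x prefix.
HEX = re.compile(r"\b0x[0-9a-fA-F]+\b")


def decode_hex_sizes(line):
    """Return (hex_text, byte_count) for every 0x... token in a log line.

    Hypothetical helper for illustration only.
    """
    return [(tok, int(tok, 16)) for tok in HEX.findall(line)]


if __name__ == "__main__":
    for line in LOG.splitlines():
        for tok, value in decode_hex_sizes(line):
            print(f"{tok} = {value} bytes")
```

Read this way, the failed request was for 0x100000 bytes (1 MiB) while the device reported 0x0 bytes free, and the flush that triggered the assert was 0xa8700 bytes long.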


Thanks,

Igor


On 10/1/2018 12:29 PM, Sergey Malinin wrote:
> Hello,
> 3 of 4 NVMe OSDs crashed at the same time on assert(0 == "bluefs enospc") and no longer start.
> Stats collected just before the crash show ceph_bluefs_db_used_bytes at 100%. Although the OSDs have over 50% free space, it is not being reallocated for DB usage.
>
> 2018-10-01 12:18:06.744 7f1d6a04d240  1 bluefs _allocate failed to allocate 0x100000 on bdev 1, free 0x0; fallback to bdev 2
> 2018-10-01 12:18:06.744 7f1d6a04d240 -1 bluefs _allocate failed to allocate 0x100000 on bdev 2, dne
> 2018-10-01 12:18:06.744 7f1d6a04d240 -1 bluefs _flush_range allocated: 0x0 offset: 0x0 length: 0xa8700
> 2018-10-01 12:18:06.748 7f1d6a04d240 -1 /build/ceph-13.2.2/src/os/bluestore/BlueFS.cc: In function 'int BlueFS::_flush_range(BlueFS::FileWriter*, uint64_t, uint64_t)' thread 7f1d6a04d240 time 2018-10-01 12:18:06.746800
> /build/ceph-13.2.2/src/os/bluestore/BlueFS.cc: 1663: FAILED assert(0 == "bluefs enospc")
>
>   ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable)
>   1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f1d6146f5c2]
>   2: (()+0x26c787) [0x7f1d6146f787]
>   3: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0x1ab4) [0x5586b22684b4]
>   4: (BlueRocksWritableFile::Flush()+0x3d) [0x5586b227ec1d]
>   5: (rocksdb::WritableFileWriter::Flush()+0x1b9) [0x5586b2473399]
>   6: (rocksdb::WritableFileWriter::Sync(bool)+0x3b) [0x5586b247442b]
>   7: (rocksdb::BuildTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rocksdb::Env*, rocksdb::ImmutableCFOptions const&, rocksdb::MutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::TableCache*, rocksdb::InternalIterator*, std::unique_ptr<rocksdb::InternalIterator, std::default_delete<rocksdb::InternalIterator> >, rocksdb::FileMetaData*, rocksdb::InternalKeyComparator const&, std::vector<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> >, std::allocator<std::unique_ptr<rocksdb::IntTblPropCollectorFactory, std::default_delete<rocksdb::IntTblPropCollectorFactory> > > > const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<unsigned long, std::allocator<unsigned long> >, unsigned long, rocksdb::SnapshotChecker*, rocksdb::CompressionType, rocksdb::CompressionOptions const&, bool, rocksdb::InternalStats*, rocksdb::TableFileCreationReason, rocksdb::EventLogger*, int, rocksdb::Env::IOPriority, rocksdb::TableProperties*, int, unsigned long, unsigned long, rocksdb::Env::WriteLifeTimeHint)+0x1e24) [0x5586b249ef94]
>   8: (rocksdb::DBImpl::WriteLevel0TableForRecovery(int, rocksdb::ColumnFamilyData*, rocksdb::MemTable*, rocksdb::VersionEdit*)+0xcb7) [0x5586b2321457]
>   9: (rocksdb::DBImpl::RecoverLogFiles(std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long*, bool)+0x19de) [0x5586b232373e]
>   10: (rocksdb::DBImpl::Recover(std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, bool, bool, bool)+0x5d4) [0x5586b23242f4]
>   11: (rocksdb::DBImpl::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**, bool)+0x68b) [0x5586b232559b]
>   12: (rocksdb::DB::Open(rocksdb::DBOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<rocksdb::ColumnFamilyDescriptor, std::allocator<rocksdb::ColumnFamilyDescriptor> > const&, std::vector<rocksdb::ColumnFamilyHandle*, std::allocator<rocksdb::ColumnFamilyHandle*> >*, rocksdb::DB**)+0x22) [0x5586b2326e72]
>   13: (RocksDBStore::do_open(std::ostream&, bool, std::vector<KeyValueDB::ColumnFamily, std::allocator<KeyValueDB::ColumnFamily> > const*)+0x170c) [0x5586b220219c]
>   14: (BlueStore::_open_db(bool, bool)+0xd8e) [0x5586b218ee1e]
>   15: (BlueStore::_mount(bool, bool)+0x4b7) [0x5586b21bf807]
>   16: (OSD::init()+0x295) [0x5586b1d673c5]
>   17: (main()+0x268d) [0x5586b1c554ed]
>   18: (__libc_start_main()+0xe7) [0x7f1d5ea2db97]
>   19: (_start()+0x2a) [0x5586b1d1d7fa]
>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


