[ceph-users] Troubleshooting hanging storage backend whenever there is any cluster change

David Turner drakonstein at gmail.com
Thu Oct 11 13:27:12 PDT 2018


You should definitely stop using `size 3 min_size 1` on your pools.  Go
back to the default `min_size 2`.  I'm also a little confused as to why
you have 3 different CRUSH rules: they're all identical.  You only need
different CRUSH rules if you're using Erasure Coding or targeting
different sets of OSDs (e.g. SSD vs HDD) for different pools.
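For example, something along these lines should do it (pool and rule names taken from your `ceph osd crush rule dump` / `ceph osd pool ls detail` output below; double-check them against your cluster before running):

```shell
# Restore the default min_size so writes need at least 2 replicas;
# with min_size 1 a single surviving copy keeps accepting writes,
# which risks data loss.
ceph osd pool set cephstor1 min_size 2
ceph osd pool set cephfs_cephstor1_data min_size 2
ceph osd pool set cephfs_cephstor1_metadata min_size 2

# Optional: since the three CRUSH rules are identical, all pools could
# point at one rule (here rule 0, "data"); the spare rules can then be
# removed once nothing references them.
ceph osd pool set cephfs_cephstor1_data crush_rule data
ceph osd pool set cephfs_cephstor1_metadata crush_rule data
```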

All of that said, I don't see anything in those rules that would explain
why you're having problems accessing your data while a node is being
restarted.  The `ceph status` and `ceph health detail` outputs captured
while it's happening will be helpful.
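One simple way to capture that is to run a timestamped loop while you reproduce the hang (the interval and log path here are just examples; adjust to taste):

```shell
# Log cluster state every 2 seconds while reproducing the issue
# (e.g. while rebooting a node); stop with Ctrl-C.
while true; do
    date
    ceph status
    ceph health detail
    sleep 2
done | tee /tmp/ceph-status-during-reboot.log
```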

On Thu, Oct 11, 2018 at 3:02 PM Nils Fahldieck - Profihost AG <
n.fahldieck at profihost.ag> wrote:

> Thanks for your reply. I'll capture a `ceph status` the next time I
> encounter a not working RBD. Here's the other output you asked for:
>
> $ ceph osd crush rule dump
> [
>     {
>         "rule_id": 0,
>         "rule_name": "data",
>         "ruleset": 0,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -10000,
>                 "item_name": "root"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     },
>     {
>         "rule_id": 1,
>         "rule_name": "metadata",
>         "ruleset": 1,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -10000,
>                 "item_name": "root"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     },
>     {
>         "rule_id": 2,
>         "rule_name": "rbd",
>         "ruleset": 2,
>         "type": 1,
>         "min_size": 1,
>         "max_size": 10,
>         "steps": [
>             {
>                 "op": "take",
>                 "item": -10000,
>                 "item_name": "root"
>             },
>             {
>                 "op": "chooseleaf_firstn",
>                 "num": 0,
>                 "type": "host"
>             },
>             {
>                 "op": "emit"
>             }
>         ]
>     }
> ]
>
> $ ceph osd pool ls detail
> pool 5 'cephstor1' replicated size 3 min_size 1 crush_rule 0 object_hash
> rjenkins pg_num 4096 pgp_num 4096 last_change 1217074 flags hashpspool
> min_read_recency_for_promote 1 min_write_recency_for_promote 1
> stripe_width 0 application rbd
>         removed_snaps
>
> [1~9,b~1,d~7d1e8,7d1f6~3d05f,ba256~4bd9,bee30~357,bf188~5531,c46ba~85b3,ccc6e~b599,d820b~1,d820d~1,d820f~1,d8211~1,d8214~1,d8216~1,d8219~2,d821d~1,d821f~1,d8221~1,d8223~1,d8226~2,d8229~1,d822b~2,d822e~2,d8231~3,d8236~1,d8238~2,d823b~1,d823d~3,d8241~1,d8243~1,d8245~1,d8247~3,d824d~1,d824f~1,d8251~1,d8253~1,d8255~2,d8258~1,d825c~1,d825e~2,d8262~1,d8264~1,d8266~1,d8268~2,d826e~2,d8272~1,d8274~1,d8276~8,d8280~1,d8282~1,d8284~1,d8286~1,d8288~1,d828a~1,d828c~1,d828e~1,d8290~1,d8292~1,d8294~3,d8298~1,d829a~2,d829d~1,d82a0~4,d82a6~1,d82a8~2,d82ac~1,d82ae~1,d82b0~1,d82b2~1,d82b5~1,d82b7~1,d82b9~1,d82bb~1,d82bd~1,d82bf~1,d82c1~1,d82c3~2,d82c6~2,d82c9~1,d82cb~1,d82ce~1,d82d0~2,d82d3~1,d82d6~4,d82db~1,d82de~1,d82e0~1,d82e2~1,d82e4~1,d82e6~1,d82e8~1,d82ea~1,d82ed~1,d82ef~1,d82f1~1,d82f3~2,d82f7~2,d82fb~2,d82ff~1,d8301~1,d8303~1,d8305~1,d8307~1,d8309~1,d830b~1,d830e~1,d8311~2,d8314~3,d8318~1,d831a~1,d831c~1,d831f~3,d8323~2,d8329~1,d832b~2,d832f~1,d8331~1,d8333~1,d8335~1,d8338~6,d833f~1,d8341~1,d8343~1,d8345~2,d8349~2,d834c~1,d834e~1,d8350~1,d8352~1,d8354~1,d8356~4,d835b~1,d835d~2,d8360~1,d8362~3,d8366~3,d836b~3,d8370~1,d8372~1,d8374~1,d8376~3,d837a~1,d837c~1,d837e~2,d8381~1,d8383~1,d8385~1,d8387~3,d838b~2,d838e~4,d8393~1,d8396~1,d8398~2,d839b~1,d839d~2,d83a0~2,d83a3~1,d83a5~2,d83a9~2,d83ad~1,d83b0~2,d83b4~2,d83b8~1,d83ba~a,d83c5~1,d83c7~1,d83ca~1,d83cc~1,d83ce~1,d83d0~1,d83d2~6,d83d9~3,d83df~1,d83e1~2,d83e5~1,d83e8~1,d83eb~4,d83f0~1,d83f2~1,d83f4~3,d83f8~3,d83fd~2,d8402~1,d8405~1,d8407~1,d840a~2,d840f~1,d8411~1,d8413~3,d8417~3,d841c~4,d8422~4,d8428~2,d842b~1,d842e~1,d8430~1,d8432~5,d843a~1,d843c~3,d8440~5,d8447~1,d844a~1,d844d~1,d844f~1,d8452~1,d8455~1,d8457~1,d8459~2,d845d~2,d8460~1,d8462~3,d8467~1,d8469~1,d846b~2,d846e~2,d8471~4,d8476~6,d847d~3,d8482~1,d8484~1,d8486~2,d8489~2,d848c~1,d848e~1,d8491~4,d8499~1,d849c~3,d84a0~1,d84a2~1,d84a4~3,d84aa~2,d84ad~2,d84b1~4,d84b6~1,d84b8~1,d84ba~1,d84bc~1,d84be~1,d84c0~5,d84c7~4,d84ce~1,d84d0~1,d84d2~2,d84d6~2,d84db~1,d84dd~2,d84e2~2,d84e6~1,d84e9~1,d84eb~4,d84f0~4]
> pool 6 'cephfs_cephstor1_data' replicated size 3 min_size 1 crush_rule 0
> object_hash rjenkins pg_num 128 pgp_num 128 last_change 1214952 flags
> hashpspool stripe_width 0 application cephfs
> pool 7 'cephfs_cephstor1_metadata' replicated size 3 min_size 1
> crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change
> 1214952 flags hashpspool stripe_width 0 application cephfs
>
> On 11.10.2018 at 20:47, David Turner wrote:
> > My first guess is to ask what your crush rules are.  `ceph osd crush
> > rule dump` along with `ceph osd pool ls detail` would be helpful.  Also,
> > a `ceph status` output from a time when the VM RBDs aren't working
> > might explain something.
> >
> > On Thu, Oct 11, 2018 at 1:12 PM Nils Fahldieck - Profihost AG
> > <n.fahldieck at profihost.ag> wrote:
> >
> >     Hi everyone,
> >
> >     for some time we have experienced service outages in our Ceph cluster
> >     whenever there is any change to the HEALTH status, e.g. swapping
> >     storage devices, adding storage devices, rebooting Ceph hosts, or
> >     during backfills.
> >
> >     Just now I had a situation where several VMs hung after I rebooted
> >     one Ceph host. We keep 3 replicas of each PG and have 3 mons, 3
> >     mgrs, 3 mds and 71 osds spread over 9 hosts.
> >
> >     We use Ceph as a storage backend for our Proxmox VE (PVE) environment.
> >     The outages manifest as blocked virtual file systems inside the
> >     virtual machines running in our PVE cluster.
> >
> >     It feels similar to stuck and inactive PGs to me. Honestly, though,
> >     I'm not really sure how to debug this problem or which log files to
> >     examine.
> >
> >     OS: Debian 9
> >     Kernel: 4.12 based upon SLE15-SP1
> >
> >     # ceph version
> >     ceph version 12.2.8-133-gded2f6836f
> >     (ded2f6836f6331a58f5c817fca7bfcd6c58795aa) luminous (stable)
> >
> >     Can someone guide me? I'm more than happy to provide more information
> >     as needed.
> >
> >     Thanks in advance
> >     Nils
> >     _______________________________________________
> >     ceph-users mailing list
> >     ceph-users at lists.ceph.com
> >     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>