[ceph-users] warning: fast-diff map is invalid operation may be slow; object map invalid

Jason Dillaman jdillama at redhat.com
Tue Oct 16 08:32:59 PDT 2018


On Mon, Oct 15, 2018 at 4:04 PM Anthony D'Atri <aad at dreamsnake.net> wrote:
>
>
> We turned on all the RBD v2 features while running Jewel; since then all clusters have been updated to Luminous 12.2.2 and additional clusters added that have never run Jewel.
>
> Today I find that a few percent of volumes in each cluster have issues, examples below.
>
> I'm concerned that these issues may present problems when using rbd-mirror to move volumes between clusters.  Many instances involve heads or nodes of snapshot trees; it's possible but unverified that those not currently snap-related may have been in the past.
>
> In the Jewel days we retroactively applied fast-diff, object-map to existing volumes but did not bother with tombstones.
>
> Any thoughts on
>
> 1) How this happens?

If you enabled object-map and/or fast-diff on pre-existing images, the
object-map is automatically flagged as invalid, because enabling the
feature does not rebuild the object-map. The flag simply instructs
librbd clients not to trust the object-map, so all object-map
optimizations are disabled until it is rebuilt.
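
To see which images carry the invalid flag, one option is to read the
"flags:" line of "rbd info". The following is a minimal sketch, assuming
Python 3 and the plain-text "rbd info" output quoted further down in this
thread; the function names (invalid_flags, needs_rebuild) and the SAMPLE
constant are hypothetical helpers, not part of any Ceph tooling.

```python
# Sketch: detect the "object map invalid" flag in plain-text `rbd info` output.
# Assumption: the output contains a line of the form
#   flags: object map invalid, fast diff invalid.
# as in the example quoted in this thread.

def invalid_flags(rbd_info_text):
    """Return the list of flags found on the 'flags:' line, or [] if none."""
    for line in rbd_info_text.splitlines():
        line = line.strip()
        if line.startswith("flags:"):
            # Drop the label and any trailing period before splitting.
            raw = line[len("flags:"):].strip().rstrip(".")
            return [flag.strip() for flag in raw.split(",") if flag.strip()]
    return []


def needs_rebuild(rbd_info_text):
    """True if the object map is flagged invalid for this image or snapshot."""
    return "object map invalid" in invalid_flags(rbd_info_text)


# Output copied from the `rbd info` example in this thread.
SAMPLE = """rbd image '950f705d-d575-11e7-acf6-0242ac114406':
size 51200 MB in 12800 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.3067d01131bc6e
format: 2
features: layering, striping, exclusive-lock, object-map, fast-diff, deep-flatten
flags: object map invalid, fast diff invalid.
protected: True
"""

print(invalid_flags(SAMPLE))
print(needs_rebuild(SAMPLE))
```

An image whose "flags:" line is empty or absent is reported as not needing
a rebuild.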

> 2) Is "rbd object-map rebuild" always safe, especially on volumes that are in active use?

Yes, the live-rebuild of the HEAD image is just proxied over to the
current exclusive-lock owner. Rebuilds of any snapshot object-maps are
performed by the rbd CLI.
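
Since the HEAD image and each snapshot have their own object-map, each
needs its own "rbd object-map rebuild" invocation. A minimal sketch of
generating those invocations, assuming Python 3.8+; the helper names
(rebuild_commands, run_all), the pool and image names, and the ordering
(HEAD first, then snapshots) are illustrative assumptions rather than a
recommended procedure.

```python
import shlex
import subprocess


def rebuild_commands(pool, image, snapshots=()):
    """Build one `rbd object-map rebuild` argv for HEAD and one per snapshot."""
    head = f"{pool}/{image}"
    cmds = [["rbd", "object-map", "rebuild", head]]
    for snap in snapshots:
        # Snapshots are addressed as pool/image@snap.
        cmds.append(["rbd", "object-map", "rebuild", f"{head}@{snap}"])
    return cmds


def run_all(cmds, dry_run=True):
    """Print the commands (dry run) or execute them one at a time."""
    for argv in cmds:
        if dry_run:
            print(shlex.join(argv))
        else:
            # check=True stops at the first failing rebuild.
            subprocess.run(argv, check=True)


# Hypothetical image and snapshot names, for illustration only.
run_all(rebuild_commands("rbd", "example-image", ["snap-a", "snap-b"]))
```

With dry_run left at its default, nothing is executed; the commands are
only printed so they can be reviewed first.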

> 3) The disturbing messages spewed by `rbd ls` -- related or not?

Some of the errors spewed by "rbd ls" are not specifically related to
the object-map feature. For example, it appears that you have at least
two cloned images where the parent image snapshot is no longer
available (librbd::image::RefreshParentRequest: failed to locate
snapshot). It also appears that at least two of the images in your RBD
directory don't exist (librbd::image::OpenRequest: failed to retreive
immutable metadata).

However, for the "librbd::object_map::RefreshRequest: failed to load
object map" logs, those are harmless if you enabled the object-map
after the snapshot was created and haven't rebuilt the object map yet.
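
The three kinds of messages described above can be separated mechanically.
A minimal sketch, assuming Python 3 and the stderr of "rbd ls -l" captured
as text; the category labels and function names are hypothetical, and the
match strings are copied from the log lines quoted in this thread
(including the "retreive" spelling that appears in them).

```python
from collections import Counter

# (substring to look for, label) pairs; labels are illustrative only.
CATEGORIES = [
    ("librbd::image::RefreshParentRequest: failed to locate snapshot",
     "missing-parent-snapshot"),
    ("librbd::image::OpenRequest: failed to retreive immutable metadata",
     "missing-image"),
    ("librbd::object_map::RefreshRequest: failed to load object map",
     "object-map-not-built"),
]


def classify(line):
    """Return the label of the first matching category, or 'other'."""
    for needle, label in CATEGORIES:
        if needle in line:
            return label
    return "other"


def summarize(lines):
    """Count non-empty log lines per category."""
    return Counter(classify(line) for line in lines if line.strip())
```

Follow-on lines such as the AioCompletion and InvalidateRequest messages
fall into "other" here; grouping them with the error that caused them
would need more context than a per-line match provides.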

> 4) Would this, as I fear, confound successful rbd-mirror migration?

Nope -- rbd-mirror uses the journal for synchronization.

> I've found http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-August/012137.html that *seems* to indicate that a live rebuild is safe, but I'm still uncertain about the root cause, and if it's still happening.  I've never ventured into this dark corner before so I'm being careful.
>
> All clients are QEMU/libvirt; most are 12.2.2 but there are some lingering Jewel, most likely 10.2.6 or perhaps 10.2.3.  Eg:
>
>
> # ceph features
> {
>     "mon": {
>         "group": {
>             "features": "0x1ffddff8eea4fffb",
>             "release": "luminous",
>             "num": 5
>         }
>     },
>     "osd": {
>         "group": {
>             "features": "0x1ffddff8eea4fffb",
>             "release": "luminous",
>             "num": 983
>         }
>     },
>     "client": {
>         "group": {
>             "features": "0x7fddff8ee84bffb",
>             "release": "jewel",
>             "num": 15
>         },
>         "group": {
>             "features": "0x1ffddff8eea4fffb",
>             "release": "luminous",
>             "num": 3352
>         }
>     }
> }
>
>
> # rbd ls  -l |wc
> 2018-10-05 20:55:17.397288 7f976cff9700 -1 librbd::image::RefreshParentRequest: failed to locate snapshot: Snapshot with this id not found
> 2018-10-05 20:55:17.397334 7f976cff9700 -1 librbd::image::RefreshRequest: failed to refresh parent image: (2) No such file or directory
> 2018-10-05 20:55:17.397397 7f976cff9700 -1 librbd::image::OpenRequest: failed to refresh image: (2) No such file or directory
> 2018-10-05 20:55:17.398025 7f976cff9700 -1 librbd::io::AioCompletion: 0x7f978667b570 fail: (2) No such file or directory
> 2018-10-05 20:55:17.398075 7f976cff9700 -1 librbd::image::RefreshParentRequest: failed to locate snapshot: Snapshot with this id not found
> 2018-10-05 20:55:17.398079 7f976cff9700 -1 librbd::image::RefreshRequest: failed to refresh parent image: (2) No such file or directory
> 2018-10-05 20:55:17.398096 7f976cff9700 -1 librbd::image::OpenRequest: failed to refresh image: (2) No such file or directory
> 2018-10-05 20:55:17.398659 7f976cff9700 -1 librbd::io::AioCompletion: 0x7f978660c240 fail: (2) No such file or directory
> 2018-10-05 20:55:30.416174 7f976cff9700 -1 librbd::io::AioCompletion: 0x7f9786cd5ee0 fail: (2) No such file or directory
> 2018-10-05 20:55:34.083188 7f976d7fa700 -1 librbd::object_map::RefreshRequest: failed to load object map: rbd_object_map.b18d634146825.0000000000002d8f
> 2018-10-05 20:55:34.084101 7f976cff9700 -1 librbd::object_map::InvalidateRequest: 0x7f97544d11e0 should_complete: r=0
> 2018-10-05 20:55:38.597014 7f976d7fa700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or directory
> 2018-10-05 20:55:38.597109 7f976cff9700 -1 librbd::io::AioCompletion: 0x7f9786d3a7c0 fail: (2) No such file or directory
> 2018-10-05 20:55:51.584101 7f976d7fa700 -1 librbd::object_map::RefreshRequest: failed to load object map: rbd_object_map.c447c403109b2.0000000000006a04
> 2018-10-05 20:55:51.592616 7f976cff9700 -1 librbd::object_map::InvalidateRequest: 0x7f975409fee0 should_complete: r=0
> 2018-10-05 20:55:59.414229 7f976d7fa700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or directory
> 2018-10-05 20:55:59.414321 7f976cff9700 -1 librbd::io::AioCompletion: 0x7f9786df0760 fail: (2) No such file or directory
> 2018-10-05 20:56:09.029179 7f976d7fa700 -1 librbd::object_map::RefreshRequest: failed to load object map: rbd_object_map.9b28e148b97af.0000000000006a09
> 2018-10-05 20:56:09.035212 7f976cff9700 -1 librbd::object_map::InvalidateRequest: 0x7f9754644030 should_complete: r=0
> 2018-10-05 20:56:09.036087 7f976d7fa700 -1 librbd::object_map::RefreshRequest: failed to load object map: rbd_object_map.9b28e148b97af.0000000000006a0a
> 2018-10-05 20:56:09.042200 7f976cff9700 -1 librbd::object_map::InvalidateRequest: 0x7f97541d2c10 should_complete: r=0
>    6544   22993 1380784
>
> # rbd du
> warning: fast-diff map is invalid for -1037424/950f705d-d575-11e7-acf6-0242ac114406@-1037424/d2600c5e-d83a-11e7-acf6-0242ac114406. operation may be slow.
> warning: fast-diff map is not enabled for -01f5fda8-e57d-11e7-a428-0242ac110705. operation may be slow.
> warning: fast-diff map is not enabled for -069b999b-76b7-11e7-9738-0242ac110704. operation may be slow.
> warning: fast-diff map is not enabled for -19576951-36ad-11e8-9dc3-0242ac11090d. operation may be slow.
> warning: fast-diff map is not enabled for -f519bcc3-3515-11e8-9dc3-0242ac11090d. operation may be slow.
> warning: fast-diff map is not enabled for -f8031a79-1a19-11e8-bf57-0242ac11180a. operation may be slow.
> warning: fast-diff map is invalid for -875915ce-a156-11e6-9216-000f533054e0@8f003d3a-82be-11e7-a90c-0242ac110704. operation may be slow.
> warning: fast-diff map is invalid for -aaf6ca96-9548-11e6-a7f8-000f53304d81@/1b3081ea-af70-11e6-9216-000f533054e0. operation may be slow.
> warning: fast-diff map is invalid for -207a435f-569d-11e7-aad7-0242ac110405@3a300bdb-3df5-11e8-9b2b-0242ac116704. operation may be slow.
> warning: fast-diff map is invalid for 6aa9d531-57e0-11e7-89c0-0242ac110704@e5406f7f-c37b-11e8-acc7-0a58ac14d11f. operation may be slow.
> warning: fast-diff map is invalid for 6aa9d531-57e0-11e7-89c0-0242ac110704@f2b06990-c444-11e8-acc7-0a58ac14d11f. operation may be slow.
> warning: fast-diff map is invalid for 6aa9d531-57e0-11e7-89c0-0242ac110704@1df20bdc-c50e-11e8-acc7-0a58ac14d11f. operation may be slow.
>
> # rbd info rbd/950f705d-d575-11e7-acf6-0242ac114406@-d2600c5e-d83a-11e7-acf6-0242ac114406
> rbd image '950f705d-d575-11e7-acf6-0242ac114406':
> size 51200 MB in 12800 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.3067d01131bc6e
> format: 2
> features: layering, striping, exclusive-lock, object-map, fast-diff, deep-flatten
> flags: object map invalid, fast diff invalid.
> protected: True
> stripe unit: 4096 kB
> stripe count: 1
>
>



-- 
Jason

