[ceph-users] safe to remove leftover bucket index objects

Wido den Hollander wido at 42on.com
Mon Oct 22 02:59:36 PDT 2018

On 8/31/18 5:31 PM, Dan van der Ster wrote:
> So it sounds like you tried what I was going to do, and it broke
> things. Good to know... thanks.
> In our case, what triggered the extra index objects was a user running
> PUT /bucketname/ around 20 million times -- this apparently recreates
> the index objects.

I'm asking the same!

Large omap object found. Object:
Key count: 5374754 Size (bytes): 1366279268

In this case I can't find '134167.15.1' in any of the buckets when I do:

for BUCKET in $(radosgw-admin metadata list bucket | jq -r '.[]'); do
    radosgw-admin metadata get bucket:"$BUCKET" > "bucket.$BUCKET"
done

If I grep through all the bucket.* files, this object isn't showing up.
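
A minimal sketch of that grep step, assuming the bucket.* dump files written by the loop above are in the current directory. The function name find_id_in_dumps is made up for illustration; the ID is the one from the large omap warning.

```shell
# Print the names of the dump files that mention the given ID as a literal
# string. Returns non-zero (with a note on stderr) when no file mentions it.
find_id_in_dumps() {
    id="$1"; shift
    grep -lF -- "$id" "$@" || { echo "no dump mentions $id" >&2; return 1; }
}

# Example call (not run here):
#   find_id_in_dumps 134167.15.1 bucket.*
```

Using -F keeps the dots in the ID from being treated as regex wildcards, which a plain grep pattern would do.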

Before I remove the object I want to make sure that it's safe to delete it.

A garbage collector for the bucket index pools would be great to have.
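
Until something like that exists, a dry-run listing of candidates could be sketched as below. This is only an illustration under assumptions: index objects are taken to be named .dir.<id> with an optional trailing .<shard> (inferred from Dan's example further down), the file of known markers and bucket IDs is a hypothetical input you would have to collect yourself, and the function name is made up. It only prints names and never deletes anything. As David reports below, an object that matches no marker is not necessarily safe to remove.

```shell
# stdin: index object names, one per line, for example from
#        rados -p default.rgw.buckets.index ls
# $1   : hypothetical file with one known marker or bucket ID per line
# Prints every .dir.* object whose ID matches no known marker or bucket ID.
list_unmatched_index_objects() {
    awk -v known="$1" '
        BEGIN { while ((getline line < known) > 0) seen[line] = 1 }
        /^\.dir\./ {
            id = substr($0, 6)             # drop the leading ".dir."
            base = id
            sub(/\.[0-9]+$/, "", base)     # drop a trailing ".<shard>", if any
            # Check both forms, since an unsharded index has no shard suffix.
            if (!(id in seen) && !(base in seen)) print
        }'
}

# Example call (not run here):
#   rados -p default.rgw.buckets.index ls | list_unmatched_index_objects known_ids.txt
```

The output is a list to investigate further, not a list to feed into rados rm.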


> -- dan
> On Thu, Aug 30, 2018 at 7:20 PM David Turner <drakonstein at gmail.com> wrote:
>> I'm glad you asked this, because it was on my to-do list. I know that an object not matching any existing bucket marker does not mean it's safe to delete. I have an index pool with 22k objects in it. 70 objects match existing bucket markers. I was having a problem on the cluster and started deleting the objects in the index pool; after going through 200 objects I stopped, tested, and found I had lost access to 3 buckets. Luckily for me they were all buckets I've been working on deleting, so no need for recovery.
>> I then compared bucket IDs to the objects in that pool, but still only found a couple hundred more matching objects. I have no idea what the other 22k objects are in the index bucket that don't match bucket markers or bucket IDs. I did confirm there was no resharding happening, both in the reshard list and in all bucket reshard statuses.
>> Does anyone know how to parse the names of these objects and how to tell what can be deleted?  This is of particular interest as I have another cluster with 1M objects in the index pool.
>> On Thu, Aug 30, 2018, 7:29 AM Dan van der Ster <dan at vanderster.com> wrote:
>>> Replying to self...
>>> On Wed, Aug 1, 2018 at 11:56 AM Dan van der Ster <dan at vanderster.com> wrote:
>>>> Dear rgw friends,
>>>> Somehow we have more than 20 million objects in our
>>>> default.rgw.buckets.index pool.
>>>> They are probably leftover from this issue we had last year:
>>>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018565.html
>>>> and we want to clean the leftover / unused index objects
>>>> To do this, I would rados ls the pool, get a list of all existing
>>>> buckets and their current marker, then delete any objects with an
>>>> unused marker.
>>>> Does that sound correct?
>>> More precisely, for example, there is an object
>>> .dir.61c59385-085d-4caa-9070-63a3868dccb6.2978181.59.8 in the index
>>> pool.
>>> I run `radosgw-admin bucket stats` to get the marker for all current
>>> existing buckets.
>>> The marker 61c59385-085d-4caa-9070-63a3868dccb6.2978181.59 is not
>>> mentioned in the bucket stats output.
>>> Is it safe to rados rm .dir.61c59385-085d-4caa-9070-63a3868dccb6.2978181.59.8 ??
>>> Thanks in advance!
>>> -- dan
>>>> Can someone suggest a better way?
>>>> Cheers, Dan
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users at lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
