[ceph-users] safe to remove leftover bucket index objects

Wido den Hollander wido at 42on.com
Mon Oct 22 02:59:36 PDT 2018



On 8/31/18 5:31 PM, Dan van der Ster wrote:
> So it sounds like you tried what I was going to do, and it broke
> things. Good to know... thanks.
> 
> In our case, what triggered the extra index objects was a user running
> PUT /bucketname/ around 20 million times -- this apparently recreates
> the index objects.
> 

I'm asking the same!

Large omap object found. Object:
6:199f36b7:::.dir.ea087a7e-cb26-420f-9717-a98080b0623c.134167.15.1:head
Key count: 5374754 Size (bytes): 1366279268

In this case I can't find '134167.15.1' in any of the buckets when I do:

for BUCKET in $(radosgw-admin metadata list bucket | jq -r '.[]'); do
    radosgw-admin metadata get "bucket:$BUCKET" > "bucket.$BUCKET"
done

If I grep through all the bucket.* files this object isn't showing up
anywhere.
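
A minimal sketch of that check, for anyone repeating it. The function names are made up for illustration. It assumes index objects are named either ".dir.<id>" or ".dir.<id>.<shard>", so it tries both readings of the name, and it assumes the ID appears in the JSON dumps as a complete quoted string. It only reads the bucket.* files written by the loop above and never touches the pool.

```shell
# Print the IDs an index object name could refer to (hypothetical helper).
# Reading 1: everything after ".dir." is the bucket ID / marker.
# Reading 2: the last dotted field is a shard number, the rest is the ID.
candidate_ids() {
    obj=${1#.dir.}                      # drop the ".dir." prefix
    printf '%s\n' "$obj" "${obj%.*}"
}

# Print the dump files (bucket.* in directory $2, default ".") that mention
# object $1 under either reading. Prints nothing if no dump references it.
find_index_owner() {
    candidate_ids "$1" | while IFS= read -r id; do
        # Match the ID as a whole quoted JSON string, not as a substring.
        grep -lF -- "\"$id\"" "${2:-.}"/bucket.* 2>/dev/null
    done | sort -u
}
```

For the object above that would be `find_index_owner .dir.ea087a7e-cb26-420f-9717-a98080b0623c.134167.15.1`, run from the directory holding the dumps. An empty result only means none of the dumped bucket entries reference the object; going by David's report quoted below, that alone does not show the object is safe to remove.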

Before I remove the object I want to make sure that it's safe to delete it.

A garbage collector for the bucket index pools would be great to have.

Wido

> -- dan
> 
> On Thu, Aug 30, 2018 at 7:20 PM David Turner <drakonstein at gmail.com> wrote:
>>
>> I'm glad you asked this, because it was on my to-do list. I know that an object not matching any bucket marker does not mean it's safe to delete. I have an index pool with 22k objects in it, of which 70 match existing bucket markers. I was having a problem on the cluster and started deleting objects in the index pool; after going through 200 objects I stopped, tested, and found I had lost access to 3 buckets. Luckily for me they were all buckets I had been working on deleting, so no recovery was needed.
>>
>> I then compared bucket IDs to the objects in that pool, but still only found a couple hundred more matching objects. I have no idea what the other 22k objects in the index pool are that don't match bucket markers or bucket IDs. I did confirm there was no resharding happening, both in the reshard list and in all bucket reshard statuses.
>>
>> Does anyone know how to parse the names of these objects and how to tell what can be deleted?  This is of particular interest as I have another cluster with 1M objects in the index pool.
>>
>> On Thu, Aug 30, 2018, 7:29 AM Dan van der Ster <dan at vanderster.com> wrote:
>>>
>>> Replying to self...
>>>
>>> On Wed, Aug 1, 2018 at 11:56 AM Dan van der Ster <dan at vanderster.com> wrote:
>>>>
>>>> Dear rgw friends,
>>>>
>>>> Somehow we have more than 20 million objects in our
>>>> default.rgw.buckets.index pool.
>>>> They are probably leftover from this issue we had last year:
>>>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018565.html
>>>> and we want to clean the leftover / unused index objects
>>>>
>>>> To do this, I would rados ls the pool, get a list of all existing
>>>> buckets and their current marker, then delete any objects with an
>>>> unused marker.
>>>> Does that sound correct?
>>>
>>> More precisely, for example, there is an object
>>> .dir.61c59385-085d-4caa-9070-63a3868dccb6.2978181.59.8 in the index
>>> pool.
>>> I run `radosgw-admin bucket stats` to get the marker for all current
>>> existing buckets.
>>> The marker 61c59385-085d-4caa-9070-63a3868dccb6.2978181.59 is not
>>> mentioned in the bucket stats output.
>>> Is it safe to rados rm .dir.61c59385-085d-4caa-9070-63a3868dccb6.2978181.59.8 ??
>>>
>>> Thanks in advance!
>>>
>>> -- dan
>>>
>>>> Can someone suggest a better way?
>>>>
>>>> Cheers, Dan
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users at lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

