[ceph-users] Troubleshooting hanging storage backend whenever there is any cluster change
Burkhard.Linke at computational.bio.uni-giessen.de
Fri Oct 12 05:03:06 PDT 2018
On 10/12/2018 01:55 PM, Nils Fahldieck - Profihost AG wrote:
> I rebooted a Ceph host and logged `ceph status` & `ceph health detail`
> every 5 seconds. During this I encountered 'PG_AVAILABILITY Reduced data
> availability: pgs peering'. At the same time some VMs hung as described
Just a wild guess... you have 71 OSDs and about 4,500 PGs with size=3.
That's 13,500 PG instances overall, resulting in roughly 190 PGs per OSD
during normal operation.

If one host is down and its PGs have to re-peer, you might hit the
limit of 200 PGs per OSD on some of the remaining OSDs, resulting in
stuck peering.

You can try to raise this limit. There are several threads on the
mailing list about this.
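For reference, a sketch of how the limit can be inspected and raised on a
Luminous-or-later cluster via the mon_max_pg_per_osd option; verify the
option names and defaults against your release's documentation before
applying:

```shell
# Inspect the current effective limit via a monitor's admin socket
# (replace mon.a with your monitor's id).
ceph daemon mon.a config get mon_max_pg_per_osd

# Raise the limit at runtime on all monitors; injectargs changes do not
# persist across daemon restarts.
ceph tell mon.* injectargs '--mon_max_pg_per_osd 300'

# To persist the change, add it to ceph.conf on the monitor nodes:
# [global]
# mon_max_pg_per_osd = 300
```

Note that raising the limit only papers over the symptom; reducing the PG
count per OSD (or adding OSDs) is the longer-term fix.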
Dr. rer. nat. Burkhard Linke
Bioinformatics and Systems Biology
35392 Giessen, Germany
Phone: (+49) (0)641 9935810