[ceph-users] Undersized fix for small cluster, other than adding a 4th node?

Ronny Aasen ronny+ceph-users at aasen.cx
Fri Nov 10 00:28:40 PST 2017


On 09. nov. 2017 22:52, Marc Roos wrote:
>   
> I added an erasure k=3,m=2 coded pool on a 3 node test cluster and am
> getting these errors.
> 
>     pg 48.0 is stuck undersized for 23867.000000, current state
> active+undersized+degraded, last acting [9,13,2147483647,7,2147483647]
>      pg 48.1 is stuck undersized for 27479.944212, current state
> active+undersized+degraded, last acting [12,1,2147483647,8,2147483647]
>      pg 48.2 is stuck undersized for 27479.944514, current state
> active+undersized+degraded, last acting [12,1,2147483647,3,2147483647]
>      pg 48.3 is stuck undersized for 27479.943845, current state
> active+undersized+degraded, last acting [11,0,2147483647,2147483647,5]
>      pg 48.4 is stuck undersized for 27479.947473, current state
> active+undersized+degraded, last acting [8,4,2147483647,2147483647,5]
>      pg 48.5 is stuck undersized for 27479.940289, current state
> active+undersized+degraded, last acting [6,5,11,2147483647,2147483647]
>      pg 48.6 is stuck undersized for 27479.947125, current state
> active+undersized+degraded, last acting [5,8,2147483647,1,2147483647]
>      pg 48.7 is stuck undersized for 23866.977708, current state
> active+undersized+degraded, last acting [13,11,2147483647,0,2147483647]
> 
> In the thread mentioned here
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-May/009572.html
> the problem was resolved by adding an extra node. I have already
> changed min_size to 3. Or should I change to k=2,m=2, but do I still
> get good storage savings then? How do you calculate the storage
> savings of an erasure-coded pool?


the minimum number of nodes for a cluster is k+m, and with that you have 
no spare node left in the failure domain to recover onto. IOW, if a node 
fails your cluster is degraded and cannot heal itself. (that is also what 
the 2147483647 entries in "last acting" mean: CRUSH could not find an OSD 
for those shards.)

having ceph heal itself after a failure is one of the best things about 
ceph. so when choosing how many nodes to put in your cluster, you need 
to think: k + m + how many node failures do I want to tolerate without 
stressing = minimum number of nodes
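
for example, using the k=3,m=2 profile from your mail and assuming you 
want to tolerate one node failure without stressing:

   3 + 2 + 1 = 6 nodes minimum

that way a failed node's shards can be rebuilt on the remaining nodes 
while every PG still has 5 separate hosts to map to.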


basically, with a 3-node cluster you can either run 3x replication or 
k=2, m=1 erasure coding.
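
if you go the k=2,m=1 route, a rough sketch of the commands would be 
something like this (the profile name "ec-21" and pool name "ecpool" are 
just example names, and on pre-luminous releases the failure-domain 
option is spelled ruleset-failure-domain instead):

   # one shard per host: 2 data + 1 coding
   ceph osd erasure-code-profile set ec-21 k=2 m=1 crush-failure-domain=host
   # create a pool using that profile (pg numbers are only an example)
   ceph osd pool create ecpool 64 64 erasure ec-21
   # check what you actually ended up with
   ceph osd erasure-code-profile get ec-21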



to see how the space savings work out you can read
http://ceph.com/geen-categorie/ceph-erasure-coding-overhead-in-a-nutshell/
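
the short version is that usable space is roughly k/(k+m) of the raw 
space (assuming equal-sized hosts and ignoring a bit of overhead):

   k=3,m=2        : 3/5 = 60% usable  (1.67x raw per TB stored)
   k=2,m=1        : 2/3 = 67% usable  (1.50x raw per TB stored)
   3x replication : 1/3 = 33% usable  (3.00x raw per TB stored)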


kind regards
Ronny Aasen

