[ceph-users] why sudden (and brief) HEALTH_ERR

lists lists at merit.unu.edu
Wed Oct 4 00:59:05 PDT 2017


OK, thanks for the feedback, Piotr and Dan!

MJ

On 4-10-2017 9:38, Dan van der Ster wrote:
>> Since Jewel (AFAIR), when (re)starting OSDs, pg status is reset to "never
>> contacted", resulting in "pgs are stuck inactive for more than 300 seconds"
>> being reported until osds regain connections between themselves.
>>
> 
> Also, the last_active state isn't updated very regularly, as far as I can tell.
> On our cluster I have increased this timeout:
> 
> --mon_pg_stuck_threshold: 1800
> 
> (which helps suppress these bogus HEALTH_ERRs)
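
For anyone wanting to keep the quoted setting across monitor restarts, a minimal sketch of a ceph.conf fragment follows. The [global] placement is an assumption, not something stated in the thread, and 1800 is simply the value Dan quotes; the warning text quoted above suggests the threshold was 300 seconds before the change.

```ini
[global]
# Assumption: placed in [global] so every daemon that evaluates PG health sees it.
# Seconds a PG may go without being active before it is reported as stuck.
# 1800 is the value quoted above; pick one that suits your cluster.
mon_pg_stuck_threshold = 1800
```

A larger threshold only delays the report, so PGs that really are stuck will also be flagged later.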


