Lots of good info there, thank you!  I tend to get options fatigue when trying to pick out a new system.  This should help narrow that focus greatly.  




Grafana <https://grafana.com/>  is the web frontend for creating the graphs.


InfluxDB <https://www.influxdata.com/time-series-platform/influxdb/>  holds the time series data that Grafana pulls from.


To collect data, I am using collectd <https://collectd.org/wiki/index.php/Plugin:Ceph> daemons running on each Ceph node (mon, mds, osd), as this was my initial way of ingesting metrics.

I am also now using the influx plugin in ceph-mgr <http://docs.ceph.com/docs/luminous/mgr/influx/>  to have ceph-mgr directly report statistics to InfluxDB.


Two other popular methods of collecting data are Telegraf <https://www.influxdata.com/time-series-platform/telegraf/> and Prometheus <https://prometheus.io/>, both of which also have ceph-mgr plugins, documented here <http://docs.ceph.com/docs/mimic/mgr/telegraf/> and here <http://docs.ceph.com/docs/luminous/mgr/prometheus/>.

InfluxData also has a Grafana-like graphing front end, Chronograf <https://www.influxdata.com/time-series-platform/chronograf/>, which some prefer to Grafana.


Hopefully that's enough to get you headed in the right direction.

I would recommend not going down the collectd path, as the project doesn't move as quickly as Telegraf and Prometheus, and the majority of the metrics I pull these days are provided by the ceph-mgr plugin.


Hope that helps,


On Mar 20, 2019, at 11:30 AM, Brent Kennedy <bkennedy at cfl.rr.com> wrote:


Reed:  If you don’t mind me asking, what was the graphing tool you had in the post?  I am using the ceph health web panel right now but it doesn’t go that deep.





