[ceph-users] Ceph mgr Prometheus plugin: error when osd is down

Gökhan Kocak goekhan.kocak at cloudandheat.com
Wed Nov 14 07:32:39 PST 2018


Hello everyone,

We encountered an error with the Prometheus plugin for Ceph mgr.
One OSD was down and (therefore) had no device class:
```
sudo ceph osd tree
ID  CLASS WEIGHT    TYPE NAME          STATUS REWEIGHT PRI-AFF
 28   hdd   7.27539             osd.28     up  1.00000 1.00000
  6               0 osd.6                down        0 1.00000

```
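
To find such OSDs up front, the tree could be scanned for entries without a class. This is only a minimal sketch: it assumes the tree has already been loaded into a dict (for example from the JSON form of `ceph osd tree`), and the `nodes` list and the field names `type`, `name` and `device_class` are assumptions about that format, not verified against this cluster. The sample data is made up to mirror the two rows above.

```python
def osds_without_class(tree):
    """Return the names of OSD nodes that have no device class set."""
    return [
        node['name']
        for node in tree.get('nodes', [])
        if node.get('type') == 'osd' and not node.get('device_class')
    ]


# Hypothetical sample data mirroring the two rows shown above.
sample_tree = {
    'nodes': [
        {'id': 28, 'name': 'osd.28', 'type': 'osd',
         'device_class': 'hdd', 'status': 'up'},
        {'id': 6, 'name': 'osd.6', 'type': 'osd', 'status': 'down'},
    ]
}

missing = osds_without_class(sample_tree)
print(missing)
```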

When we tried to curl the metrics endpoint, the request failed with an
HTTP 500 error because the OSD had no class (see "KeyError: 'class'" in
the traceback below).

Has anybody else experienced this?

Isn't this a bug in the Prometheus plugin? In my opinion, the plugin should not stop working when an OSD is down.

```
~> curl -v 127.0.0.1:9283/metrics
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 9283 (#0)
> GET /metrics HTTP/1.1
> Host: 127.0.0.1:9283
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Date: Wed, 14 Nov 2018 13:59:59 GMT
< Content-Length: 1663
< Content-Type: text/html;charset=utf-8
< Server: CherryPy/3.5.0
<
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html;
charset=utf-8"></meta>
    <title>500 Internal Server Error</title>
    <style type="text/css">
    #powered_by {
        margin-top: 20px;
        border-top: 2px solid black;
        font-style: italic;
    }

    #traceback {
        color: red;
    }
    </style>
</head>
    <body>
        <h2>500 Internal Server Error</h2>
        <p>The server encountered an unexpected condition which
prevented it from fulfilling the request.</p>
        <pre id="traceback">Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 414, in metrics
    metrics = global_instance().collect()
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 351, in collect
    self.get_metadata_and_osd_status()
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 310, in get_metadata_and_osd_status
    dev_class['class'],
KeyError: 'class'
</pre>
    <div id="powered_by">
      <span>
        Powered by <a href="http://www.cherrypy.org">CherryPy 3.5.0</a>
      </span>
    </div>
    </body>
</html>
* Connection #0 to host 127.0.0.1 left intact
```
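
The traceback points at a plain `dev_class['class']` lookup in `get_metadata_and_osd_status`. Here is a minimal sketch of a more tolerant lookup. It assumes the entry is a dict (or missing altogether) and that an empty label is acceptable for an OSD without a class. The helper `device_class_of` and the sample data are hypothetical and not taken from module.py.

```python
def device_class_of(dev_class, default=''):
    """Return the device class, or `default` when the OSD has none.

    Hypothetical stand-in for the failing lookup; not the plugin's code.
    """
    if not dev_class:
        # Covers both a missing entry (None) and an empty dict.
        return default
    return dev_class.get('class', default)


# Assumed shapes: one entry with a class, one without (like osd.6).
osd_devices = [
    {'id': 28, 'class': 'hdd'},
    {'id': 6},
]

labels = {dev['id']: device_class_of(dev) for dev in osd_devices}
print(labels)
```

Whether an empty label or skipping the OSD entirely is the better behaviour is a design decision for the plugin; either way the metrics endpoint would keep responding.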

Kind regards,

Gökhan


