[ceph-users] What goes in the monitor database?

Joao Eduardo Luis joao at suse.de
Sat Nov 4 14:42:28 PDT 2017

On Sat, 2017-11-04 at 20:35 +0000, Bryan Henderson wrote:
> Hi.  Can anyone give me a rough idea of what the monitor database is
> for?

The monitor k/v store is where we keep the cluster maps and other
relevant data.

These maps record the cluster state over time, and are critical for the
system to function properly.

> I have a single monitor and a single client that just connects once a second
> and does a "status" command.  There is one OSD and one MDS in there too.  This
> is a Hammer system with a LevelDB key-value store.  This produces a fair
> amount of activity in the database; it looks like about 25K of updates for
> every "status" transaction.  The database compacts periodically and, over the
> long run, does not grow in size.

Provided the cluster is healthy, the store will not keep more than a
predefined number of maps (configurable through a few options). Older
maps are trimmed regularly as new maps are created and the count goes
over that threshold.
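
As a rough illustration of that trimming, here is a minimal Python
sketch. It is not the monitor's actual code: the class MapStore, the
max_versions parameter and the value 500 are hypothetical names and
numbers made up for this example. It only shows why the number of
stored maps stays bounded while new maps keep being written.

```python
class MapStore:
    """Toy versioned store that keeps at most `max_versions` map epochs."""

    def __init__(self, max_versions):
        self.max_versions = max_versions  # hypothetical trim threshold
        self.versions = {}                # epoch -> serialized map
        self.first = 1                    # oldest epoch still stored
        self.last = 0                     # newest epoch stored

    def commit(self, blob):
        """Store a new map version, then trim anything over the threshold."""
        self.last += 1
        self.versions[self.last] = blob
        self.trim()
        return self.last

    def trim(self):
        """Drop the oldest versions until the count fits the threshold."""
        while self.last - self.first + 1 > self.max_versions:
            del self.versions[self.first]
            self.first += 1


store = MapStore(max_versions=500)
for epoch in range(2000):
    store.commit(b"map-%d" % epoch)

# Every commit writes to the store, but the number of keys stays bounded.
print(len(store.versions), store.first, store.last)
```

Each commit still produces write activity, which is consistent with a
store that is busy but does not grow.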

This means that, as long as your cluster is healthy, the store will not
grow in number of keys. However, as you add more OSDs, pools, etc. to
the cluster, the size of each map will increase, so the store size will
ultimately grow as well. In your case, though, I would not expect your
deployment's store to grow, and what you're seeing is expected
behavior.

> Using ceph_kvstore_tool after shutting down the monitor, I see hundreds of
> keys.

Those are the maps I mentioned. You are certainly seeing things like
osdmap, mdsmap, monmap, logm, authm, alongside various other key
prefixes and a few different key suffixes for each.
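
To get a feel for how those hundreds of keys are distributed, one could
count them per prefix. The Python sketch below assumes you already have
the keys as (prefix, key) pairs; how you obtain them from your tool's
output is left out, and the sample keys are invented for illustration
rather than copied from a real store.

```python
from collections import Counter


def count_by_prefix(pairs):
    """Return a Counter mapping each key prefix to its number of keys."""
    return Counter(prefix for prefix, _key in pairs)


# Invented sample data; a real store would have many more entries.
sample = [
    ("osdmap", "1"),
    ("osdmap", "2"),
    ("osdmap", "last_committed"),
    ("mdsmap", "1"),
    ("monmap", "1"),
    ("logm", "1"),
    ("logm", "2"),
]

counts = count_by_prefix(sample)
for prefix, n in sorted(counts.items()):
    print(prefix, n)
```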

> So what does the monitor have to store to do a "status" command?

At some point we added the audit log facilities to the monitors. I
can't recall whether that was in hammer or some later version, but that
would certainly persist the command invocation to the store (and across
the monitor quorum).

However, other components of Ceph (OSDs come to mind) also regularly
send updates to the monitors, and the monitors generate new maps based
on those updates.

> I've seen clues that the activity has to do with Paxos elections, but I'm
> fuzzy on why elections would be happening or why they would need a persistent
> database.

The activity has to do with Paxos proposals. Basically, all monitors
need to hold the same information; we achieve that via Paxos. Whenever
we persist data, we do it through Paxos (except in very specific cases
where the data is specific to a single monitor).
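
The idea that a value is only persisted once enough monitors agree can
be sketched in Python as below. This is a heavy simplification and not
real Paxos (no proposal numbers, no leader, no recovery); Monitor,
propose and the "reachable" set are hypothetical names used only for
this illustration of majority agreement.

```python
class Monitor:
    """Toy monitor with an ordered list of committed values."""

    def __init__(self, name):
        self.name = name
        self.store = []  # committed values, in commit order


def propose(monitors, reachable, value):
    """Commit `value` only if a majority of monitors can acknowledge it."""
    acks = [m for m in monitors if m.name in reachable]
    majority = len(monitors) // 2 + 1
    if len(acks) < majority:
        return False  # no majority: nothing is written anywhere
    for m in acks:
        m.store.append(value)
    return True


mons = [Monitor(n) for n in "abc"]
ok_with_two = propose(mons, {"a", "b"}, "map v2")  # 2 of 3 reachable
ok_with_one = propose(mons, {"a"}, "map v3")       # 1 of 3 reachable
print(ok_with_two, ok_with_one)

# With a single monitor, the majority is that one monitor.
single = [Monitor("a")]
ok_single = propose(single, {"a"}, "map v2")
print(ok_single, single[0].store)
```

Note that even in the single-monitor case every accepted value is
written to that monitor's store.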

If, on the other hand, you are frequently seeing elections (not Paxos
proposals), then your monitors may be having trouble holding a quorum.

