CephFS vs GlusterFS

As an infrastructure engineer on the cloud platform development team, I have worked with many distributed storage systems, including the two named in the title. I believe I have a fair understanding of their strengths and weaknesses, and I will try to share my thoughts on the matter. So to speak, let's see whose hash function is longer.

Disclaimer: Earlier on this blog you may have seen articles about GlusterFS. I had nothing to do with those articles. This is the blog of our cloud project team, and each of its members can tell their own story. Those articles were written by an engineer from our operations group, who has his own tasks and his own experience, which he shared. Please keep this in mind if you notice a difference of opinion. I would like to take this opportunity to send my best regards to the author of those articles!

What will be discussed

Let's talk about the file systems that you can build with GlusterFS and CephFS. We will discuss the architecture of the two systems, look at them from different angles, and at the end I will even venture to draw some conclusions. Other Ceph features, such as RBD and RGW, will not be covered.


To make the article complete and understandable for everyone, let's start with the basic terminology of both systems:

RADOS (Reliable Autonomic Distributed Object Store) is a self-sufficient object store that forms the backbone of the Ceph project.

CephFS, RBD (RADOS Block Device), and RGW (RADOS Gateway) are higher-level layers on top of RADOS that give end users different interfaces to it. Specifically, CephFS provides a POSIX-compatible file system interface. The CephFS data itself is stored in RADOS.

OSD (Object Storage Daemon) is a process that serves a single disk (object store) in a RADOS cluster.

RADOS Pool - a group of OSDs united by a common set of rules, such as a replication policy. From the data hierarchy perspective, a pool is a directory, or rather a separate flat namespace (no subdirectories) for objects.

PG (Placement Group) - I will introduce the concept of PG a little later, in context, for better understanding.
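To make the "flat namespace" idea a little more concrete, here is a minimal Python sketch. It is only a toy model: the `Pool` class, its fields, the pool name, and the object names are all hypothetical and have nothing to do with Ceph's real API. It shows that an object name containing a slash is just an opaque key, and that a pool carries its own rules, such as a replica count.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pool:
    """Toy stand-in for a pool: one flat object namespace plus a replication rule."""
    name: str
    replicas: int  # replication policy; the value used below is an assumption
    objects: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        # The key is an opaque string: "dir/file.0" does not create a directory "dir".
        self.objects[key] = data

    def list_prefix(self, prefix: str) -> List[str]:
        # There is no directory listing, only filtering over the single flat key space.
        return sorted(k for k in self.objects if k.startswith(prefix))


pool = Pool(name="example_pool", replicas=3)
pool.put("dir/file.0", b"first chunk")
pool.put("dir/file.1", b"second chunk")
pool.put("other", b"x")

print(pool.list_prefix("dir/"))
print(len(pool.objects))
```

The point of the sketch is that anything resembling a directory tree has to be built on top of such a namespace by a higher layer, which is the role CephFS plays over RADOS.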

Since RADOS is the foundation on which CephFS is built, I will often talk about it and this will automatically apply to CephFS.

GlusterFS terminology (hereinafter "gl"):

Brick is a process that serves a single disk, the analogue of an OSD in RADOS terminology.

Volume - the volume into which bricks are combined. A volume is the analogue of a pool in RADOS. It also has a specific replication topology between its bricks.

Data distribution

To make this clearer, let's look at a simple example that can be implemented in both systems.

An example setup:

