How to Deploy a Ceph Cluster on Kubernetes With Rook

From the desk of a brilliant weirdo #2:

Welcome to the ultimate Rook and Ceph survival guide.

Here’s the agenda for this article:

  • Introduction
  • Prerequisites
  • Steps to be performed
  • Common issues encountered
  • Conclusion
  • References

Introduction

Kubernetes was originally used to manage containers hosting applications that needed only a small amount of storage. As it has grown, Kubernetes has graduated to handling more complex workloads.

A newer trend is to provision storage capacity for pods directly, bringing the traditional benefits of cloud data storage to containerized applications.

Let’s explore the terminology:

What is Kubernetes?

Containers in Kubernetes are stateless at their core, but data must still be preserved, managed, and made accessible to other services.

The word stateless means that the container runs in isolation without any information about previous transactions, making the distribution, deletion, or replacement of the container easier.

Data, however, might be lost during life-cycle events such as deletion or restart.

What is Rook?

Rook is an open-source, cloud-native storage orchestrator for Kubernetes clusters.

Rook uses the open-source Ceph scale-out storage platform together with Kubernetes to provide a dynamic, high-performance storage environment for scaling storage workloads.

Rook also solves Kubernetes storage challenges within the infrastructure by extending Kubernetes with custom types and controllers.

It automates deployment, scaling, upgrading, migration, monitoring, and resource management.

It is a framework for storage providers to integrate their solutions into cloud-native environments.

What is Ceph?

Ceph is open-source software that provides highly scalable object, block, and file-based storage under one unified system. Ceph clusters are designed to run on commodity hardware with the help of an algorithm called CRUSH (Controlled Replication Under Scalable Hashing).

Prerequisites

Before we dive further, we need the following prerequisites:

  • A Kubernetes cluster with 4 nodes: 1 master and 3 workers. Each node is an Ubuntu 18.04 server with at least 4 GB of RAM. In this guide we are using a cluster created on DigitalOcean with the official kubeadm tool.
  • The kubectl command-line tool installed on a development server and configured to connect to our cluster.
  • A DigitalOcean block storage volume with at least 100 GB for each node of the cluster.

Step 1 — Setting up Rook

After completing the prerequisites, we have a fully functional Kubernetes cluster with three worker nodes and as many block storage volumes.

We can now set up Rook.

In this step, we will do the following tasks:

  • Clone Rook repository
  • Deploy Rook operator on Kubernetes cluster
  • Validate the status of deployed Rook operator

First, we clone the GitHub repository so that we have all the resources needed to start setting up our Rook cluster:

git clone --single-branch --branch release-1.3 https://github.com/rook/rook.git

Note: We are using the release-1.3 branch, the most recent stable branch at the time of writing. You may switch to a newer release branch when one becomes available. You can check the branches of the Rook repository at https://github.com/rook/rook/branches.

There is a Kubernetes operator for each storage solution supported by Rook. In short, an operator is a process, running in a pod, that contains all the logic to manage a complex application, and is often used to manage stateful applications.

Now we enter the directory using the following command:

cd rook/cluster/examples/kubernetes/ceph

Next, we deploy the common Kubernetes config file with the following command. This file is available by default in our directory and creates the resources we need for our Rook deployment:

kubectl create -f common.yaml
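
As an optional sanity check, we can list the custom resource definitions that common.yaml registers; the exact set of CRDs may vary between Rook releases:

kubectl get crd | grep ceph.rook.io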

After creating the common resources, the next step is to create the Rook operator.

Before deploying the operator.yaml file, we need to change the CSI_RBD_GRPC_METRICS_PORT variable because our Kubernetes cluster already uses the standard port by default.

To do this, we open the file using the command given below:

nano operator.yaml

Now we will do the following things:

  • Search for CSI_RBD_GRPC_METRICS_PORT variable
  • Remove the # to uncomment this variable
  • Then change the value of port from 9001 to 9093

This is how our file should look after the changes:

kind: ConfigMap
apiVersion: v1
metadata:
  name: rook-ceph-operator-config
  namespace: rook-ceph
data:
  ROOK_CSI_ENABLE_CEPHFS: "true"
  ROOK_CSI_ENABLE_RBD: "true"
  ROOK_CSI_ENABLE_GRPC_METRICS: "true"
  CSI_ENABLE_SNAPSHOTTER: "true"
  CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true"
  ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "false"
  # Configure CSI CSI Ceph FS grpc and liveness metrics port
  # CSI_CEPHFS_GRPC_METRICS_PORT: "9091"
  # CSI_CEPHFS_LIVENESS_METRICS_PORT: "9081"
  # Configure CSI RBD grpc and liveness metrics port
  CSI_RBD_GRPC_METRICS_PORT: "9093"
  # CSI_RBD_LIVENESS_METRICS_PORT: "9080"

We now save the file and exit it.

Next, we deploy the operator that will be in charge of the setup and orchestration of a Ceph cluster:

kubectl create -f operator.yaml

Running the above command will generate the following output:

configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created

The operator takes a couple of seconds to be up and running. Its status can be verified with the following command:

kubectl get pod -n rook-ceph

We use the -n flag to get the pods of a specific namespace, which in our example is rook-ceph.

Once the operator deployment is ready, it will trigger the creation of the DaemonSets that are in charge of creating the rook-discover agents on each worker node of our cluster. We will receive output similar to the one below:

NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-operator-75d95cb868-s7m5z   1/1     Running   0          78s
rook-discover-44dq9                   1/1     Running   0          43s
rook-discover-7gxn7                   1/1     Running   0          43s
rook-discover-xfqmk                   1/1     Running   0          43s

Now, we have successfully installed Rook and deployed our first operator. Next, we will create a Ceph cluster and verify that it is working.

Step 2 — Creating a Ceph Cluster

Now that we have successfully set up Rook on our Kubernetes cluster, we are ready to create a Ceph cluster within it.

But before that, let’s take a look at some Ceph components that run as pods in our Kubernetes cluster so we can understand their functionality and why they are important.

The Rook architecture:

We have three main Ceph components: Monitors (MON), Managers (MGR), and Object Store Devices (OSD).

Monitors: Ceph Monitors, also called MONs, are responsible for storing the main copy of the cluster map, which allows the coordination of Ceph daemons. MONs are responsible for forming cluster quorums. All the cluster nodes report to the monitor nodes and share information about every change in their state.

Although we may run a cluster even with one monitor or MON, it is always good to have more than one MON running at all times for better reliability and availability. Also for an HA cluster, it is advised to have at least 3 MONs running.

Note: It is generally recommended to run an odd number of MONs, because an even number of monitors has lower resilience to failures than an odd number.

Managers: Ceph Managers, also called MGRs, run alongside our monitors (MONs) to provide extra monitoring. They are responsible for keeping track of runtime metrics and the current state of our cluster. We need a minimum of 2 MGRs for an HA cluster.

Object Store Devices: Ceph Object Store Devices, also called OSDs, are responsible for storing and handling data on a local file system. Usually one physical disk of our cluster is tied to one OSD. We need a minimum of 3 OSDs for an HA cluster.

All of the components mentioned above run in our Rook cluster and interact directly with the Rook agents.
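
Once the Ceph cluster from this step is up, we can see each of these components as pods by filtering on their labels. This is an optional check; the label values below follow common Rook conventions and may differ between releases:

kubectl -n rook-ceph get pods -l app=rook-ceph-mon
kubectl -n rook-ceph get pods -l app=rook-ceph-mgr
kubectl -n rook-ceph get pods -l app=rook-ceph-osd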

Now with all the basics done, let’s create our Ceph cluster.

First, we create a yaml file:

nano cephcluster.yaml

Then we need to add some specifications which define how the cluster will be configured:

We define the apiVersion of our cluster and the kind of the Kubernetes object, along with the name and namespace of the object.

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph

Next, we make an entry for the spec key, which tells Kubernetes which model to use while creating the cluster. For this, we specify the image version we want to use and whether or not we want to allow unsupported Ceph versions.

spec:
  cephVersion:
    image: ceph/ceph:v14.2
    allowUnsupported: false

After this, we add the dataDirHostPath to specify the path where we want to persist our configuration files; we must always specify this.

  dataDirHostPath: /var/lib/rook

Then we define whether we want to skip upgrade checks via skipUpgradeChecks, and whether we want to continue an upgrade even if the checks report the cluster as unhealthy via continueUpgradeAfterChecksEvenIfNotHealthy.

  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false

Then we define the number of Ceph Monitors (MONs) we want to use via the mon key. We also define whether or not we want to allow multiple MONs to be deployed per node.

  mon:
    count: 3
    allowMultiplePerNode: false

We set up the Ceph dashboard using the dashboard key. Here we can set various options, like enabling the dashboard, customizing the port it is served on, and defining a prefix URL in case we are behind a reverse proxy.

  dashboard:
    enabled: true
    ssl: true

We can enable monitoring of our cluster using the monitoring key; we set the enabled option to true or false depending on whether we want to monitor the cluster or not.

Note: We need to have Prometheus pre-installed for monitoring.

  monitoring:
    enabled: false
    rulesNamespace: rook-ceph

We can have RBD images mirrored asynchronously between two Ceph clusters using the rbdMirroring key. As we only have one cluster in this guide, this is not required, so we set the number of workers to 0.

Note: RBD is an acronym for RADOS (Reliable Autonomic Distributed Object Store) Block Device. RADOS sits at the core of Ceph storage clusters; this layer makes sure that stored data always remains consistent and performs data replication, failure detection, and recovery, among other tasks.

  rbdMirroring:
    workers: 0

Next, we add the storage key, which lets us define the options for cluster-level storage: for example, which devices and nodes we want to use, the size of the database, and the number of OSDs to create per device.

  storage:
    useAllNodes: true
    useAllDevices: true

There are a few more configuration keys we could add, but they are not necessary for this guide. With all these settings in place, we will have the following resulting file:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 3
    allowMultiplePerNode: false
  dashboard:
    enabled: true
    ssl: true
  monitoring:
    enabled: false
    rulesNamespace: rook-ceph
  rbdMirroring:
    workers: 0
  storage:
    useAllNodes: true
    useAllDevices: true

Next we apply this manifest in our Kubernetes cluster:

kubectl apply -f cephcluster.yaml

Now we check that the pods are running:

kubectl get pod -n rook-ceph

It usually takes a couple of minutes for the cluster to be created. Once this is completed, we should get a list similar to the following output:

NAME                                            READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-fnpm9                          3/3     Running     0          10m
csi-cephfsplugin-hxml9                          3/3     Running     0          10m
csi-cephfsplugin-provisioner-7c44c4ff49-k8z9t   4/4     Running     0          10m
csi-cephfsplugin-provisioner-7c44c4ff49-kmdp9   4/4     Running     1          10m
csi-cephfsplugin-qwm6m                          3/3     Running     0          10m
csi-rbdplugin-dgmv8                             3/3     Running     0          10m
csi-rbdplugin-provisioner-7458d98547-xg7x8      5/5     Running     1          10m
csi-rbdplugin-provisioner-7458d98547-xz2kg      5/5     Running     1          10m
csi-rbdplugin-qxx26                             3/3     Running     0          10m
csi-rbdplugin-s2mxj                             3/3     Running     0          10m
rook-ceph-mgr-a-5d8bf85bb7-nnnxc                1/1     Running     0          7m5s
rook-ceph-mon-a-7678858484-5txtz                1/1     Running     0          8m
rook-ceph-mon-b-6b6f697f94-577z8                1/1     Running     0          7m44s
rook-ceph-mon-c-89c78d866-4w5sb                 1/1     Running     0          7m25s
rook-ceph-operator-75d95cb868-s7m5z             1/1     Running     0          13m
rook-ceph-osd-prepare-node-01-dj9tm             0/1     Completed   0          6m33s
rook-ceph-osd-prepare-node-02-49d5c             0/1     Completed   0          6m33s
rook-ceph-osd-prepare-node-03-md22x             0/1     Completed   0          6m33s
rook-discover-44dq9                             1/1     Running     0          12m
rook-discover-7gxn7                             1/1     Running     0          12m
rook-discover-xfqmk                             1/1     Running     0          12m

We have now successfully set up our Ceph cluster and can continue by creating our first storage block.
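
Optionally, since we enabled the dashboard in the cluster spec, we can look up the dashboard service and retrieve the generated password for the admin user. The service and secret names below (rook-ceph-mgr-dashboard and rook-ceph-dashboard-password) follow the standard Rook conventions and may differ between releases:

kubectl -n rook-ceph get service rook-ceph-mgr-dashboard
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode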

Step 3 — Adding Block Storage

Block storage allows an individual pod to mount storage. In this step, we will create a storage block that can be used later on in our applications.

Before Ceph can provide storage to our cluster, we first need to create a storageclass and a cephblockpool.

This step allows Kubernetes to interoperate with Rook when creating persistent volumes.

The command below will create these resources for us:

kubectl apply -f ./csi/rbd/storageclass.yaml

After running the command, we should have the following output:

cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created

Notice that the storage class uses the default rook-ceph namespace when creating its resources, so if we deploy Rook into a different namespace we need to change the provisioner prefix to match the namespace we defined.
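
For reference, the relevant part of storageclass.yaml looks roughly like this (abridged; field values may differ slightly between Rook releases). The rook-ceph prefix in the provisioner name is what would need to match the operator namespace:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool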

Next, we define a PersistentVolumeClaim, which is nothing but a resource used to request storage.

We will first create a yaml file for that using the command:

nano pvc-rook-ceph-block.yaml

Then we add the following to the file we just created, pvc-rook-ceph-block.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Let’s take a look at what we just did:

In the above file, we first set the apiVersion to v1, the most current and stable version at the time of writing this guide. Then we use the kind key to tell Kubernetes the type of resource we want, which in our case is PersistentVolumeClaim.

Then we define the model that Kubernetes will use when creating our PersistentVolumeClaim via the spec key. Here we first specify the storage class we created earlier, rook-ceph-block, with the storageClassName option, and then we define the access modes with the accessModes option. ReadWriteOnce tells Kubernetes that the volume should be mounted by only a single node at a time.

Now we will create the PersistentVolumeClaim with the following command:

kubectl apply -f pvc-rook-ceph-block.yaml

It will generate the following output:

persistentvolumeclaim/mongo-pvc created

Next, we will check the status of our PVC:

kubectl get pvc

When the PVC is bound, we should get a list showing the binding between the PersistentVolumeClaim and a PersistentVolume. Under the hood, the PersistentVolume is stored on the volumes attached to the worker nodes.
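
The output should look roughly like the following (the volume name is generated, so it will differ in your cluster):

NAME        STATUS   VOLUME               CAPACITY   ACCESS MODES   STORAGECLASS      AGE
mongo-pvc   Bound    pvc-<generated-id>   5Gi        RWO            rook-ceph-block   15s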

We have now successfully created a storage class and used it to create a PersistentVolumeClaim, which we will mount to an application to persist data in the next section.

Step 4 — Creating a MongoDB Deployment with Rook

Now that we have successfully created a storage block and a persistent volume, we will put it to use by implementing it in a MongoDB application.

For this, we will write a configuration that includes the following:

  • A single container deployment based on the latest version of the mongo image.
  • A persistent volume for preserving the MongoDB database data.
  • A service for exposing the MongoDB port on port 31017 of each node so we can interact with it later.

First, we open the configuration file:

nano mongo.yaml

Next, we add the following specifications to our manifest with the Deployment resource.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo
spec:
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
      - image: mongo:latest
        name: mongo
        ports:
        - containerPort: 27017
          name: mongo
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
      volumes:
      - name: mongo-persistent-storage
        persistentVolumeClaim:
          claimName: mongo-pvc

Let’s try to understand all the things we wrote in the above specifications.

We have to set an apiVersion for every resource in the manifest. We use apiVersion: apps/v1 for the Deployment as it is the most recent and stable version. We then use the kind key to tell Kubernetes which resource we want. We also need to specify a name through metadata.name for each definition.

The next section, specified by the spec key, tells Kubernetes the desired final state of our deployment. Here, we are asking Kubernetes to run a single replica of our pod.

For organizing and cross-referencing our Kubernetes resources, we use labels, which are simply key-value pairs. We specify them using metadata.labels and can later search for them using selector.matchLabels.

Next, we specify the model that Kubernetes will use for creating each pod by defining the spec.template key. Here, we provide our pod's deployment information: the name of the image, container ports, and the mounted volumes. Kubernetes automatically pulls the specified image from an image registry.

Here we will be using the earlier created PersistentVolumeClaim for persisting the data of the /data/db directory of the pods.

Next, we define a Kubernetes Service by adding the following code to the file. This service will expose the MongoDB port on port 31017 of every node in our cluster.

apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    app: mongo
spec:
  selector:
    app: mongo
  type: NodePort
  ports:
    - port: 27017
      nodePort: 31017

As you may notice, we also specify an apiVersion here, but this time we define a Service instead of a Deployment.

Every connection received by the service on port 31017 is forwarded to the pod's port 27017, where we can access our application.

NodePort is used as the service type, which exposes the service at a static port between 30000 and 32767 (in our case 31017).
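
If we want to reach MongoDB from outside the cluster, we can look up a node's external IP and connect to the NodePort with a local mongo client. This assumes the mongo shell is installed on our machine and the cloud firewall allows traffic to port 31017:

kubectl get nodes -o wide
mongo "mongodb://<node_external_ip>:31017"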

After defining the resources, it is time to deploy them with the following command:

kubectl apply -f mongo.yaml

We should get the following output after this command:

deployment.apps/mongo created
service/mongo created

We may also check the service and deployment status with the following command:

kubectl get svc,deployments

For the above command, we should get an output similar to this:

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
service/kubernetes   ClusterIP   10.245.0.1       <none>        443/TCP           33m
service/mongo        NodePort    10.245.124.118   <none>        27017:31017/TCP   4m50s

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/mongo   1/1     1            1           4m50s

After all this, we are ready to save data in our database. The simplest way of doing this is by using the MongoDB shell; it is included in the MongoDB pod we just started, and we can open it with kubectl.

We first need to get the name of the pod, which we can get with the following command:

kubectl get pods

The output of the above command will be similar to this:

NAME                     READY   STATUS    RESTARTS   AGE
mongo-7654889675-mjcks   1/1     Running   0          13m

Copy the pod name, because we will use it in the following command to open the MongoDB shell:

kubectl exec -it <our_pod_name> mongo

We have now successfully entered the MongoDB shell. Let's get started by creating a new database:

use test

The use command is used to switch between databases or to create one if it doesn’t exist.

The output of the above command will be:

switched to db test

Now we will insert some data into our newly created test database. We'll use the insertOne() method to insert a new document, for example (any sample document will do; this one matches the data we retrieve below):

db.getCollection("test").insertOne({ "name": "testing", "number": 10 })

The output of this would be the following:

{
  "acknowledged" : true,
  "insertedId" : ObjectId("5f22dd521ba9331d1a145a58")
}

We will now retrieve the data so that we can be sure it was saved. We will use the find() method for this:

db.getCollection("test").find()

The output of this would be something like this:

{ "_id" : ObjectId("5f1b18e34e69b9726c984c51"), "name" : "testing", "number" : 10 }

Now that we have saved some data in our database, it will automatically be persisted in the underlying Ceph volume. The biggest advantage of this is the dynamic provisioning of volumes: Ceph automatically provides storage as and when applications request it, instead of developers having to manually request storage from their storage provider.

We can also validate this functionality by simply restarting the pod and double-checking the data. To do this, we delete the pod, knowing it will be restarted to fulfill the state defined in the deployment:

kubectl delete pod -l app=mongo

Now we can repeat the steps we used above to connect to the MongoDB shell and retrieve the data, verifying that it is still there.
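
For example, the whole check can be condensed into a single command that looks up the new pod by its app=mongo label and runs a query inside it (a convenience sketch; the label and database names match what we used above):

kubectl exec -it $(kubectl get pod -l app=mongo -o jsonpath='{.items[0].metadata.name}') -- mongo --eval 'db.getSiblingDB("test").getCollection("test").find().forEach(printjson)'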

We now have the Rook and Ceph successfully set up and we have also used them to persist the data of our deployment. We will now look into the Rook toolbox and its functionalities in the final step of this guide.

Step 5 — Running the Rook Toolbox

The Rook Toolbox is a tool provided by Rook that helps us with various tasks: getting the current state of our Ceph deployment, troubleshooting problems, changing Ceph configurations (enabling modules, creating users and pools, etc.), and more.

In this step, we will first install the Rook Toolbox and then use it to execute some basic commands, like getting the current Ceph status.

We’ll now deploy the toolbox.yaml file, located in the cluster/examples/kubernetes/ceph directory we are already working in, to start the toolbox:

kubectl apply -f toolbox.yaml

We’ll have the following output after this:

deployment.apps/rook-ceph-tools created

Now we’ll check that our pod is running:

kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"

The output of the above command will be similar to this:

NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-tools-7c5bf67444-bmpxc   1/1     Running   0          9s

Once the pod is running, we connect to it using the kubectl exec command:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

Now we can execute Ceph commands, like ceph status, to get the current status of our Ceph configuration. Let's try to execute this command:

ceph status

The output of the above would be:

  cluster:
    id:     71522dde-064d-4cf8-baec-2f19b6ae89bf
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 23h)
    mgr: a(active, since 23h)
    osd: 3 osds: 3 up (since 23h), 3 in (since 23h)

  data:
    pools:   1 pools, 32 pgs
    objects: 61 objects, 157 MiB
    usage:   3.4 GiB used, 297 GiB / 300 GiB avail
    pgs:     32 active+clean

  io:
    client: 5.3 KiB/s wr, 0 op/s rd, 0 op/s wr
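
Beyond ceph status, a few other commonly used commands can be run from the same toolbox shell, for example:

ceph osd status
ceph df
rados df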

We have now successfully set up the Rook-Ceph cluster on Kubernetes. We have also learnt how to use the Rook Toolbox for debugging and troubleshooting our Ceph deployment.

Common Issues

Various issues can be encountered while deploying a Ceph cluster on Kubernetes with Rook. Some of those issues are listed here:

A pod using Rook storage does not work:

Observable symptoms for this issue are:

  • kubectl -n rook-system get pod shows the rook-agent pods in a CrashLoopBackOff status
  • The pod that is configured to use Rook storage is stuck in the ContainerCreating status
  • kubectl describe pod specifies any of the following:
    PersistentVolumeClaim is not bound
    timeout expired waiting for volumes to attach/mount

Possible reasons for this issue are:

  • The rook-agent pods are responsible for mapping and mounting the volume from the cluster onto the node that your pod will be running on. If the rook-agent pod is not running, then it cannot perform this function.
  • A rook-ceph-agent pod is in a CrashLoopBackOff status because it cannot deploy its driver on a read-only filesystem.
  • The PersistentVolume is failing to be created and bound.
  • A rook-ceph-agent pod is failing to mount and format the volume.
  • Usage of Kubernetes 1.7.x or earlier, and the kubelet has not been restarted after rook-ceph-agent is in the Running status.

Cluster failing to service requests:

Observable symptoms for this issue are:

  • Execution of the ceph command hangs.
  • A large number of slow requests are blocking.
  • A large number of stuck requests are blocking.
  • One or more MONs are restarting periodically.
  • PersistentVolume is failing to be created.

Possible reasons and solutions for this issue are:

  • MON pods restart, and one or more Ceph daemons are not getting configured with the proper cluster information. This is usually the result of not specifying the value for dataDirHostPath in your Cluster CRD.
  • The dataDirHostPath setting specifies a path on the local host for the Ceph daemons to store configuration and data. Setting this to a path like /var/lib/rook, reapplying your Cluster CRD and restarting all the Ceph daemons (MON, MGR, OSD, RGW) should solve this problem. After the Ceph daemons have been restarted, it may be best to restart rook-api.

Monitors are the only pods running:

Observable symptoms for this issue are:

  • After tearing down a working cluster to redeploy a new cluster, the new cluster fails to start.
  • The rook operator is running.
  • Only a partial number of the MON daemons are created and are failing.

Possible reasons and solutions for this issue are:

  • This is a common problem when restarting the Rook cluster, where the local directory used for persistence has been cleared. This directory is the dataDirHostPath setting in the Cluster CRD, set to /var/lib/rook in this guide. To fix the problem, remove all Rook components and delete the content of /var/lib/rook (or the directory specified by dataDirHostPath) on each of the hosts in the cluster. Then, when the Cluster CRD is used to start a new cluster, the rook-operator should start all the pods as expected.

Note: Deleting the dataDirHostPath folder is destructive to the storage. Only delete the folder if you are trying to permanently purge the Rook cluster.
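
For illustration, assuming the default path used in this guide, the cleanup would look like this on each node (again, only when permanently purging the cluster):

sudo rm -rf /var/lib/rook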

OSD pods are failing to start:

Observable symptoms for this issue are:

  • OSD pods fail to start.
  • A cluster is started after tearing down another cluster.

Possible reasons for this issue are the same as the reasons for the previous issue “Monitors are the only pods running”.

OSD pods are not created on my devices:

Observable symptoms for this issue are:

  • Devices are not configured even though they are specified in the Cluster CRD.
  • No OSD pods are started in the cluster.
  • One OSD pod is started on each node instead of multiple pods for each device.

Possible solution for this issue is:

  • After updating the Cluster CRD with the appropriate settings, or after clearing the partitions or file systems from the devices, we need to restart the operator so that it analyzes the devices again. Each time the operator starts, it will ensure that all desired devices are configured.

Node hangs after reboot:

Observable symptom for this issue is:

  • After issuing a reboot command, the node never comes back online.

Possible solutions for this issue are:

  • The node needs to be drained before the reboot. After a successful drain, the node can be rebooted as usual.
  • Because the kubectl drain command automatically marks the node as unschedulable (the kubectl cordon effect), the node needs to be uncordoned once it is back online.
  • Drain the node:
    kubectl drain <node-name> --ignore-daemonsets --delete-local-data
  • Uncordon the node:
    kubectl uncordon <node-name>

Rook agent modprobe exec format error:

Observable symptoms for this issue are:

  • PersistentVolumes from Ceph fail/timeout to mount.
  • Rook Agent logs contain the following line:
    modinfo: ERROR: could not get modinfo from 'rbd': Exec format error

Possible solutions for this issue are:

  • If it is possible to upgrade your kernel, you should upgrade to 4.x, ideally >= 4.7, due to a CephFS feature added to the kernel.
  • If you are unable to upgrade the kernel, you need to go to each host that will consume storage and run:
    modprobe rbd

This command inserts the rbd module into the kernel. To persist this fix, you need to add the rbd kernel module to either /etc/modprobe.d/ or /etc/modules-load.d/. For both paths, create a file called rbd.conf with the following content:

rbd

Now when a host is restarted, the module should be loaded automatically.
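
One simple way to create that file for the systemd modules-load mechanism is, for example:

echo rbd | sudo tee /etc/modules-load.d/rbd.conf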

Rook agent rbd module missing error:

Observable symptoms for this issue are:

  • Rook Agent in Error or CrashLoopBackOff status when deploying the Rook Operator.
  • Rook Agent logs contain messages like the following:
    2018-08-10 09:09:09.461798 I | exec: Running command: cat /lib/modules/4.15.2/modules.builtin
    2018-08-10 09:09:09.473858 I | exec: Running command: modinfo -F parm rbd
    2018-08-10 09:09:09.477215 N | ceph-volumeattacher: failed rbd single_major check, assuming it's unsupported: failed to check for rbd module single_major param: Failed to complete 'check kmod param': exit status 1. modinfo: ERROR: Module rbd not found.
    2018-08-10 09:09:09.477239 I | exec: Running command: modprobe rbd
    2018-08-10 09:09:09.480353 I | modprobe rbd: modprobe: FATAL: Module rbd not found.
    2018-08-10 09:09:09.480452 N | ceph-volumeattacher: failed to load kernel module rbd: failed to load kernel module rbd: Failed to complete 'modprobe rbd': exit status 1.
    failed to run rook ceph agent. failed to create volume manager: failed to load kernel module rbd: Failed to complete 'modprobe rbd': exit status 1.

Possible solution for this issue is:

  • From the log message of Agent, we can see that the rbd kernel module is not available in the current system, neither as a builtin nor a loadable external kernel module. In this case, you have to re-configure and build a new kernel to address this issue.

Using multiple shared filesystems (CephFS) on a kernel version older than 4.7:

Observable symptoms for this issue are:

  • More than one shared file system (CephFS) has been created in the cluster.
  • A pod attempts to mount any other shared file system besides the first one.
  • The pod incorrectly gets the first file system mounted instead of the intended file system.

Possible solution for this issue is:

  • The only solution to this problem is to upgrade your kernel to 4.7 or higher. This is due to a mount flag added in kernel version 4.7 which allows choosing the filesystem by name. You can check which kernel a node is running with the command shown below.
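
To check the kernel version currently running on a node:

uname -r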

Conclusion

That was a long tutorial! Congrats on getting through it. I hope this guide gave you a solid understanding of how to run your own Rook-Ceph cluster, how to use the Rook operator to deploy all the processes needed to run a Ceph cluster within Kubernetes, and how to use it to provide storage for a MongoDB application.

For future work, you may also try out the other kinds of storage Ceph provides, such as shared file systems, which are very useful if you want to mount the same volume to multiple pods at the same time.

Thanks for reading and hopefully following along!

Stay unparalleled,

P.S. And, of course, if you’re in need of a powerful GitHub alternative, then feel free to check out Codegiant.

Codegiant supports a simple issue tracker, git repositories, built-in CI/CD, and a documentation tool. If you’re tired of GitHub or GitLab’s complexity, check us out; we don’t bite :)


References:

https://github.com/rook/rook

https://kubernetes.io/

https://rook.io/

https://docs.ceph.com/docs/master/

https://ilovett.github.io/docs/rook/master/common-problems.html