
Running ownCloud in Kubernetes With Rook Ceph Storage – Step by Step

Looking for best practices for high availability, scalability, and performance? Read this guide about running ownCloud in Kubernetes, using Rook to provide a Ceph cluster.
The first part of this series explained what we need for an ownCloud deployment in a Kubernetes cluster and gave a high-level overview. You can find the example files for this guide in this GitHub repository.

Preparations

To follow this guide, you need admin access to a Kubernetes cluster with an Ingress controller. If you don’t have that already, you can follow these steps:

Kubernetes Cluster Access

If you don’t have a Kubernetes cluster, you can try using projects such as xetys/hetzner-kube on GitHub, Kubespray, and others (see the Kubernetes documentation).

minikube is not enough when started with the default resources. Be sure to give minikube extra resources, otherwise you will run into problems! Add the following flags to the minikube start command: --memory=4096 --cpus=3 --disk-size=40g.

You should have cluster-admin access to the Kubernetes cluster! Lesser access can also work, but due to the nature of the objects created along the way it is easier with cluster-admin access.


Ingress Controller

WARNING: Only follow this section if your Kubernetes cluster does not have an Ingress controller yet.

We are going to install the Kubernetes NGINX Ingress Controller.

# Taken from https://github.com/kubernetes/ingress-nginx/blob/master/deploy/static/mandatory.yaml
kubectl apply -f ingress-nginx/mandatory.yaml

The instructions shown here are for an environment without LoadBalancer Service type support (e.g., bare metal, “normal” VM provider, not cloud). For installation instructions for other environments, check out Installation Guide – NGINX Ingress Controller.

# Taken from https://github.com/kubernetes/ingress-nginx/blob/master/deploy/static/provider/baremetal/service-nodeport.yaml
kubectl apply -f ingress-nginx/service-nodeport.yaml

As these are bare metal installation instructions, the NGINX Ingress controller will be available through a Service of type NodePort. This Service type exposes one or more ports on all Nodes in the Kubernetes cluster.

To get that port run:

$ kubectl get -n ingress-nginx service ingress-nginx
NAME            TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx   NodePort   10.108.254.160   <none>        80:30512/TCP,443:30243/TCP   3m

In that output you can see the NodePorts for HTTP and HTTPS on which you can connect to the NGINX Ingress controller and ownCloud later.

As noted, you probably want to look into a more “solid” way to expose the NGINX Ingress controller(s). For bare metal, where there is no Kubernetes LoadBalancer integration, one option is the hostNetwork approach; see bare-metal considerations – NGINX Ingress Controller.

Namespaces

Throughout the installation we will use three Namespaces:

  • rook-ceph – For the Rook-run Ceph cluster + the Rook Ceph operator (will be created below).
  • owncloud – For ownCloud and the other operators, such as Zalando’s Postgres Operator and KubeDB for Redis.
  • ingress-nginx – Used for the NGINX Ingress controller (if you followed the Ingress Controller section above, this Namespace already exists).
kubectl create -f namespaces.yaml
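
For reference, namespaces.yaml is little more than the Namespace objects themselves. A minimal sketch (whether ingress-nginx is included depends on whether you installed the Ingress controller as above; the file in the example repository is authoritative):

apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
---
apiVersion: v1
kind: Namespace
metadata:
  name: owncloud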

Rook Ceph Storage

Now on to running Ceph in Kubernetes, using the Rook.io project.

In the following sections, make sure to use the available -test suffixed files if you have fewer than 3 Nodes that are available to any application / Pod (e.g., depending on your cluster, the masters may not be available for Pods). You can change that; to do so, dig into the CephCluster object’s spec.placement.tolerations and the Operator environment variables for the discover and agent daemons. Running application Pods on the masters is not recommended though.

Operator

The operator will take care of starting up the Ceph components one by one, preparing the disks, and health checking.

kubectl create -f rook-ceph/common.yaml
kubectl create -f rook-ceph/operator.yaml

You can check on the Pods to see how it looks:

$ kubectl get -n rook-ceph pod
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-agent-cbrgv                 1/1     Running   0          90s
rook-ceph-agent-wfznr                 1/1     Running   0          90s
rook-ceph-agent-zhgg7                 1/1     Running   0          90s
rook-ceph-operator-6897f5c696-j724m   1/1     Running   0          2m18s
rook-discover-jg798                   1/1     Running   0          90s
rook-discover-kfxc8                   1/1     Running   0          90s
rook-discover-qbhfs                   1/1     Running   0          90s

There is one rook-discover-* Pod on each Node of your Kubernetes cluster; they discover the disks of the Nodes so the operator can plan the actions for a given CephCluster object, which comes up next.

 


Ceph Cluster

The CephCluster object defines the Ceph cluster that will be created in Kubernetes. It contains the options for which disks to use and on which Nodes.

If you want to see some example CephCluster objects and what is possible, be sure to check out Rook v1.0 Documentation – CephCluster CRD.

INFO: Use the cluster-test.yaml when your Kubernetes cluster has less than 3 schedulable Nodes (e.g., minikube)! When using the cluster-test.yaml only one mon is started. If that mon is down for whatever reason, the Ceph Cluster will come to a halt to prevent any data “corruption”.
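
To give you an idea of the shape of the object, here is a minimal sketch of what a cluster.yaml can look like; the Ceph image tag, dataDirHostPath and the storage selection are illustrative assumptions, the file in the example repository is authoritative:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2  # illustrative tag
  dataDirHostPath: /var/lib/rook  # adjust to a path that exists on your Nodes
  mon:
    count: 3
    allowMultiplePerNode: false
  dashboard:
    enabled: true
  storage:
    useAllNodes: true
    useAllDevices: true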

$ kubectl create -f rook-ceph/cluster.yaml

This will now cause the operator to start the Ceph cluster according to the specification in the CephCluster object.

To see which Pods have already been created by the operator, you can run (output example from a three node cluster):

$ kubectl get -n rook-ceph pod
NAME                                                     READY   STATUS      RESTARTS   AGE
rook-ceph-agent-cbrgv                                    1/1     Running     0          11m
rook-ceph-agent-wfznr                                    1/1     Running     0          11m
rook-ceph-agent-zhgg7                                    1/1     Running     0          11m
rook-ceph-mgr-a-77fc54c489-66mpd                         1/1     Running     0          6m45s
rook-ceph-mon-a-68b94cd66-m48lm                          1/1     Running     0          8m6s
rook-ceph-mon-b-7b679476f-mc7wj                          1/1     Running     0          8m
rook-ceph-mon-c-b5c468c94-f8knt                          1/1     Running     0          7m54s
rook-ceph-operator-6897f5c696-j724m                      1/1     Running     0          11m
rook-ceph-osd-0-5c8d8fcdd-m4gl7                          1/1     Running     0          5m55s
rook-ceph-osd-1-67bfb7d647-vzmpv                         1/1     Running     0          5m56s
rook-ceph-osd-2-c8c55548f-ws8sl                          1/1     Running     0          5m11s
rook-ceph-osd-prepare-owncloudrookceph-worker-01-svvz9   0/2     Completed   0          6m7s
rook-ceph-osd-prepare-owncloudrookceph-worker-02-mhvf2   0/2     Completed   0          6m7s
rook-ceph-osd-prepare-owncloudrookceph-worker-03-nt2gs   0/2     Completed   0          6m7s
rook-discover-jg798                                      1/1     Running     0          11m
rook-discover-kfxc8                                      1/1     Running     0          11m
rook-discover-qbhfs                                      1/1     Running     0          11m

Block Storage (RBD)

Before creating the CephFS filesystem, let’s create a block storage pool with a StorageClass. The StorageClass will be used by PostgreSQL and, if you want, by the Redis cluster as well.

INFO: Use the storageclass-test.yaml when your Kubernetes cluster has less than 3 schedulable Nodes!

kubectl create -f rook-ceph/storageclass.yaml

In the case of a block storage pool, no additional Pods will be started; we’ll verify that the block storage pool has been created in the “Toolbox” section below.
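
For orientation, a storageclass.yaml for Rook v1.0 roughly consists of a CephBlockPool plus a StorageClass along these lines; the pool name, replica size and fstype here are illustrative, while the StorageClass name rook-ceph-block matters because it is referenced below:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  clusterNamespace: rook-ceph
  fstype: xfs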

One more thing to do: set the created StorageClass as default in the Kubernetes cluster by running the following command:

kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Now you are ready to move onto the storage for the actual data to be stored in ownCloud!

CephFS

CephFS is the filesystem that Ceph offers. With its POSIX compliance it is a perfect fit to be used with ownCloud.

INFO: Use the filesystem-test.yaml when your Kubernetes cluster has less than 3 schedulable Nodes!

kubectl create -f rook-ceph/filesystem.yaml
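
A filesystem.yaml along the lines of the following creates a CephFilesystem named myfs; the replica sizes and MDS counts are illustrative, use the file from the example repository:

apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - replicated:
        size: 3
  metadataServer:
    activeCount: 1
    activeStandby: true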

The creation of the CephFS will cause so-called MDS daemons (metadata servers) to be started as Pods.

kubectl get -n rook-ceph pod
NAME                                    READY   STATUS      RESTARTS   AGE
[...]
rook-ceph-mds-myfs-a-747b75bdc7-9nzwx                    1/1     Running     0          11s
rook-ceph-mds-myfs-b-76b9fcc8cc-md8bz                    1/1     Running     0          10s
[...]

Toolbox

This will create a Pod which will allow us to run Ceph commands. It will be useful to quickly check the Ceph cluster’s status.

kubectl create -f rook-ceph/toolbox.yaml
# Wait for the Pod to be `Running`
kubectl get -n rook-ceph pod -l "app=rook-ceph-tools"
NAME                                    READY   STATUS      RESTARTS   AGE
[...]
rook-ceph-tools-5966446d7b-nrw5n                         1/1     Running     0          10s
[...]

Now use kubectl exec to enter the Rook Ceph Toolbox Pod:

kubectl exec -n rook-ceph -it $(kubectl get -n rook-ceph pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

In the Rook Ceph Toolbox Pod, run the following command to get the Ceph cluster health status (example output from a 7 Node Kubernetes Rook Ceph cluster):

$ ceph -s
 cluster:
   id:     f8492cd9-3d14-432c-b681-6f73425d6851
   health: HEALTH_OK

services:
   mon: 3 daemons, quorum c,b,a
   mgr: a(active)
   mds: repl-2-1-2/2/2 up  {0=repl-2-1-c=up:active,1=repl-2-1-b=up:active}, 2 up:standby-replay
   osd: 7 osds: 7 up, 7 in

data:
   pools:   3 pools, 300 pgs
   objects: 1.41 M objects, 4.0 TiB
   usage:   8.2 TiB used, 17 TiB / 25 TiB avail
   pgs:     300 active+clean

io:
   client:   6.2 KiB/s rd, 1.5 MiB/s wr, 4 op/s rd, 140 op/s wr

You can also get it by using kubectl:

$ kubectl get -n rook-ceph cephcluster rook-ceph
NAME        DATADIRHOSTPATH   MONCOUNT   AGE   STATE     HEALTH
rook-ceph   /mnt/sda1/rook    3          14m   Created   HEALTH_OK

That even shows you some additional information directly through kubectl instead of having to read the ceph -s output.

Rook Ceph Summary

This is how it should look like in your rook-ceph Namespace now (example output from a 3 Node Kubernetes cluster):

$ kubectl get -n rook-ceph pod
NAME                                                     READY   STATUS      RESTARTS   AGE
rook-ceph-agent-cbrgv                                    1/1     Running     0          15m
rook-ceph-agent-wfznr                                    1/1     Running     0          15m
rook-ceph-agent-zhgg7                                    1/1     Running     0          15m
rook-ceph-mds-myfs-a-747b75bdc7-9nzwx                    1/1     Running     0          42s
rook-ceph-mds-myfs-b-76b9fcc8cc-md8bz                    1/1     Running     0          41s
rook-ceph-mgr-a-77fc54c489-66mpd                         1/1     Running     0          11m
rook-ceph-mon-a-68b94cd66-m48lm                          1/1     Running     0          12m
rook-ceph-mon-b-7b679476f-mc7wj                          1/1     Running     0          2m22s
rook-ceph-mon-c-b5c468c94-f8knt                          1/1     Running     0          2m6s
rook-ceph-operator-6897f5c696-j724m                      1/1     Running     0          16m
rook-ceph-osd-0-5c8d8fcdd-m4gl7                          1/1     Running     0          10m
rook-ceph-osd-1-67bfb7d647-vzmpv                         1/1     Running     0          10m
rook-ceph-osd-2-c8c55548f-ws8sl                          1/1     Running     0          9m48s
rook-ceph-osd-prepare-owncloudrookceph-worker-01-5xpqk   0/2     Completed   0          73s
rook-ceph-osd-prepare-owncloudrookceph-worker-02-xnl8p   0/2     Completed   0          70s
rook-ceph-osd-prepare-owncloudrookceph-worker-03-2qggs   0/2     Completed   0          68s
rook-ceph-tools-5966446d7b-nrw5n                         1/1     Running     0          8s
rook-discover-jg798                                      1/1     Running     0          15m
rook-discover-kfxc8                                      1/1     Running     0          15m
rook-discover-qbhfs                                      1/1     Running     0          15m

The important thing is that the ceph -s output or the kubectl get cephcluster output shows that the health is HEALTH_OK and that you have OSD Pods running. The ceph -s output line should say: osd: 3 osds: 3 up, 3 in (where 3 is the number of OSD Pods).

Should you not have any OSD Pods, make sure all your Nodes are Ready and schedulable (e.g., no taints preventing “normal” Pods from running) and check the logs of the rook-ceph-osd-prepare-* and of any existing rook-ceph-osd-[0-9]* Pods.

If you don’t have any Pods related to rook-ceph-osd-* at all, look into the rook-ceph-operator-* logs for error messages; be sure to go over each line so you don’t miss one.

PostgreSQL

Moving on to PostgreSQL for ownCloud. Zalando’s PostgreSQL operator does a great job of running PostgreSQL in Kubernetes.

The first thing to create is the PostgreSQL operator, which brings the CustomResourceDefinitions (remember: the custom Kubernetes objects) with it. Using the Ceph block storage (RBD) we are going to create a redundant PostgreSQL instance for ownCloud to use.

$ kubectl create -n owncloud -f postgres/postgres-operator.yaml
# Check for the PostgreSQL operator Pod to be created and running
$ kubectl get -n owncloud pod
NAME                                 READY   STATUS    RESTARTS   AGE
postgres-operator-6464fc9c48-6twrd   1/1     Running   0          5m23s

With the operator created, move on to the PostgreSQL custom resource object that will cause the operator to create a PostgreSQL instance for use in Kubernetes:

# Make sure the CustomResourceDefinition of the PostgreSQL has been created
$ kubectl get customresourcedefinitions.apiextensions.k8s.io postgresqls.acid.zalan.do
NAME                        CREATED AT
postgresqls.acid.zalan.do   2019-08-04T10:27:59Z

The CustomResourceDefinition exists? Perfect, continue with the creation:

kubectl create -n owncloud -f postgres/postgres.yaml
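
To illustrate, a postgres.yaml for the Zalando operator roughly looks like this; the team ID, user, database, volume size and PostgreSQL version are illustrative assumptions, while the name owncloud-postgres and numberOfInstances: 2 match the Pods shown below:

apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: owncloud-postgres
  namespace: owncloud
spec:
  teamId: "owncloud"
  numberOfInstances: 2
  volume:
    size: 10Gi
    storageClass: rook-ceph-block
  users:
    owncloud:      # illustrative user name
      - superuser
      - createdb
  databases:
    owncloud: owncloud   # illustrative database: owner mapping
  postgresql:
    version: "11"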

It will take a bit for the two PostgreSQL Pods to appear, but in the end you should have two owncloud-postgres Pods:

$ kubectl get -n owncloud pod
NAME                                 READY   STATUS    RESTARTS   AGE
owncloud-postgres-0                  1/1     Running   0          92s
owncloud-postgres-1                  1/1     Running   0          64s
postgres-operator-6464fc9c48-6twrd   1/1     Running   0          7m

owncloud-postgres-0 and owncloud-postgres-1 in Running status? That looks good.

Now that the database is running, let’s continue with Redis.

Redis

To run a Redis cluster we need the KubeDB Operator. You can install it with a bash script or Helm. To keep it quick’n’easy we’ll use their bash script for that:

curl -fsSL https://raw.githubusercontent.com/kubedb/cli/0.12.0/hack/deploy/kubedb.sh -o kubedb.sh
# Take a look at the script using, e.g., `cat kubedb.sh`
#
# If you are fine with it, run it:
chmod +x kubedb.sh
./kubedb.sh
# It will install the KubeDB operator to the cluster in the `kube-system` Namespace

(You can remove the script afterwards: rm kubedb.sh)

For more information on the bash script and/or the Helm installation, check out KubeDB.

Now move on to create the Redis cluster. Run:

kubectl create -n owncloud -f redis.yaml
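
As a rough sketch, a redis.yaml for KubeDB in cluster mode with three shards and one replica each (matching the Pods below) can look like this; the Redis version string and storage sizing are illustrative assumptions:

apiVersion: kubedb.com/v1alpha1
kind: Redis
metadata:
  name: redis-owncloud
  namespace: owncloud
spec:
  version: "4.0-v1"   # illustrative; pick a version supported by your KubeDB release
  mode: Cluster
  cluster:
    master: 3
    replicas: 1
  storageType: Durable
  storage:
    storageClassName: rook-ceph-block
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi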

It will take a few seconds for the first Redis Pod(s) to be started. To check that it worked, look for Pods with redis-owncloud- in their name:

$ kubectl get -n owncloud pods
NAME                                 READY   STATUS    RESTARTS   AGE
owncloud-postgres-0                  1/1     Running   0          6m41s
owncloud-postgres-1                  1/1     Running   0          6m13s
postgres-operator-6464fc9c48-6twrd   1/1     Running   0          12m
redis-owncloud-shard0-0              1/1     Running   0          49s
redis-owncloud-shard0-1              1/1     Running   0          40s
redis-owncloud-shard1-0              1/1     Running   0          29s
redis-owncloud-shard1-1              1/1     Running   0          19s
redis-owncloud-shard2-0              1/1     Running   0          14s
redis-owncloud-shard2-1              1/1     Running   0          10s

That is how it should look now.

ownCloud

Now the final “piece”: ownCloud. The folder owncloud/ contains all the manifests we need:

  • ConfigMap and Secret for the basic configuration of ownCloud.
  • Deployment to get ownCloud Pods running in Kubernetes.
  • Service and Ingress to expose ownCloud to the internet.
  • CronJob to run the ownCloud cron task execution (e.g., cleanup and others), instead of having the cron run per instance.

The ownCloud Deployment currently uses a custom-built image (galexrt/owncloud-server:latest) which contains a fix for a clustered Redis configuration issue (there is already an open pull request).

kubectl create -n owncloud -f owncloud/
# Wait for ownCloud to finish installing the database, then scale the Deployment up to `2` replicas (or more if you want)

The admin username is myowncloudadmin and can be changed in the owncloud/owncloud-configmap.yaml file. Be sure to restart both ownCloud Pods after changing values in the ConfigMaps and Secrets.

If you want to change the admin password, edit the OWNCLOUD_ADMIN_PASSWORD line in the owncloud/owncloud-secret.yaml file. The values in a Kubernetes Secret object are base64 encoded (e.g., echo -n YOUR_PASSWORD | base64 -w0)!
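
For orientation, the relevant parts of the ConfigMap and Secret can look roughly like this; the object names, the OWNCLOUD_DOMAIN key and anything else not mentioned in the text are illustrative assumptions, the files in the example repository are authoritative:

apiVersion: v1
kind: ConfigMap
metadata:
  name: owncloud        # illustrative name
  namespace: owncloud
data:
  OWNCLOUD_ADMIN_USERNAME: myowncloudadmin
  OWNCLOUD_DOMAIN: owncloud.example.com   # illustrative key
---
apiVersion: v1
kind: Secret
metadata:
  name: owncloud        # illustrative name
  namespace: owncloud
type: Opaque
data:
  # echo -n YOUR_PASSWORD | base64 -w0
  OWNCLOUD_ADMIN_PASSWORD: WU9VUl9QQVNTV09SRA==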

To know when your ownCloud is up’n’running, check the logs, e.g.:

$ kubectl logs -n owncloud -f owncloud-856fcc4947-crscn
Creating volume folders...
Creating hook folders...
Waiting for PostgreSQL...
wait-for-it: waiting 180 seconds for owncloud-postgres:5432
wait-for-it: owncloud-postgres:5432 is available after 1 seconds
Removing custom folder...
Linking custom folder...
Removing config folder...
Linking config folder...
Writing config file...
Fixing base perms...
Fixing data perms...
Fixing hook perms...
Installing server database...
ownCloud was successfully installed
ownCloud is already latest version
Writing objectstore config...
Writing php config...
Updating htaccess config...
.htaccess has been updated
Writing apache config...
Enabling webcron background...
Set mode for background jobs to 'webcron'
Touching cron configs...
Starting cron daemon...
Starting apache daemon...
[Sun Aug 04 13:26:18.986407 2019] [mpm_prefork:notice] [pid 190] AH00163: Apache/2.4.29 (Ubuntu) configured -- resuming normal operations
[Sun Aug 04 13:26:18.986558 2019] [core:notice] [pid 190] AH00094: Command line: '/usr/sbin/apache2 -f /etc/apache2/apache2.conf -D FOREGROUND'

The Installing server database... will take some time depending on your network, storage and other factors.

After the `AH00094: Command line: '/usr/sbin/apache2 -f /etc/apache2/apache2.conf -D FOREGROUND'` line appears, you should be able to reach your ownCloud instance through the NodePort Service port (on HTTP) or through the Ingress (default address owncloud.example.com). If you are using the Ingress from the example files, be sure to edit it to use a (sub-) domain pointing to the Ingress controllers in your Kubernetes cluster.
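
For reference, the host part of such an Ingress can look roughly like this; the Service name, port and API version are illustrative assumptions, only the host needs to match your DNS setup:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: owncloud
  namespace: owncloud
spec:
  rules:
    - host: owncloud.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: owncloud   # illustrative Service name
              servicePort: 8080       # illustrative port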

You now have an ownCloud instance running!

Further points

HTTPS

To further improve the experience of running ownCloud in Kubernetes, you will probably want to check out Jetstack’s cert-manager project on GitHub to get Let’s Encrypt certificates for your Ingress. cert-manager allows you to request Let’s Encrypt certificates easily through Kubernetes custom objects and keeps them up to date.

ownCloud will then be reachable via HTTPS, which, combined with ownCloud’s encryption, makes it considerably more secure.

For more information on using TLS with Kubernetes Ingress, check out Ingress – Kubernetes.
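
As a sketch, enabling TLS on the ownCloud Ingress mostly means adding a tls section and, when using cert-manager, an issuer annotation; the issuer name and Secret name below are illustrative, and the annotation name depends on your cert-manager version:

metadata:
  annotations:
    # older cert-manager releases used certmanager.k8s.io/cluster-issuer
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - owncloud.example.com
      secretName: owncloud-tls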

Pod Health Checks

In owncloud/owncloud-deployment.yaml there is a readinessProbe and a livenessProbe in the Deployment spec, but they are commented out. After ownCloud has been installed and you have verified it is running, you can uncomment those lines and use kubectl apply / kubectl replace (don’t forget to specify the Namespace with -n owncloud).
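
As a sketch (assuming the container serves HTTP on port 8080; the path, port and timings are illustrative, the values in the deployment file are authoritative), such probes typically look like this:

readinessProbe:
  httpGet:
    path: /status.php
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 15
livenessProbe:
  httpGet:
    path: /status.php
    port: 8080
  initialDelaySeconds: 120
  periodSeconds: 30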

Upload Filesize

When changing the upload file size on the ownCloud instance itself through the environment variables, be sure to also raise the maximum request body size on the Ingress controller accordingly.
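
With the NGINX Ingress controller this is done per Ingress via an annotation, for example (the value is illustrative and should be at least as large as the ownCloud upload limit):

metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "1024m"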

Other Configuration Options

If you want to change configuration options, you need to provide them through environment variables. You can specify them in owncloud/owncloud-configmap.yaml.

A list of all available environment variables can be found in the ownCloud Docker image documentation.

Updating ownCloud in Kubernetes

It is the same procedure as when running ownCloud with, e.g., docker-compose.

To update ownCloud, scale the Deployment down to 1 replica, update the image, wait for that single Pod to come up again, and then scale the Deployment back up to, e.g., 2 or more replicas.
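
In kubectl terms this roughly translates to the following; a sketch only, the Deployment name owncloud, the container name and the image tag are assumptions, adjust them to your manifests:

# Sketch - Deployment/container names and image tag are assumptions
kubectl scale -n owncloud deployment owncloud --replicas=1
kubectl set image -n owncloud deployment/owncloud owncloud=galexrt/owncloud-server:NEW_TAG
kubectl rollout status -n owncloud deployment/owncloud
kubectl scale -n owncloud deployment owncloud --replicas=2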

Summary

This is the end of the two-part series on running ownCloud in Kubernetes – thanks for reading. Hopefully it is helpful. Feedback is appreciated! Share this guide with others.

ownCloud

August 8, 2019
