So storage needs to be 'always on'.
But maintenance still needs to happen, so the storage architecture needs to support this requirement.
I've been playing with Ceph recently, so I thought I'd look at the implications of upgrading my test cluster to the latest Dumpling release.
Here's my test environment:
As you can see it's only a little lab, but will hopefully serve to indicate how well the Ceph guys are handling the requirement for 'always-on' storage.
The first thing I did was head on over to the upgrade docs at ceph.com. They're pretty straightforward, with a well-defined sequence and some fairly good instructions (albeit with a focus skewed towards Ubuntu!).
For my test I decided to create an RBD volume, mount it from the cluster, and then continue to access the disk while performing the upgrade.
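For reference, the volume setup looked something like this (pool and image names here are illustrative, not necessarily the exact ones I used):

    # Create a 10GB RBD image, map it through the kernel client,
    # then put a filesystem on it and mount it
    rbd create test --size 10240
    rbd map test --pool rbd
    mkfs.ext4 /dev/rbd0
    mount /dev/rbd0 /mnt/test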
Upgrade Process

I didn't want to update the OS on each node and Ceph at the same time, so I decided to only bring Ceph up to the current level. The Dumpling version has a couple of additional dependencies, so I installed those first on each node.
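The exact packages will depend on your distro, so treat these names as placeholders and check the Dumpling release notes for yours; on a RHEL-based node the install looks something like:

    # Install the extra libraries the Dumpling RPMs depend on
    # (package names here are illustrative - verify against the release notes)
    yum install -y snappy leveldb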
Then I ran the RPM update on each node.
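The invocation is a plain in-place RPM upgrade; something along these lines, with file names that will vary by build and architecture:

    # Upgrade the installed Ceph packages in one transaction
    # (file names are illustrative)
    rpm -Uvh ceph-0.67*.rpm librados2-0.67*.rpm librbd1-0.67*.rpm python-ceph-0.67*.rpm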
Once this was done, the upgrade guide simply indicates that a service restart is needed at each layer, in order: monitors first, then OSDs, then any MDS daemons.
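Starting with the monitors, that's a per-node restart via the init script; a sketch, assuming the stock sysvinit setup:

    # On each monitor node in turn: restart the local mon daemon and
    # wait for it to rejoin quorum before moving to the next node
    service ceph restart mon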
Once all the monitors were done, I noticed that the admin box could no longer talk to the cluster...gasp...but the update from 0.61 to 0.67 changed the protocol and port used by the monitors, so this is an expected outcome until the admin client itself is updated.
Next, each of the OSD processes needed to be restarted.
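Again a rolling, node-by-node affair; roughly:

    # On each OSD node: restart the local OSD daemons, then check
    # cluster health from an already-upgraded node before moving on
    service ceph restart osd
    ceph -s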
Now my client's RBD volume obviously talks to the OSDs, but even with the OSD restarts my mounted RBD volume/filesystem carried on regardless - restart the OSDs a node at a time and the client simply rides out each bounce against the surviving replicas.
Kudos to the Ceph guys - a non-disruptive upgrade (at least for my small lab!)
I finished off the upgrade process by upgrading the RPMs on the client, and now my lab is running Dumpling.
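A quick check along these lines confirms the client and cluster agree on the release:

    # Confirm the client tools are on 0.67 and the cluster is healthy
    ceph --version
    ceph -s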
After any upgrade I'd always recommend a sanity check before you consider it 'job done'. In this case I used a simple performance metric: a 'dd' run against the RBD volume, before and after the upgrade. The chart below shows the results:
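The test itself was nothing fancy; a sequential write through the mounted filesystem, something like this (paths and sizes are illustrative):

    # Sequential 1GB write onto the RBD-backed filesystem, with
    # direct I/O so the page cache doesn't flatter the numbers
    dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=1024 oflag=direct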
As you can see the profile is very similar, with only minimal disparity between the releases.
It's good to see that open source storage is also delivering on the 'always-on' principle. The next release of Gluster (v3.5) is scheduled for December this year, so I'll run through the same scenario with Gluster then.