Intro
We at Voxel are deepening our investment in infrastructure automation, and with it our commitment to the growing set of tools from the OpenStack project and to Chef from the Opscode folks.
Where we last left off, I was walking through the manifest files we use to crank out OpenStack Swift object storage clusters. Now I'll talk more about what a typical Swift topology looks like, which will help explain what the manifest actually means.
Swift Topology and our Setup Variables: Hardware, Rings, Users
OS and Networking: First off, Swift needs to know the various networks and IP addresses of the different services. It's a good idea to separate your Swift deployment into a public "proxy" network and a "storage" network. Their network activity profiles are very different, and the separation gives the network jocks a chance to tune for the traffic patterns they see. It also offers a level of protection against sabotage from the outside. So, have at least two network devices on every single node: one for the storage network, one for the proxy network. Your own basic Chef node configuration should be able to arrange that.
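Purely as an illustration of what that looks like on an Ubuntu 10.04 node, here's a second, storage-only interface being added by hand; the interface name and addresses are made-up examples, not values from our deployment:

# Illustrative only: give the node a second NIC on the private storage network.
# Interface name and addresses below are examples, not real deployment values.
cat >> /etc/network/interfaces <<'EOF'
auto eth1
iface eth1 inet static
    address 10.0.0.11      # this node's address on the storage network (example)
    netmask 255.255.255.0
EOF
ifup eth1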
Second, we’re going to make life easy and use Ubuntu 10.04 LTS and the swift-core repo available from the Launchpad Swift site. Those repos are built in our Swift cookbooks.
Finally, you've got to set up a dedicated storage device of some kind on each of the Swift object datastore servers (not the proxies). You really ought to make this storage simple JBOD (Just a Bunch Of Disks). No RAID: the failover advantages of RAID are lost because Swift itself replicates the data to other nodes.
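Our storage recipe automates this, but for clarity, the by-hand version looks a lot like the standard Swift multi-node setup. The /dev/xvda4 device matches the data bag you'll see below; the mount point and mount options are assumptions you should check against the cookbook:

# Sketch of per-device prep on a storage node (the cookbook automates this).
apt-get install -y xfsprogs
mkfs.xfs -i size=1024 /dev/xvda4     # one plain disk/partition, no RAID
mkdir -p /srv/node/xvda4
echo "/dev/xvda4 /srv/node/xvda4 xfs noatime,nodiratime,nobarrier,logbufs=8 0 0" >> /etc/fstab
mount /srv/node/xvda4
chown -R swift:swift /srv/node       # the swift user comes with the packages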
Swift Server Chef Roles and the Rings: Based on the "Multi-node setup" of Swift as documented on the OpenStack website, we create two classes of servers, each with a different suite of software and a different hardware profile. The "swift-proxy" role proxies and load balances the HTTP requests from clients, and also handles authentication. (Swift allows external authentication and authorization services to be plugged in, but that's a custom coding job.) Proxy nodes can carry a special setting that allows account maintenance; I've sub-classed these in our Chef roles as "swift-auth" nodes. They manage sensitive information that is best kept away from end users, so we keep them on a private network with access to the storage network. Not much disk is required by these servers. The "swift-storage" nodes are the disk hogs. They handle the life-cycle of the three "rings": the stored objects ring, the container ring, and the user account data ring. The rings themselves are managed by nodes I've subclassed off the storage nodes and called "swift-util." On "swift-util" role nodes you run the ring rebuilding, which takes a lot of CPU, as well as other utilities for managing the cluster and its rings, like "st." All the services share some core packages, but each has its own Ubuntu "deb" packages.
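To make the "swift-util" role concrete: once the cluster is up, that's where you inspect the ring builder files and exercise the cluster with the "st" client. A quick smoke test might look like the following; the proxy hostname, port, account, and key are placeholders, not our real values:

# Inspect a ring builder file on a swift-util node (prints the ring's devices).
swift-ring-builder object.builder

# Exercise the cluster through a proxy with the "st" client (placeholder credentials).
st -A http://proxy01.swift.newgoliath.com:8080/auth/v1.0 -U system:root -K testpass stat
st -A http://proxy01.swift.newgoliath.com:8080/auth/v1.0 -U system:root -K testpass upload mycontainer somefile.txt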
With this info in mind, it becomes clear what our manifest needs to do. Let's walk through it now.
The Manifest, made manifest
There are two parts to the manifest: the spiceweasel YAML file, and the data bag with the ring and device definitions.
The spiceweasel deployment file – mytest_infrastructure.yaml
Here’s a sample spiceweasel YAML file for our purposes.
cookbooks:
- swift:
- apt:

roles:
- swift-storage:
- swift-proxy:
- swift-auth:
- openstack-base:
- swift-util:

data bags:
- mytest_cluster:

nodes:
- voxel 2:
  - role[swift-proxy]
  - --hostname_pattern=proxy --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
- voxel 2:
  - role[swift-auth]
  - --hostname_pattern=proxy --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
- voxel 5:
  - role[swift-storage]
  - --hostname_pattern=storage --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
- voxel 2:
  - role[swift-util]
  - --hostname_pattern=util --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
Let's break the file down section by section.
The first stanza calls knife to upload those two cookbooks into the chef server.
cookbooks:
- swift:
- apt:
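Spiceweasel doesn't do the work itself; it emits knife commands. This stanza expands to the equivalent of:

# knife commands generated for the cookbooks stanza
knife cookbook upload swift
knife cookbook upload apt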
The second section uploads the <rolename>.json or <rolename>.rb files into the chef server.
roles:
- swift-storage:
- swift-proxy:
- swift-auth:
- openstack-base:
- swift-util:
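In knife terms this stanza becomes one upload per role file (shown here assuming .rb role files; the same works with the .json form):

# knife commands generated for the roles stanza
knife role from file swift-storage.rb
knife role from file swift-proxy.rb
knife role from file swift-auth.rb
knife role from file openstack-base.rb
knife role from file swift-util.rb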
The third section indicates the data bags to be loaded into the server (much more on this below):
data bags:
- mytest_cluster:
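This expands to a bag creation plus an upload for each item file found under data_bags/mytest_cluster; the item file name here follows the conf.json example shown later in this post:

# knife commands generated for the data bags stanza
knife data bag create mytest_cluster
knife data bag from file mytest_cluster conf.json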
The last section indicates the nodes that will be built via Voxel's automated node creation. Note that we're building both classes of servers described above, broken out by role: two proxy servers, two auth servers, five storage servers, and two util servers. They use our "hostname_pattern" feature, which appends sequential two-digit numbers to the hostname and then tacks on the "domain_name" to create an FQDN. Our servers are offered in configuration sets (documented elsewhere) which define the hardware dedicated to each; 'config_id_group' selects one of those sets. 'facility' defines the Voxel facility where you'd like the server deployed, 'image_id' comes from the list of available OS images (this one being Ubuntu 10.04 LTS), and 'swap' is the amount of swap space that will be reserved on disk.
nodes:
- voxel 2:
  - role[swift-proxy]
  - --hostname_pattern=proxy --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
- voxel 2:
  - role[swift-auth]
  - --hostname_pattern=proxy --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
- voxel 5:
  - role[swift-storage]
  - --hostname_pattern=storage --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
- voxel 2:
  - role[swift-util]
  - --hostname_pattern=util --domain_name=swift.newgoliath.com --config_id_group=4 --facility=lga --image_id=39 --swap=2
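For the node stanzas, spiceweasel hands the run list and the option string to a provider-specific knife plugin and repeats the command for the requested count. Assuming a knife plugin for Voxel's provisioning API, the first entry would expand to something like the commands below; the subcommand name and flag handling are illustrative, not the literal plugin syntax:

# Illustrative expansion of the "voxel 2:" swift-proxy entry (plugin syntax assumed).
knife voxel server create --hostname_pattern=proxy --domain_name=swift.newgoliath.com \
  --config_id_group=4 --facility=lga --image_id=39 --swap=2 -r 'role[swift-proxy]'
knife voxel server create --hostname_pattern=proxy --domain_name=swift.newgoliath.com \
  --config_id_group=4 --facility=lga --image_id=39 --swap=2 -r 'role[swift-proxy]'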
The Swift specific data bag – “mytest_cluster.json”
You use this data bag to kick off the initial configuration of the rings and the devices. The Swift cookbook then watches it, and the rings are rebuilt and distributed through the zones in the order given by the "zone_config_order" setting below. This lets you maintain a high level of service while you reconfigure the rings to optimize cluster performance.
/chef-repo/data_bags/testcluster# less conf.json
{
  "id": "conf",
  "ring_common": {
    "zone_config_order": "1,2,3,4,5",
    "account_part_power": 18,
    "account_replicas": 3,
    "account_min_part_hours": 1,
    "container_part_power": 18,
    "container_replicas": 3,
    "container_min_part_hours": 1,
    "object_part_power": 18,
    "object_replicas": 3,
    "object_min_part_hours": 1
  },
  "rings": [
    { "status": "online", "ring_type": "account",   "cluster": "testcluster", "zone": "1", "hostname": "storage01", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "account",   "cluster": "testcluster", "zone": "2", "hostname": "storage02", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "account",   "cluster": "testcluster", "zone": "3", "hostname": "storage03", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "account",   "cluster": "testcluster", "zone": "4", "hostname": "storage04", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "account",   "cluster": "testcluster", "zone": "5", "hostname": "storage05", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "container", "cluster": "testcluster", "zone": "1", "hostname": "storage01", "port": 6001, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "container", "cluster": "testcluster", "zone": "2", "hostname": "storage02", "port": 6001, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "container", "cluster": "testcluster", "zone": "3", "hostname": "storage03", "port": 6001, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "container", "cluster": "testcluster", "zone": "4", "hostname": "storage04", "port": 6001, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "container", "cluster": "testcluster", "zone": "5", "hostname": "storage05", "port": 6001, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "object",    "cluster": "testcluster", "zone": "1", "hostname": "storage01", "port": 6000, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "object",    "cluster": "testcluster", "zone": "2", "hostname": "storage02", "port": 6000, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "object",    "cluster": "testcluster", "zone": "3", "hostname": "storage03", "port": 6000, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "object",    "cluster": "testcluster", "zone": "4", "hostname": "storage04", "port": 6000, "device": "xvda4", "weight": 100, "meta": "install" },
    { "status": "online", "ring_type": "object",    "cluster": "testcluster", "zone": "5", "hostname": "storage05", "port": 6000, "device": "xvda4", "weight": 100, "meta": "install" }
  ]
}
The file has two major parts. First, there's a short stanza of common configuration options that define the basics of the cluster and the rings. Second, there are the details of each ring, with its hardware and zones.
Here’s some detailed explanation:
The first stanza:
"id": "conf", "ring_common": { "zone_config_order": "1,2,3,4,5", "account_part_power": 18, "account_replicas": 3, "account_min_part_hours": 1, "container_part_power": 18, "container_replicas": 3, "container_min_part_hours": 1, "object_part_power": 18, "object_replicas": 3, "object_min_part_hours": 1 },
Most of these settings you're unlikely to change once your cluster is up and running. Of note here: "zone_config_order" is unique to Voxel's system and controls how configuration changes are applied. Configuration changes can be quite costly to the system (they can really gum up IO), so it's important to take a phased approach to applying ring configuration changes. See the OpenStack Swift documentation for the "part_power", "replicas", and "min_part_hours" settings.
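Those three values map straight onto swift-ring-builder's create arguments, so building fresh builder files from the "ring_common" stanza above amounts to:

# part_power=18, replicas=3, min_part_hours=1, per the ring_common stanza
swift-ring-builder account.builder create 18 3 1
swift-ring-builder container.builder create 18 3 1
swift-ring-builder object.builder create 18 3 1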
The second major part of the data bag file is the detail for each of the three rings. We'll take the "account" ring here as our example.
"rings": [ { "status": "online", "ring_type": "account", "cluster": "testcluster", "zone": "1", "hostname": "storage01", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" } , { "status": "online", "ring_type": "account", "cluster": "testcluster", "zone": "2", "hostname": "storage02", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" } , { "status": "online", "ring_type": "account", "cluster": "testcluster", "zone": "3", "hostname": "storage03", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" } , { "status": "online", "ring_type": "account", "cluster": "testcluster", "zone": "4", "hostname": "storage04", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" } , { "status": "online", "ring_type": "account", "cluster": "testcluster", "zone": "5", "hostname": "storage05", "port": 6002, "device": "xvda4", "weight": 100, "meta": "install" } ,
If you’re familiar with the swift-ring-builder tool, you’ll notice all the familiar parameters – but I’ve added two: “status” and “cluster.”
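Each entry maps onto one swift-ring-builder add call, followed by a rebalance once all the devices are in. The cookbook resolves each hostname to its storage-network address; the 10.0.0.x IPs below are placeholders for those addresses:

# One "add" per data bag entry; IPs are placeholders for the storage-network addresses.
swift-ring-builder account.builder add z1-10.0.0.11:6002/xvda4_install 100
swift-ring-builder account.builder add z2-10.0.0.12:6002/xvda4_install 100
# ... zones 3 through 5 likewise ...
swift-ring-builder account.builder rebalance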
"Status" lets you take a config line out of service without abandoning it. You'd use this if you need to do serious work on a piece of hardware or a network segment and it will be unreliable for a while. Setting "status": "offline" lets you work on the configured item without the ring counting the data on it toward a replica. Once you set it back to "status": "online", Swift will begin replicating to it again and the proxies will access it for data.
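How our cookbook translates "status": "offline" into ring operations is internal to our recipes, but conceptually it's the same as draining a device by hand, something like:

# Conceptual equivalent of flipping a device to "offline": stop directing new
# partitions to it by zeroing its weight, then rebalance. (The cookbook does the
# equivalent automatically; the search value below is an example.)
swift-ring-builder account.builder set_weight z2-10.0.0.12:6002/xvda4 0
swift-ring-builder account.builder rebalance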
"cluster" indicates the name of your cluster, so Chef has the option of deploying and managing multiple clusters without the ring config information getting all confused.
Note also that I set the "meta" setting to "install"; feel free to change it as you modify your cluster and manage it with Chef. For example, you might want to add a new storage node and device once the cluster is up, and you would mark that entry with a new "meta" value.
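After editing the data bag (say, adding a new device entry with a fresh "meta" value), you push it back to the Chef server the usual way and let the cookbook pick it up on its next run; the bag and file names here follow the examples in this post:

# Re-upload the edited data bag item; the cookbook acts on the change at its next run.
knife data bag from file mytest_cluster conf.json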
And on to Victory
This concludes our review of the architecture and the "manifest" concept. Coming up in July is Part 3, where I'll review just how our Chef cookbooks make use of the "manifest" and integrate with some of our local systems, making servicing our customers a breeze!
So long, and thanks for reading!