Getting dynamic on the cloud

June 16, 2010

Cloud based voice services need to dynamically scale up or down in response to load. In this series we will build custom virtual servers, then destroy and re-create them on demand.

###The Cloud Advantage over traditional servers

  • Provisioning can take minutes instead of days
  • Everything is outsourced (servers, network, power, and facilities)
  • Upgrades can be done with a few clicks (as opposed to having to deal with traditional hardware/network vendors)
  • Less exposure to hardware/network failures (in my experience the sheer size of cloud vendors makes them less prone to failures than most other providers)
  • Less exposure to underprovisioning/overprovisioning risks (when you buy boxes it is hard to know how many you will be using: buy too few and you will be underpowered, buy too many and you are wasting money)
  • Cloud servers are usually provisioned month to month, and some vendors like Rackspace and Amazon will bill by the hour with no commitment whatsoever

The last point is very important. The fact that services are billed by the hour opens up interesting scenarios, specifically the ability to scale down to save money on infrastructure during off-peak hours (or to scale up during spikes).
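
To make the saving concrete, here is a rough sketch of the arithmetic. The hourly demand profile and the hourly rate are made-up numbers for illustration only, not real prices from any vendor.

```ruby
# Hypothetical figures: servers needed in each hour of one day,
# and an assumed flat price per server-hour.
SERVERS_NEEDED_PER_HOUR = [1] * 8 + [4] * 10 + [2] * 6   # 24 entries
HOURLY_RATE = 0.015

# Static provisioning keeps the peak number of servers running all day.
def static_server_hours(demand)
  demand.max * demand.size
end

# Hourly scaling only pays for what each hour actually needs.
def scaled_server_hours(demand)
  demand.sum
end

static = static_server_hours(SERVERS_NEEDED_PER_HOUR)
scaled = scaled_server_hours(SERVERS_NEEDED_PER_HOUR)
puts "static: #{static} server-hours, scaled: #{scaled} server-hours"
puts format("daily saving: $%.2f", (static - scaled) * HOURLY_RATE)
```

With this made-up profile, the static setup pays for 96 server-hours a day while the scaled one pays for 60.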

There are no cloud server providers that will allow you to shut down or pause idle virtual servers to save money. The next best thing is to delete and re-build the server on demand. Traditionally, the need to provision a great number of systems was dealt with using image files: you provision one server manually and then save the complete system state as an image file. If your requirements rarely change, it is a good way to solve the problem. However, when you have to deal with dozens of little changes here and there, you begin to need dozens of image files that are mostly the same except for some small parts. Sounds like something nasty.

###A brief introduction to Chef

Enter Chef, built by Opscode. Chef lets you use a Ruby-based DSL to describe precisely how a system is going to be built. Chef works by reading a spec file in JSON format. The spec file enumerates the software packages that are going to be installed on the system. These packages are called cookbooks in Chef, and there are dozens of them on the Opscode cookbooks site. Most cookbooks have configurable attributes; for instance, the MySQL cookbook allows you to set the root password for the database server. To do this you pass the password as an attribute, and attributes also go into the spec file. This spec file is called json-attributes in the Chef documentation.
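
As a rough illustration of what such a spec file contains, the sketch below builds one from a plain Ruby hash and prints it as JSON. The helper name build_spec is hypothetical, and the attribute name server_root_password under mysql is an assumption about the MySQL cookbook, so check the cookbook's own documentation before relying on it.

```ruby
require "json"

# Build the spec as a plain Ruby hash: the list of recipes to run
# plus attributes grouped under the cookbook they belong to.
def build_spec(recipes, attributes = {})
  attributes.merge("recipes" => recipes)
end

spec = build_spec(
  ["mysql::server"],
  # Assumed attribute name; verify it against the cookbook you use.
  { "mysql" => { "server_root_password" => "change-me" } }
)

# chef-solo reads a file with this content through its -j option.
puts JSON.pretty_generate(spec)
```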

This time we will focus on Chef-solo, the little brother of Chef-server. While Chef-solo is fully capable of building all kinds of systems, it is not able to manage a great number of Chef nodes, nor does it come with the nice web interface that Chef-server has. In Chef, a node refers to the system that is going to be configured.

Chef is said to be idempotent: no matter how many times you run a spec on a given node, you will always get the same result. If Chef is told that the target system needs to have a working installation of Apache, and it was already installed by a previous Chef run, then it will be left alone. However, if the spec file changed (say we want to have openvpn on that system), then the next time Chef runs it will install openvpn using the proper package manager for our platform (apt, yum, etc).
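
The idea can be shown with a toy model. This is not how Chef is implemented; ToyNode is a hypothetical stand-in whose only state is a set of package names, and "installing" just records the action.

```ruby
require "set"

# A toy stand-in for a node: just the set of installed packages.
class ToyNode
  attr_reader :installed, :actions

  def initialize
    @installed = Set.new
    @actions = []   # log of what was actually done
  end

  # Converge the node towards the wanted state; only missing
  # packages trigger an install, everything else is left alone.
  def converge(wanted)
    wanted.each do |pkg|
      next if @installed.include?(pkg)
      @installed << pkg
      @actions << "install #{pkg}"
    end
    self
  end
end

node = ToyNode.new
node.converge(["apache2"])
node.converge(["apache2"])             # second run: nothing to do
node.converge(["apache2", "openvpn"])  # spec changed: only openvpn is added
puts node.actions.inspect
```

Three runs produce only two actions, one install per package, which is the behaviour described above.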

###DYNAMIC IP ADDRESS WOES & THE DDNS SOLUTION

In this series we will use the Rackspace cloud, but everything discussed here should apply to Amazon EC2 servers as well. To create virtual servers there are two roads. One is to log into the provider's control panel and add a new server, choosing its size and operating system (and probably a base image file if we are on Amazon). The second road is to do it programmatically: Rackspace offers a REST based API while Amazon offers a SOAP one. There are two Ruby gems that can save us a lot of time: one is the RightScale tools and the other is Fog, which we will be using for this article.

When we create a server on the cloud, we usually do not know what IP address we are going to get until after we have given the order to do so. We could store this IP address somewhere so that we can refer to the system at a later time. Once assigned to a virtual server the IP address should never change, unless we delete and re-create the server, which is exactly what we will be doing here, probably several times a day. The same situation could arise if one of the virtual servers or the underlying physical hardware/network failed (this has never happened to me to this date, but I know people to whom it has). It is important to note that Amazon has a facility known as Elastic IP that could open a different way to deal with this situation. One way to work around this ever-changing IP address problem is to not rely on the IP address but to use Dynamic DNS. This is going to be our scenario:

  • An active Rackspace Cloud account with an API key (or an active Amazon EC2 account and API key)
  • One BIND 9 server to act as DNS master that will receive updates from the dynamic nodes
  • A “builder” system with a working Ruby, rubygems and the fog gem installed; most likely our workstation, but it could also be the same system as the DNS master
  • A “target” system that is going to be created in our Rackspace account using Fog and configured using Chef

####GETTING READY FOR DYNAMIC DNS

Traditional dynamic DNS providers like dyndns.org offer an easy, reliable and inexpensive way to assign DNS names to devices with ever-changing IP addresses. However, they do come with limitations: more often than not your desired hostname is unavailable, and running it under your own domain name is generally out of the question. Thankfully ISC Bind 8 and 9 come with everything you need to roll your own Dynamic DNS service. We are going to generate a key on our workstation for one dynamic node and then copy it to our Bind 9 server. Bind allows dynamic updates of IP information from clients that authenticate with a key. In this example we will use TSIG keys and HMAC-MD5 security.

To generate a key we need the dnssec-keygen utility, which comes with Bind. Let's install it on our workstation using something like:

#we assume a debian/ubuntu based workstation, use something like yum install bind if you are on a Red Hat/Centos system or the appropriate command for your platform

$   aptitude install bind9 -y

IMPORTANT: make sure that your DNS server and your dynamic nodes run the same version of both BIND and nsupdate (provided by dnsutils). The reason for this is that keys created with different versions of dnssec-keygen will have a different version number and therefore be incompatible.

Now let's generate the key. This can be a bit slow depending on your machine.

#be sure to replace yourhostname.yourdomainname.com. with a meaningful fully qualified domain name, starting with the hostname of the node

$ dnssec-keygen -a HMAC-MD5 -b 512 -n USER yourhostname.yourdomainname.com.

The previous command will generate two files, one ending in .key and the other ending in .private:

user@linux:/tmp# ls -l
-rw------- 1 root root 130 Jul 22 01:20 Kyourhostname.yourdomainname.com.+157+20820.key
-rw------- 1 root root 156 Jul 22 01:20 Kyourhostname.yourdomainname.com.+157+20820.private

Let's rename our security keys: change the one ending in .key to tsig.key, then change the one ending in .private to tsig.private.

user@linux:/tmp# ls -l
-rw------- 1 root root 130 Jul 22 01:20 tsig.key
-rw------- 1 root root 156 Jul 22 01:20 tsig.private

Now open the tsig.private file. It should look similar to this:

user@linux:/tmp# cat tsig.private

Private-key-format: v1.3
Algorithm: 157 (HMAC_MD5)
Key: YrVW9yP6gNMA7VbcU/r2mSIwYnFj/XkCDd6QuqOHE26/ipnrPy+eXrKrUyaFhB2XWNdVLUX7QCUkfhg4zN5YiA==
Bits: AAA=
Created: 20100727021736
Publish: 20100727021736
Activate: 20100727021736

The TSIG key is on the third line: everything after Key: up to the end of the line, including the trailing “==”. Let's copy it. Now we need to add this key to Bind. On a debian/ubuntu system there is a /etc/bind/keys.conf file; other platforms might have different preferred locations for keys, but you can always add it to the main bind/named configuration file. Let's create a key stanza based on the example below, replacing YOURPUBLICKEYHERE with the key we just copied. (Despite the placeholder's name, a TSIG key is a shared secret rather than a public key, so keep it private.)

user@linux:/tmp# cat /etc/bind/keys.conf

key yourhostname.yourdomainname.com. {
  algorithm HMAC-MD5;
  secret "YOURPUBLICKEYHERE";
};
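
Copying the key by hand is error-prone, so the two steps above can be sketched in Ruby. The helper names tsig_secret and key_stanza are hypothetical, and the sample key is a dummy value rather than a real secret.

```ruby
# Pull the shared secret out of the text of a .private file.
def tsig_secret(private_text)
  line = private_text.lines.find { |l| l.start_with?("Key:") }
  raise ArgumentError, "no Key: line found" unless line
  line.sub("Key:", "").strip
end

# Render a key stanza in the same layout as the keys.conf example.
def key_stanza(key_name, secret, algorithm = "HMAC-MD5")
  <<~CONF
    key #{key_name} {
      algorithm #{algorithm};
      secret "#{secret}";
    };
  CONF
end

# Dummy file content for illustration; use File.read("tsig.private") for real.
sample = <<~PRIVATE
  Private-key-format: v1.3
  Algorithm: 157 (HMAC_MD5)
  Key: c2FtcGxlLXNlY3JldA==
  Bits: AAA=
PRIVATE

puts key_stanza("yourhostname.yourdomainname.com.", tsig_secret(sample))
```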

We are almost done with the Bind server. Now we need to tell Bind what permissions the key created above has. On the debian/ubuntu Bind 9 server we will use the file /etc/bind/named.conf.local, which is where the distribution recommends that user settings go. Our zone listings are there, so let's edit our example zone listing in /etc/bind/named.conf.local:

user@linux:/tmp# cat /etc/bind/named.conf.local
...

zone "yourdomainname.com." {
  type master;
  file "/etc/bind/zones/master/yourdomainname.com.db";
  update-policy {
    grant yourhostname.yourdomainname.com. name yourhostname.yourdomainname.com. A TXT;
  };  
};

This makes sure that any given key is only able to make updates for its own assigned name (yourhostname.yourdomainname.com. in this example). Restart the Bind server now, as we are done with it.
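
With more than one dynamic node, each node gets its own key and its own grant line. A small sketch of generating the update-policy block, assuming each key is named after the host it belongs to as in the example above (the helper names and the web1/web2 hostnames are hypothetical):

```ruby
# One grant line per dynamic node: the key named after the host
# may only update A and TXT records for that same name.
def grant_line(fqdn)
  "grant #{fqdn} name #{fqdn} A TXT;"
end

# Wrap the grant lines in an update-policy block, indented to
# sit inside a zone stanza.
def update_policy(fqdns)
  body = fqdns.map { |fqdn| "    #{grant_line(fqdn)}" }.join("\n")
  "  update-policy {\n#{body}\n  };"
end

puts update_policy(%w[web1.yourdomainname.com. web2.yourdomainname.com.])
```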

Please keep your TSIG files somewhere handy, as we will use them again when we create our virtual server. We will focus now on the client part of DDNS: the script that will be executed from the dynamic node at boot time so that the DNS records are updated with the new IP.
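
To give an idea of what such a client does, here is a minimal sketch that builds the command script nsupdate reads from standard input. It is only an illustration, not the script the cookbook installs. The nameserver name ns1.yourdomainname.com, the 60 second TTL and the 203.0.113.10 address are all assumptions.

```ruby
# Build the command script that nsupdate reads from standard input.
def nsupdate_script(server, zone, fqdn, ip, ttl = 60)
  [
    "server #{server}",
    "zone #{zone}",
    "update delete #{fqdn} A",            # drop the stale address, if any
    "update add #{fqdn} #{ttl} A #{ip}",  # publish the new one
    "send"
  ].join("\n") + "\n"
end

script = nsupdate_script("ns1.yourdomainname.com", "yourdomainname.com.",
                         "yourhostname.yourdomainname.com.", "203.0.113.10")
puts script

# On the node this could be piped to nsupdate together with the key, e.g.:
#   IO.popen(["nsupdate", "-k", "/etc/dyndns_keys/tsig.private"], "w") { |io| io.write(script) }
```

Deleting the old A record before adding the new one keeps the name pointing at a single address after each rebuild.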

###SETTING THINGS UP FOR CHEF-SOLO

Cloud server APIs allow us to create files on the server that is going to be built. Let's create the following three files on our builder or workstation machine:

The Chef Solo configuration file, solo.rb

user@linux:/tmp# cat solo.rb
#This is the chef config file, it sets up certain defaults for chef-solo execution
file_cache_path  "/tmp/chef-solo"
cookbook_path    ["/var/chef-solo/site-cookbooks", "/var/chef-solo/cookbooks"]
log_level        :info
log_location     STDOUT
ssl_verify_mode  :verify_none

The Chef JSON attributes file, dna.json

user@linux:/tmp# cat dna.json 
{
  
  "gems": [
    { "name": "rake" }
  ],  
  "recipes": [
    "build-essential", 
    "gems",
    "bind::dynamic_dns",
    "dnsutils"

  ]
}

And finally, the bootstrap shell script, bootstrap.sh

user@linux:/tmp# cat bootstrap.sh
#!/bin/bash

apt-get update
apt-get install git-core libncurses5-dev ruby ruby-dev rdoc libruby-extras  -y  

#get a working rubygems install
cd /tmp
wget http://rubyforge.org/frs/download.php/69365/rubygems-1.3.6.tgz
tar xvf rubygems-1.3.6.tgz
cd rubygems-1.3.6
ruby setup.rb
cp /usr/bin/gem1.8 /usr/bin/gem
cd

#lets get some basic gems installed
gem install ohai chef github --no-rdoc --no-ri

#get opscode repos
mkdir -p /var/chef-solo
git clone git://github.com/opscode/cookbooks.git
cp -r cookbooks /var/chef-solo

#get our repos

git clone git://github.com/renemendoza/site-cookbooks.git
cp -r site-cookbooks /var/chef-solo
cd /var/chef-solo/site-cookbooks
gh submodule init
gh submodule update
cd


#call out chef
chef-solo -c /etc/chef/solo.rb -j /etc/chef/dna.json

sh /tmp/ddns_update

By now you should have the following in your project directory

user@linux:~/cloud_automation/article_01$ ls -l
total 12
-rwxr-xr-x 1 user user  385 2010-06-11 19:56 bootstrap.sh
-rw-r--r-- 1 user user  482 2010-07-19 15:38 dna.json
-rw-r--r-- 1 user user  209 2010-07-19 15:38 solo.rb
-rw------- 1 user user  130 2010-07-19 19:56 tsig.key
-rw------- 1 user user  229 2010-07-19 19:56 tsig.private

  • solo.rb sets the defaults for the chef-solo run (cookbook paths, logging)
  • dna.json lists all the recipes due for installation, along with any parameters required
  • bootstrap.sh will be executed after the server is done building. Its purpose is to:
      • Install a proper development environment with ruby, rubygems, and the chef and ohai gems
      • Download the full cookbook database from Opscode
      • Download the customized bind cookbook we are using for this example
      • Launch chef-solo using the parameters defined in the previous two files, solo.rb and dna.json

###THE SERVER CREATION SCRIPT

Users of Chef Server and the newly released Opscode Platform have it much easier when it comes to creating servers, as Opscode has published a tool known as Knife. Knife does many tricks, and one of them is creating servers on Rackspace and Amazon EC2, among other providers. Knife also bootstraps servers, but since we are dealing with Chef-solo we won't cover Knife here. We will instead rely on the same tool that Knife uses internally to create servers: Fog.

Having created the three previous files, we are almost ready to begin building Rackspace Cloud servers fully configured by Chef. Since we will be using Fog, Net::SFTP and Net::SSH, we need to have those gems installed on our workstation:

#lets install net/sftp, net/ssh and fog

$ sudo gem install net-sftp
$ sudo gem install net-ssh
$ sudo gem install fog

Now we are going to need the SSH public key of the builder (or workstation) system. It will be used to log into your newly created server without typing a password. Be sure to set the value of SSH_KEY in the script below to the contents of your workstation's SSH public key. Also be sure to set the values of TSIG_PRIVATE_FILENAME and TSIG_KEY_FILENAME to the names of the TSIG files we generated before, where TSIG_PRIVATE_FILENAME is the name of the file ending in .private and TSIG_KEY_FILENAME is the one ending in .key.

Finally, the server creation script:

user@linux:/tmp# cat fog_create.rb
#!/usr/bin/ruby

require "rubygems"
require 'yaml'
require "fog"
require "net/sftp"
require "net/ssh"

TSIG_PRIVATE_FILENAME     = "name/of/your/TSIG/file/ending/with/.private"
TSIG_KEY_FILENAME         = "name/of/your/TSIG/file/ending/with/.key"
SSH_KEY                   = "REPLACE THIS WITH THE CONTENTS OF YOUR PUBLIC SSH KEY FILE"
RACKSPACE_USERNAME        = "REPLACE THIS WITH YOUR RACKSPACE CLOUD USERNAME"
RACKSPACE_API_KEY         = "REPLACE THIS WITH YOUR RACKSPACE CLOUD API KEY"
SERVER_NAME               = "myserver.mydomain.tld" #CHEF REQUIRES THIS TO BE A FULLY QUALIFIED DOMAIN NAME TO WORK
RACKSPACE_FLAVOR_ID       = 1 #  The smallest (256MB ram) server available in Rackspace Cloud
RACKSPACE_IMAGE_ID        = 49 #  Ubuntu 10.04 LTS

@connection = Fog::Rackspace::Servers.new(
  :rackspace_username => RACKSPACE_USERNAME,
  :rackspace_api_key  => RACKSPACE_API_KEY
)

server = @connection.servers.create(
  :flavor_id   => RACKSPACE_FLAVOR_ID,
  :image_id    => RACKSPACE_IMAGE_ID,
  :name        => SERVER_NAME,
  :personality => [ 
    { 'path' => '/root/.ssh/authorized_keys', 'contents'  => SSH_KEY },
    { 'path' => '/etc/chef/solo.rb',          'contents'  => File.read("solo.rb") },
    { 'path' => '/etc/chef/dna.json',         'contents'  => File.read("dna.json") },
    { 'path' => '/root/bootstrap.sh',         'contents'  => File.read("bootstrap.sh") }
  ]
)
server.wait_for { ready? }

Net::SFTP.start( server.addresses['public'].first, 'root') do |sftp|
  sftp.mkdir! "/etc/dyndns_keys"
  sftp.upload!(TSIG_KEY_FILENAME, "/etc/dyndns_keys/tsig.key")
  sftp.upload!(TSIG_PRIVATE_FILENAME, "/etc/dyndns_keys/tsig.private")
end

Net::SSH.start( server.addresses['public'].first, 'root') do |ssh|
  ssh.exec("sh bootstrap.sh")
end

This script logs into Rackspace using your credentials and creates a server with the parameters provided. The entries under the personality key tell Rackspace which files to create on the new server. The script then waits until the server is ready, copies the TSIG files over SFTP, and finally executes the bootstrap script, which triggers the Chef-solo configuration process.

Now execute your script and watch your server get built:
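
Providers usually limit how many personality files may be sent and how large each one can be. The sketch below builds the personality array from a mapping of remote paths to local files and checks it before any API call is made. The helper name build_personality is hypothetical, and the default limits of 5 files and 10 KB each are assumptions, so check your provider's API documentation for the real values.

```ruby
require "tempfile"

# Turn a { remote_path => local_path } mapping into the array of
# 'path'/'contents' hashes that is passed as :personality.
# max_files and max_bytes are assumed limits, not verified ones.
def build_personality(files, max_files = 5, max_bytes = 10 * 1024)
  raise ArgumentError, "too many files (#{files.size})" if files.size > max_files
  files.map do |remote_path, local_path|
    contents = File.read(local_path)
    if contents.bytesize > max_bytes
      raise ArgumentError, "#{local_path} is larger than #{max_bytes} bytes"
    end
    { "path" => remote_path, "contents" => contents }
  end
end

# Small demonstration with a throwaway file standing in for solo.rb.
Tempfile.create("solo.rb") do |f|
  f.write("log_level :info\n")
  f.flush
  p build_personality({ "/etc/chef/solo.rb" => f.path })
end
```

Failing early on the workstation is cheaper than finding out after a server has already been created and billed.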

$   ruby fog_create.rb

After some minutes it will be ready. Let's try it. (Make sure you either try this from the same machine running the DNS server or have your workstation's resolver set to your DNS server. If you are configured to use your ISP’s DNS then you may face DNS name propagation issues.)

$ ping  yourhostname.yourdomainname.com
PING yourhostname.yourdomainname.com (174.143.211.217) 56(84) bytes of data.
64 bytes from 174-143-211-217.static.cloud-ips.com (174.143.211.217): icmp_seq=1 ttl=63 time=0.000 ms
64 bytes from 174-143-211-217.static.cloud-ips.com (174.143.211.217): icmp_seq=2 ttl=63 time=0.000 ms
...

You will of course get a different IP.

Delete the server using Fog or your control panel, re-run fog_create.rb, wait until it is ready and ping again. It should work.
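
If changing the workstation resolver is inconvenient, the DNS master can also be queried directly. A minimal sketch using Ruby's standard resolv library follows. The helper name addresses_from is hypothetical, and 192.0.2.53 is a placeholder for the address of your own DNS master.

```ruby
require "resolv"

# Ask one specific nameserver for the addresses of a name, bypassing
# whatever resolver the workstation is normally configured to use.
# resolver_class is injectable so the lookup can be tested offline.
def addresses_from(nameserver, hostname, resolver_class = Resolv::DNS)
  resolver = resolver_class.new(:nameserver => [nameserver])
  resolver.getaddresses(hostname).map(&:to_s)
ensure
  resolver.close if resolver.respond_to?(:close)
end

# Example (needs network access and the real address of your master):
#   puts addresses_from("192.0.2.53", "yourhostname.yourdomainname.com")
```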

A final warning: it is quite possible that while deleting and re-creating servers Rackspace will recycle IP addresses and you will get one that you have had assigned before. The problem with this is that your machine's ssh client might already have that IP address in your known_hosts file, and will then prevent you from connecting to the new machine because it wrongly thinks there is a security threat. You can fix this quite easily by adding something like this to your .ssh/config file:

  Host *.yourdomainname.com
    StrictHostKeyChecking no

As before, substitute yourdomainname.com with a meaningful value.


Reference:

  • http://akitaonrails.com/2010/02/20/cooking-solo-with-chef
  • http://wiki.opscode.com/display/chef/Documentation
  • The super helpful fellows from #chef in irc.freenode.net
  • http://github.com/geemus/fog
  • http://www.rackspacecloud.com/cloud_hosting_products/servers/api
  • http://linux.yyz.us/nsupdate/
  • http://linux.yyz.us/dns/ddns-server.html
  • http://ops.ietf.org/dns/dynupd/secure-ddns-howto.html
