Sunday, 31 March 2013

Chef: Not for me

It will make life easy they said...

Firstly let me admit right up front that I don't program in Ruby. However circumstances dictated that Chef, from Opscode, would occupy a large part of my time this last month. The result is that Chef and I have developed a mutual dislike.

Using a config management system makes sense. It spares you the drudgery of repeated tinkering to get multiple servers to the exact same state each time. And it helps with scaling. Saves time and creates a repeatable process? What's not to like?

How about an overly complicated model? The first hint is the cute naming: Cookbooks, Recipes and Knife, which don't map to the mental models I have of those items. Annoying, but not too serious. At least not as serious as requiring a Chef Server and a Chef Workstation to manage Chef Nodes, which are also Chef Clients. Sometimes.

The Chef Server is supposed to be a central repository of the final state. That information is uploaded to the server from the Workstation. Immediately there is potential for confusion. Did the latest changes get pushed to the Server? The rationale is that the Nodes/Clients can periodically request the latest expected state from the Server. So how do I test new Recipes without pushing to the Server and potentially disrupting the Nodes/Clients who then read the new expected state? Why do I have to think about that?
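For completeness: chef-solo, which ships with the Chef client, can apply cookbooks from a local directory with no Server involved at all, which sidesteps that problem. A rough sketch, with made-up paths and recipe name:

```shell
# Apply a recipe locally with chef-solo instead of via the Chef Server.
# All paths and the 'webserver' recipe name are illustrative.
mkdir -p /tmp/chef-solo/cookbooks

# Tell chef-solo where to find cookbooks.
cat > /tmp/chef-solo/solo.rb <<'EOF'
cookbook_path "/tmp/chef-solo/cookbooks"
EOF

# The run list this machine should converge to.
cat > /tmp/chef-solo/node.json <<'EOF'
{ "run_list": [ "recipe[webserver]" ] }
EOF

# Converge locally; nothing touches the Server, so the real Nodes/Clients
# never see half-finished work. (Commented out here since it needs
# chef-solo installed and a real cookbook in place.)
# chef-solo -c /tmp/chef-solo/solo.rb -j /tmp/chef-solo/node.json
```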

But it gets worse.

I created a local environment in which to run my Chef Recipes: a shiny new Ubuntu 12.04 LTS, all updates applied, running as a VirtualBox guest and snapshotted so I could quickly get back to a base install. Part of that process was deleting the Node and Client information from the Server using knife ('knife node delete' and 'knife client delete'). But knife kept throwing an error when I tried to delete the node.
chef knife delete_object': undefined method `destroy'
Sigh.

So I went ahead and upgraded Chef, and my recipes, such as they were, stopped working. Mutter! After some investigation I discovered that Chef 11 had changed the way attributes were defined because...
Chef would load attributes files in essentially random order
Given that the words 'random order' appeared in the reasons for the change, I can appreciate the need for a change that breaks things. Also keep in mind that I was still learning Chef, so my Recipes weren't the best. Nevertheless, this kind of thing is annoying.

I continued working with Chef, but kept losing time and hair over seemingly stupid things. Look at the following two excerpts:-
----- excerpt one -----
if recipe_config['ssl_enabled']
    template ssl_cert_file do
        source 'ssl.crt'
        owner 'root'
        group 'root'
        mode '0600'
        variables(
            :content => certificate_secrets[node.chef_environment]['cert']
        )
        notifies :restart, resources(:service => "nginx"), :delayed
    end
end
----- excerpt two -----
if recipe_config['ssl_enabled']
    template ssl_cert_file do
        source 'ssl.crt'
        owner 'root'
        group 'root'
        mode '0600'
        variables(
            :content => certificate_secrets[node.chef_environment]['cert']
        )
    end
    notifies :restart, resources(:service => "nginx"), :delayed
end
The 'notifies' notification for the nginx resource is outside the template block in the second excerpt. Chef chokes on that, claiming that it
Cannot find a resource for notifies
What is the difference? Obviously this stems from my lack of understanding of Chef or Ruby, or both, but it seems unreasonable. I asked around the office and even those who had used Ruby couldn't tell me what the issue was. (In hindsight, I believe 'notifies' is a method on the resource object itself, so it only exists inside the resource's block; at the top level of a recipe, Chef's DSL treats an unknown bare word as a resource lookup, which would explain the message.) So I just used excerpt one.

And the voice in my head said "You're going to regret this."

After further work with my Recipe I reached a state where I was ready to test what I had built on AWS (this was for an existing client that was already using AWS). Chef's Knife tool has a handy plugin for managing EC2 instances. It allowed me to create and deploy with a single command:-
knife ec2 server create -I 'ami-b6089bdf' -x ubuntu -i ~/.ssh/awskey.pem -r 'role[webserver]' -E stage -f t1.micro -G 'web-service' --region 'us-east' -Z 'us-east-1c'
This command will create a new t1.micro instance from the specified image, and deploy it as a 'webserver'. The other parameters specify the user to log in as, the key file for authentication, the region, the availability zone and the security group. All very useful, except I kept getting an error:-
.../gems/excon-0.20.0/lib/excon/socket.rb:42:in `getaddrinfo': getaddrinfo: nodename nor servname provided, or not known (SocketError) (Excon::Errors::SocketError)
from .../gems/excon-0.20.0/lib/excon/socket.rb:42:in `connect'
from .../gems/excon-0.20.0/lib/excon/ssl_socket.rb:72:in `connect'
from .../gems/excon-0.20.0/lib/excon/socket.rb:32:in `initialize'
from .../gems/excon-0.20.0/lib/excon/ssl_socket.rb:8:in `initialize'
from .../gems/excon-0.20.0/lib/excon/connection.rb:341:in `new'
from .../gems/excon-0.20.0/lib/excon/connection.rb:341:in `socket'
from .../gems/excon-0.20.0/lib/excon/connection.rb:87:in `request_call'
from .../gems/excon-0.20.0/lib/excon/middlewares/mock.rb:79:in `request_call'
from .../gems/excon-0.20.0/lib/excon/middlewares/instrumentor.rb:22:in `request_call'
from .../gems/excon-0.20.0/lib/excon/middlewares/base.rb:15:in `request_call'
from .../gems/excon-0.20.0/lib/excon/middlewares/base.rb:15:in `request_call'
from .../gems/excon-0.20.0/lib/excon/connection.rb:220:in `request'
from .../gems/excon-0.20.0/lib/excon/middlewares/idempotent.rb:11:in `error_call'
from .../gems/excon-0.20.0/lib/excon/middlewares/base.rb:10:in `error_call'
from .../gems/excon-0.20.0/lib/excon/connection.rb:236:in `rescue in request'
from .../gems/excon-0.20.0/lib/excon/connection.rb:197:in `request'
from .../gems/excon-0.20.0/lib/excon/middlewares/idempotent.rb:11:in `error_call'
from .../gems/excon-0.20.0/lib/excon/middlewares/base.rb:10:in `error_call'
from .../gems/excon-0.20.0/lib/excon/connection.rb:236:in `rescue in request'
from .../gems/excon-0.20.0/lib/excon/connection.rb:197:in `request'
from .../gems/excon-0.20.0/lib/excon/middlewares/idempotent.rb:11:in `error_call'
from .../gems/excon-0.20.0/lib/excon/middlewares/base.rb:10:in `error_call'
from .../gems/excon-0.20.0/lib/excon/connection.rb:236:in `rescue in request'
from .../gems/excon-0.20.0/lib/excon/connection.rb:197:in `request'
from .../gems/fog-1.10.0/lib/fog/core/connection.rb:21:in `request'
from .../gems/fog-1.10.0/lib/fog/aws/compute.rb:384:in `_request'
from .../gems/fog-1.10.0/lib/fog/aws/compute.rb:379:in `request'
from .../gems/fog-1.10.0/lib/fog/aws/requests/compute/describe_images.rb:54:in `describe_images'
from .../gems/fog-1.10.0/lib/fog/aws/models/compute/images.rb:49:in `all'
from .../gems/fog-1.10.0/lib/fog/aws/models/compute/images.rb:55:in `get'
from .../gems/knife-ec2-0.6.2/lib/chef/knife/ec2_server_create.rb:360:in `ami'
from .../gems/knife-ec2-0.6.2/lib/chef/knife/ec2_server_create.rb:367:in `validate!'
from .../gems/knife-ec2-0.6.2/lib/chef/knife/ec2_server_create.rb:226:in `run'
from .../gems/chef-11.4.0/lib/chef/knife.rb:460:in `run_with_pretty_exceptions'
from .../gems/chef-11.4.0/lib/chef/knife.rb:173:in `run'
from .../gems/chef-11.4.0/lib/chef/application/knife.rb:123:in `run'
from .../gems/chef-11.4.0/bin/knife:25:in `<top (required)>'
from .../bin/knife:19:in `load'
from .../bin/knife:19:in `<main>'
from .../bin/ruby_noexec_wrapper:14:in `eval'
from .../bin/ruby_noexec_wrapper:14:in `<main>'
That cost me a day of chasing what I assumed were network issues, but the real problem was that I was specifying the region incorrectly. AWS' sole eastern datacenter is 'us-east-1', not 'us-east' as I had specified. Dumb error, but an even dumber error message. And obtuse error messages like this weren't uncommon.
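For the record, here is the corrected command, assembled piece by piece so that each flag is documented (same AMI, key and names as before):

```shell
# The knife ec2 create command with the region corrected to 'us-east-1',
# built up flag by flag so each one is annotated.
CMD="knife ec2 server create"
CMD="$CMD -I 'ami-b6089bdf'"     # AMI to boot the instance from
CMD="$CMD -x ubuntu"             # user to log in as
CMD="$CMD -i ~/.ssh/awskey.pem"  # key file for authentication
CMD="$CMD -r 'role[webserver]'"  # run list applied after bootstrap
CMD="$CMD -E stage"              # Chef environment
CMD="$CMD -f t1.micro"           # instance flavour
CMD="$CMD -G 'web-service'"      # EC2 security group
CMD="$CMD --region 'us-east-1'"  # valid region name, unlike 'us-east'
CMD="$CMD -Z 'us-east-1c'"       # availability zone
echo "$CMD"
```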

Finally I was able to repeatably create 'webserver' instances on EC2. Each instance was identical, had all the required software and configuration, and was ready for deployment once I had done a software update. (I want to keep software updates as a discrete, manual step, to allow a proper rollout of updates and to stop new software from breaking working systems.) So Chef is a legitimate solution to the problem of configuration management across multiple flavours of multiple server environments.

But not for me.

Chef is a very thick layer over the administration of a system. One has to learn Ruby and Chef. And in addition to Cookbooks, Recipes, Servers, Workstations, Nodes and Clients, there are Resources, Attributes, Providers, Notifications and Actions. For me, someone who has managed systems before, this layer gets in the way of doing things. I seem to have spent an inordinate amount of time coaxing Chef to do what I knew I could do (and have done) fairly easily with some shell and/or Perl scripts, albeit not with a large number of servers.
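To be concrete about what I mean by the shell-script alternative, this is roughly the scale of it, sketched with purely illustrative hostnames:

```shell
# The sort of ad-hoc alternative I have in mind: loop over hosts, push a
# config file and restart a service. Hostnames here are illustrative.
SERVERS="web1.example.com web2.example.com"
UPDATED=""
for host in $SERVERS; do
    # A real version would do something like:
    #   scp nginx.conf "$host:/etc/nginx/nginx.conf"
    #   ssh "$host" sudo service nginx restart
    echo "would update $host"
    UPDATED="$UPDATED $host"
done
```

No Cookbooks, no Server, no Workstation; it obviously wouldn't scale to hundreds of machines, but for a handful it is hard to beat for transparency.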

I prefer something less cumbersome. I think I will have a look at Ansible. It seems more my cup of tea.

Saturday, 9 March 2013

The Azure Dashboard

First steps

I started out by logging into the Azure dashboard (or Portal) and creating a Linux virtual machine. I used Chrome (25.0.1364.160) on OS X 10.8.2.

Creating an Ubuntu LTS VM
The process was simple. Instead of the Quick Create option, I selected the Gallery, which started a wizard. It is a 4 step process that allows me to specify items like the machine name, initial user name (the default is azureuser), password, DNS name, region and some Azure specific settings, after which the VM is created.

One issue I had was on the second screen. After selecting the option to upload my own SSH key, the upload dialog did not appear when I clicked on Browse For File.

Azure VM creation Wizard

It worked on Windows 8 using Chrome (25.0.1364.152 m), so definitely a bug there.

I completed the process and the VM was created. I connected to it using the DNS name I provided. The DNS entry was on the domain cloudapp.net and the SSH daemon was configured to run on a port other than 22.
$ ssh -p 12345 azureuser@mytest01.cloudapp.net
Login was successful and the Ubuntu Server was running as expected. I could sudo to root and make changes, install packages and do the many other things expected.

That's all well and good, but I have an affinity for the command line. I deleted the server and downloaded and installed the Windows Azure Command Line Tools. That was no problem at all given that a Mac installer was available. The result is a command, azure, that can be used to manage the account.

First step after installing was to register my computer. That's a simple 3 step process. First download a .publishsettings file, then import it into the tool and finally delete the downloaded file. Ensure that you are logged into Azure before starting the process.
$ azure account download
$ azure account import ~/Downloads/3-Month\ Free\ Trial-3-10-2013-credentials.publishsettings
$ rm ~/Downloads/3-Month\ Free\ Trial-3-10-2013-credentials.publishsettings 
That's it! Now I can manage the service from the command line: -
$ azure vm image list
List of available VM images

Next up, creating a VM from the command line.

Friday, 8 March 2013

Startup on Windows Azure

Microsoft were one of the sponsors at Confoo in Montreal this year. They were promoting the Azure cloud platform, and I managed to get snared into creating a 90 day trial account. Okay, okay, the RC Mini Cooper helped.

Image borrowed from http://www.mandogroup.com/microsoft-build-2012/

I went through the sign-up process via my Nexus 7 tablet and it all worked fine, other than the annoying pop-up dialogs. Those are a UX mistake on smaller tablets and on smartphones. Didn't stop me, and I doubt that this was a typical use case, but still...

After signing up I fired up my laptop, created a basic 'Hello Confoo' HTML page, init'ed a Git repository, set the remote to my Azure Website, committed and pushed. And voila, my page was live and an RC Mini Cooper was in my grubby paws.

This was really a simple process, and given that I have a new client project, as one does, I thought I would give it a shot on Azure.

The project is very client specific, so I won't discuss it much, but there are many components to it that will allow me to see how it all works on Azure. The components are the usual: -

I normally use MySQL or Postgres for the SQL database but I am toying with using SQL Server as it's part of the free trial.

Plus I want to be able to easily setup and maintain this system so I will also: -
  • Use Chef or Ansible for CM
  • Setup a package system for code deploys
This should be enough to give me a decent idea of what Azure can do compared to AWS. I will document the process and highlight any oddities on this blog. Must say that I don't expect any issues, and hopefully I won't need the full 90 days to get it all working...