Author: poultont

Component Pattern

This is part of a series of posts on puppet patterns. If you can spot any flaws or think of better solutions let me know!

This pattern is similar to the PuppetLabs best practice of roles and profiles. If you're not using roles and profiles it's a good idea to look into them; there's a post by Craig Dunn which explains them really well. The basic idea is reuse via abstraction. Software development has the acronym DRY, which stands for Don't Repeat Yourself (the antithesis being WET, Write Everything Twice, so smart!!), so we try to "dry out" our code by cutting down on repetition. We use profiles to abstract classes and roles to abstract profiles:

node webserver01 {
  include role::webserver
}

node webserver02 {
  include role::webserver
}

class role::webserver {
  include profile::tomcat
  # other profiles
}

class profile::tomcat {
  # include classes and/or add parameterised classes
}

Instead of defining all the classes we need, with all their parameters, on every node, we wrap related classes together into profiles and then group the profiles into roles. Yes, we could define both nodes in a single comma-separated list, but pretend they're in different files! This pattern also works well if you're using an ENC: each node only has one role, so there's a lot less configuration and management in the ENC.
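
To make that a little more concrete, a profile might wrap a class and its parameters like this (a rough sketch: the java and tomcat classes and their parameters are purely illustrative, not from any particular module):

class profile::tomcat {
  # illustrative only: use whatever java/tomcat modules and parameters you have
  class { 'java':
    distribution => 'jdk',
  }

  class { 'tomcat':
    catalina_home => '/opt/tomcat',
  }
}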

Roles and profiles look just like normal modules:

puppet
  - modules
    - role
      - manifests
        - webserver.pp
    - profile
      - manifests
        - tomcat.pp

I have found one downside though. Even in a fairly modest environment structure, by the time you've created roles to describe every different type of machine across all of your environments, the role names either become too long and cumbersome, or too vague and unclear.

The component pattern is designed to be an abstraction on top of profiles that focuses on the deployed components on the nodes instead of the role of the node as a whole:

node webserver01, webserver02 {
  include component::company_webapp
}

node apiserver01, apiserver02 {
  include component::search_api
  include component::rest_api
}

node dbserver01 {
  include component::main_db
}

node testserver01 {
  include component::company_webapp
  include component::search_api
  include component::rest_api
  include component::main_db
}

class component::company_webapp {
  include profile::tomcat
  include profile::company_webapp
  # other profiles
}

class component::search_api {
  include profile::tomcat
  include profile::search_api
  # other profiles
}

class component::main_db {
  include profile::mysql
  include profile::main_db
  # other profiles
}

Each component has everything it needs to run in isolation, but you can also put all of your components on one node if you want to, even if they share resources. For example, the webapp and both APIs are probably all going to run on the same JVM behind the same instance of Tomcat, but because of the way we structure our components, profiles and classes (each profile is pulled in with include, which only ever declares a class once no matter how many components ask for it), any combination should be possible (assuming your apps play nicely together).

Yes, it’s a bit more code in the node definitions and there is more potential for repetition, but I think the node definitions are a lot more transparent and the component classes are reusable all over the place allowing you to structure your environments however you want.

There are some wins for ENCs too. You can create new environments and layouts, or add components to existing nodes, without having to create new roles. You can also give more context to your ENC because you're now defining nodes by the components running on them rather than by role. I came up with this pattern while working out how to structure a release orchestration tool I'm tinkering with: I can define components in the tool's database, assign them to nodes in a layout in any combination, and then easily translate that layout into ENC definitions.
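
For example, the ENC output for the test server above might look something like this (a sketch using the standard ENC YAML format):

---
classes:
  - component::company_webapp
  - component::search_api
  - component::rest_api
  - component::main_db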

So component classes only include profiles and have everything they need to run your component. However they are also designed to run side by side with any other components allowing any combination you want.


Virtual Parameter Pattern

This is part of a series of posts on puppet patterns. If you can spot any flaws or think of better solutions let me know!

The aim of this pattern is to reduce coupling between modules, or rather to flip the coupling in the right direction. Imagine you have a module that can collate a number of parameters from other modules. For example, mcollective has facts (much like Facter), and any module can add a fact to mcollective's list so that it can be queried by a client (the version of the code you've deployed, for example).

The first way to solve this is to have an "mcollective::facts" class which has all the facts preset as variables. This isn't good for two obvious reasons: the list of facts is preset, and the mcollective module now contains references to other modules, whether through references to variables in specific classes, calls to functions in another module, or knowledge of how that module works (hiera keys etc.).

class mcollective::facts {
  $fact_one   = $other_module::foo::var
  $fact_two   = function_from_another_module('var1')
  $fact_three = hiera('module_specific_key')

  file { 'mcollective facts':
    # write facts
  }
}

The second way is to have an "mcollective::fact" defined type that updates a file line (or does whatever needs doing to set the parameter). The classes in other modules that want to set facts can now just declare an instance of the defined type and pass the key and value in as parameters. This is good because the coupling is now reversed: the mcollective module isn't dependent on all the modules that want to use it. But it's now almost too tightly coupled the other way. You can't just have random file_line (or concat, or whatever) resources floating around the catalogue; there are dependencies that need to be defined. For example, the file has to exist before you can set a line in it, and the service might need to be restarted once you've set a new fact. So you now have to define before/require and notify/subscribe relationships to mcollective resources inside the classes that are setting facts. Your classes have to know how the mcollective module works, which means you have to set it up properly, and it also means that if anything changes in the mcollective module it could potentially break all the modules using it.

define mcollective::fact (
  $fact_value,
) {
  file_line { "mcollective fact $name":
    # write fact
  }
}

class foo {
  # class stuff

  mcollective::fact { 'foo-fact':
    fact_value => 'baz',
    require    => File['mcollective facts'],
    notify     => Service['mcollective'],
  }
}

The solution is to use virtual resources. We keep the "mcollective::fact" defined type, but we make the file_line resource virtual. This way we can declare facts in whichever classes we want, but all the logic stays encapsulated inside the mcollective module. A resource collector can then realise all of the virtual resources and handle the dependencies, and a tag makes sure we only collect the file_line resources we actually want.

define mcollective::fact (
  $fact_value,
) {
  @file_line { "mcollective fact $name":
    # write fact
    tag => 'mcollective_fact'
  }
}

class mcollective::facts {
  # setup facts file and directories etc

  # the facts file must exist before any fact lines are managed,
  # and changing a fact notifies the mcollective service
  File['mcollective facts'] -> File_line <| tag == 'mcollective_fact' |> ~> Service['mcollective']
}

class foo {
  # class stuff

  mcollective::fact { 'foo-fact':
    fact_value => 'baz',
  }
}

Now the foo class only needs to know about one resource, and the mcollective module doesn't need to know how it's being used (if at all). If we want to change how mcollective manages its facts it doesn't matter: we can change everything behind the scenes without having to touch any other modules.

This pattern can be used in many other situations; it doesn't just have to be facts, files and file_lines.
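
For instance, a hypothetical monitoring module (none of these classes, resources or paths are real, it's just a sketch) could collect virtual check definitions from application modules in exactly the same way:

define monitoring::check (
  $command,
) {
  # virtual until the monitoring class collects it
  @file_line { "monitoring check ${name}":
    path => '/etc/monitoring/checks.conf',
    line => "${name} ${command}",
    tag  => 'monitoring_check',
  }
}

class monitoring {
  # the checks file and agent service would be managed here, as in mcollective::facts above

  # manage the checks file, collect every declared check, restart the agent on change
  File['/etc/monitoring/checks.conf'] -> File_line <| tag == 'monitoring_check' |> ~> Service['monitoring-agent']
}

class myapp {
  monitoring::check { 'myapp-health':
    command => '/usr/local/bin/check_myapp',
  }
}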

Easy folder encryption in Mac OS X

I have some SSH keys sitting on my MacBook Pro and I decided that it’s probably a good idea to store them in such a way that if someone ever got access to my machine they wouldn’t be able to copy or use them. This solution doesn’t just apply to ssh keys, it’s for any folder that you want to encrypt, but still have easy access to. The folder could even be a git project that you want to keep secure when you’re not working on it.

I'm running Yosemite, which comes with Apple's FileVault disk encryption, but I've had a couple of issues with it so I'm not using it. Even if you are, this method lets you move the folder around separately (USB, Dropbox, etc.) and protects you if you forget to lock your screen.

The idea is to have an encrypted disk image (dmg) sitting on your file system somewhere, a command line alias to mount the image to a specific directory with one command (with autocomplete and everything), and another alias to unmount (and therefore re-secure) the image when you're done.

The setup is twofold:

1. Create an encrypted image of the folder. You can use Disk Utility to create a new image from an existing folder and then choose your encryption option, or you can create a new blank encrypted image and add files to it afterwards. Either way the setup is pretty intuitive, and if not there's always Google (there's also a command line sketch at the end of this post).

2. Add some bash aliases to make the process seamless. I'm a terminal kind of guy (there's always one open), so for me that's the easiest way to shortcut the mounting process. My .bash_profile includes the following (I called the image "vault"):

vault_root=~/Documents/some/folder/vault
alias mount-vault='hdiutil attach ${vault_root}.dmg -mountpoint ${vault_root}'
alias unmount-vault='hdiutil detach ${vault_root}'

And that's it! Running "mount-vault" will prompt you for the password to decrypt the image and then mount it as a folder with the same name, right next to the image in the directory structure. Once I'm done I run "unmount-vault" and it's all locked up again. Simples.

N.B. One caveat is that the "detach" action of hdiutil only accepts a mountpoint (instead of a device) on OS X 10.4 (Tiger) and above. I doubt that's really an issue for anyone, but I thought I'd mention it.
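
For reference, step 1 can also be done from the terminal instead of Disk Utility. Something along these lines should work (the paths, size and image name here are just examples):

# create an encrypted image from an existing folder (you'll be prompted for a passphrase)
hdiutil create -encryption AES-256 -srcfolder ~/Documents/ssh-keys ~/Documents/some/folder/vault.dmg

# or create a blank 100MB encrypted image to copy files into later
hdiutil create -encryption AES-256 -size 100m -fs HFS+ -volname vault ~/Documents/some/folder/vault.dmg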

Programmatically Set Jenkins Build Variables

The EnvInject Plugin is extremely useful when it comes to injecting build variables into your Jenkins jobs; however, it's not really geared up for dynamic values. The plugin can take a list of specified properties and values, or it can take a path to a properties file and read in the values. But what if you want to use a number of variables to dynamically construct another? For example, take the version of the code from a properties file in git (1.2), add the Jenkins build number (345), and then add the branch name (develop), resulting in:

FULL_VERSION=1.2.345-develop

How to inject build variables isn't immediately obvious. Jenkins exposes them all over the place like a flasher and his junk, but just like a flasher, Jenkins protects its variables from tampering. I did however find a way to do it:

It basically revolves around the EnvironmentContributingAction interface. Whenever Jenkins processes the "environment" it rattles through all the "Action" classes associated with that job, and if a class implements the EnvironmentContributingAction interface, Jenkins calls the buildEnvVars method on that class and helpfully passes in the map of build variables as an argument. This map is special, however, in that it is the actual map, not just the copy that Jenkins normally exposes. Adding to this map object will add real build variables to your job, to be used later on whenever you need them.

And so to the implementation! All I’ve done is create my own class that implements EnvironmentContributingAction which takes a key and a value in its constructor. When the buildEnvVars method is called it simply adds the key-value pair to the variable map.

The Job class just provides a wrapper to create the instance, add it to the job actions, and then call build.getEnvironment() so that the new variable is actually injected.

import hudson.EnvVars
import hudson.model.*

def build_number = Job.getVariable("BUILD_NUMBER")
def branch_name  = Job.getVariable("GIT_BRANCH")

def build_branch = "${build_number}-${branch_name}"

Job.setVariable("BUILD_BRANCH", build_branch)

class Job {

    static def getVariable(String key) {
        def config = new HashMap()
        // the build that is executing this system Groovy script
        def thr = Thread.currentThread()
        def build = thr?.executable
        // copy the environment variables of the latest build of this job
        def envVarsMap = build.parent.builds[0].properties.get("envVars")
        config.putAll(envVarsMap)
        return config.get(key)
    }

    static def setVariable(String key, String value) {
        def build = Thread.currentThread().executable
        // attach the action so Jenkins will ask it to contribute variables,
        // then rebuild the environment so the new value is actually injected
        def action = new VariableInjectionAction(key, value)
        build.addAction(action)
        build.getEnvironment()
    }
}

class VariableInjectionAction implements EnvironmentContributingAction {

    private String key
    private String value

    public VariableInjectionAction(String key, String value) {
        this.key = key
        this.value = value
    }

    public void buildEnvVars(AbstractBuild build, EnvVars envVars) {

        if (envVars != null && key != null && value != null) {
            envVars.put(key, value);
        }
    }

    public String getDisplayName() {
        return "VariableInjectionAction";
    }

    public String getIconFileName() {
        return null;
    }

    public String getUrlName() {
        return null;
    }
}

This is a simple example that gets the build number and branch name of the current job, concatenates them, and sets the result as the value of a new variable called "BUILD_BRANCH".
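
As a usage sketch, the same classes could build the FULL_VERSION example from the top of this post (POM_VERSION is hypothetical here; you'd need to expose the version from your properties file yourself, e.g. with EnvInject):

def version = Job.getVariable("POM_VERSION")   // e.g. "1.2", hypothetical variable
def build   = Job.getVariable("BUILD_NUMBER")  // e.g. "345"
def branch  = Job.getVariable("GIT_BRANCH")    // e.g. "develop"

Job.setVariable("FULL_VERSION", "${version}.${build}-${branch}")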

The ‘Job’ and ‘VariableInjectionAction’ classes provide all the functionality; just copy them and use them as you see fit. I hope this helps someone.

Unit Testing Puppet with rspec-puppet: Mocking Classes and Functions

I've created a sample repo on GitHub that has a mini tutorial and examples of how to mock functions, classes, and defined types. This way you can isolate the specific thing that you want to test (a class, a defined type, a custom function) without inheriting complexity from its dependencies.

It’s still in its early stages so bugs and weird things are likely to crop up, but it’s a start!

I won't repeat anything here; this is just a placeholder in case anyone finds this page first.

Enjoy

hiera-eyaml: Per-value encrypted backend for Hiera (and Puppet)

This post is all about the hiera-eyaml GitHub project, which I created to provide per-value encryption of sensitive data in a YAML file that Hiera can then decrypt. I started this post mainly for the comments section, so that people can leave questions, comments, suggestions, abuse, etc., but also to clean up the readme file so that it's a bit more concise.

If you’ve started using hiera-eyaml (I hope it helps) and you’re having problems (sorry) please add an issue on GitHub to keep them all in the same place, and make it easier for anyone who may be looking.

Hiera-eyaml

The ‘inspiration’ for this little project came from 2 sources:

  1. An existing hiera encryption solution hiera-gpg
  2. This post on /dev/random that I found whilst looking for encryption options

Most of the reasons to create an alternative backend for Hiera are summed up in the /dev/random post, but the main one is the ability to encrypt each value individually and not the whole file. This provides a bit more transparency and allows those configuring Puppet to know where each value is defined. If something isn’t working within a hierarchical data source it’s nice to be able to see at a glance where each value is defined and where it should be overridden or added to.
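
As a quick sketch of what that looks like in practice (the keys and the ciphertext here are made up), only the sensitive values are wrapped in ENC[...] blocks and everything else stays readable:

---
smtp_server: smtp.example.com
db_username: widgets
db_password: ENC[PKCS7,MIIBeQYJKoZIhvcNAQcDoIIBajCCAWYCAQAxggEh...]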

I also ran into problems using hiera-gpg (not actually hiera-gpg's fault: ruby-gpgme, a project it uses internally, didn't seem to recognise my keychain).

Improvements

It’s not exactly the most compact syntax ever so I’ll try and find a way of slimming it down a bit. I did try using Zlib but that didn’t really help much.

eYaml doesn’t support keys with a passphrase yet, but as Craig Dunn explains in his post about hiera-gpg “it would mean having the password stored in /etc/puppet/hiera.yaml as plaintext anyway, so I don’t see that as adding much in the way of security.”

GPG seems to have this secure "feel" to it, so there might be a better encryption method to use than a pair of PEM keys.

Apologies for the state of the blog, I’ll sort out a better theme and CSS when I get a chance.

Update 2013-07-19:

hiera-eyaml is now up on rubygems so download and install with ease!

Thanks

Thank you to Craig Dunn for his work on hiera-gpg and the corresponding blog post mentioned above; it definitely made writing this easier, having his code as a reference.

Thank you to Geoff Meakin and Simon Hildrew for their awesome and continuing contributions

Script a MarkLogic REST Server in xquery

This is just a quick one, mainly in case I forget, but if you happen to be trying to script the creation of a REST server on top of a MarkLogic database using the Admin module, you might be wondering how to turn the HTTP server they set up in the examples into an actual REST server. The trick is to have a poke around the MarkLogic files until you stumble upon /MarkLogic/Modules/MarkLogic/rest-api/lib/bootstrap-util.xqy. This little bit of XQuery contains a function, util:bootstrap-rest-server, which basically runs the standard HTTP server setup but then has seven magic lines:

let $appserver-id := admin:appserver-get-id($config, $groupid, $rest-appserver-name)
let $config := admin:appserver-set-url-rewriter($config, $appserver-id, "/MarkLogic/rest-api/rewriter.xqy")
let $config := admin:appserver-set-error-handler($config,$appserver-id, "/MarkLogic/rest-api/error-handler.xqy")
let $config := admin:appserver-set-authentication($config, $appserver-id, "digest")
let $config := admin:appserver-set-log-errors($config, $appserver-id, false())
let $config := admin:appserver-set-rewrite-resolves-globally($config, $appserver-id, true())
let $config := admin:database-set-directory-creation($config, xdmp:database( $rest-db-modules-name ), "automatic")

Your creation script could potentially just call this function directly (although I haven't tried it), but we wanted to create a REST server using the file system as our module store, so I just modified our script to this:

xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

let $GroupName := "Default"
let $ServiceName := "MyRestServer"
let $DocumentDir := "MyRestServerRoot/"
let $ServicePort := 8003
let $Database := "MyDatabase"

let $config := admin:get-configuration()

let $groupid := admin:group-get-id($config, $GroupName)

let $config := admin:http-server-create(
    $config,
    $groupid,
    $ServiceName,
    $DocumentDir,
    $ServicePort,
    "file-system",
    admin:database-get-id($config, $Database))
 
let $appserver-id := admin:appserver-get-id($config, $groupid, $ServiceName)
let $config := admin:appserver-set-url-rewriter($config, $appserver-id, "/MarkLogic/rest-api/rewriter.xqy")
let $config := admin:appserver-set-error-handler($config,$appserver-id, "/MarkLogic/rest-api/error-handler.xqy")
let $config := admin:appserver-set-authentication($config, $appserver-id, "digest")
let $config := admin:appserver-set-log-errors($config, $appserver-id, false())
let $config := admin:appserver-set-rewrite-resolves-globally($config, $appserver-id, true())

return admin:save-configuration($config);

You may need to tweak the script for your specific scenario, but at least now you know where to look.