Programmatic Ansible Middle Ground

 

Forward

About a year ago serversforhackers posted a great article on how to run Ansible programamatically  Since then Ansible has had a major release which introduced changes within the Python API.

Simulating The CLI

Not that long ago Jason  DeTiberus and I were talking about how to use Ansible from within other Python packages. One of the things he said was that it should be possible to reuse the command line code instead of the internal API if you hook into the right place. I finally had some time to take a look and it seems he’s right!

If you took a look at the 2.0 API you’ll see there is a lot more power handed over to you as the developer but with that comes a lot of code. Code that for many will be nearly copy/paste style code directly from command-line interface code. So when there is not a need for the extra power why not just reuse code that already exists?

import os  # Used for expanding paths

from ansible.cli.playbook import PlaybookCLI
from ansible.errors import AnsibleOptionsError, AnsibleParserError

def execute_playbook(playbook, hosts, args=[]):
    """
    :param playbook: Full path to the playbook to execute.
    :type playbook: str
    :param hosts: A host or hosts to target the playbook against.
    :type hosts: str, list, or tuple
    :param args: Other arguments to pass to the run.
    :type args: list
    :returns: The TaskQueueHandler for the run.
    :rtype: ansible.executor.task_queue_manager.TaskQueueManager.
    """
    # Set hosts args up right for the ansible parser. It likes to have trailing ,'s
    if isinstance(hosts, basestring):
        hosts = hosts + ','
    elif hasattr(hosts, '__iter__'):
        hosts = ','.join(hosts) + ','
    else:
        raise AnsibleParserError('Can not parse hosts of type {}'.format(
            type(hosts)))

    # Create the cli object
    cli_args = ['playbook'] + args + ['-i', hosts, os.path.realpath(playbook)]
    print('Executing: {}'.format(' '.join(cli_args)))
    cli = PlaybookCLI(cli_args)
    # Parse args and run it
    try:
        cli.parse()
        # Return the result:
        # 0: Success
        # 1: "Error"
        # 2: Host failed
        # 3: Unreachable
        # 4: Parser Error
        # 5: Options error
        return cli.run()
    except (AnsibleParserError, AnsibleOptionsError) as error:
        print('{}: {}'.format(type(error), error))
        raise error

 

Breaking It Down

The function starts off with some hosts parsing. This is not really needed but it does make the function easier to work with. On the command line Ansible likes to have a comma at the end of hosts passed in. This chunk of code makes sure that if a list or string is given for a host that the resulting host string is properly formatted.

    # Set hosts args up right for the ansible parser. It likes to have trailing ,'s
    if isinstance(hosts, basestring):
        hosts = hosts + ','
    elif hasattr(hosts, '__iter__'):
        hosts = ','.join(hosts) + ','
    else:
        raise AnsibleParserError('Can not parse hosts of type {}'.format(type(hosts)))

The Real Code

This chunk of code is what is actually calling Ansible. It creates the command line argument list, creates a PlaybookCLI instance, has it parsed, and then executes the playbook.

    # Create the cli object
    cli_args = ['playbook'] + args + ['-i', hosts, os.path.realpath(playbook)]
    print('Executing: {}'.format(' '.join(cli_args)))
    cli = PlaybookCLI(cli_args)
    # Parse args and run it
    try:
        cli.parse()
        # Return the result:
        # 0: Success
        # 1: "Error"
        # 2: Host failed
        # 3: Unreachable
        # 4: Parser Error
        # 5: Options error
        return cli.run()
    except (AnsibleParserError, AnsibleOptionsError) as error:
        print('{}: {}'.format(type(error), error))
        raise error

Using The Function

# Execute /tmp/test.yaml with 2 hosts
result = execute_playbook('/tmp/test.yaml', ['192.168.152.100', '192.168.152.101'])

# Execute /tmp/test.yaml with 1 host and add the -v flag
result = execute_playbook('/tmp/test.yaml', '192.168.152.101', ['-v'])

Intercepting The Output

One drawback of using the command-line interface code directly is that the output is expected to go to the user in the standard way. That is to say, it’s sent to the screen and colorized. This will probably be fine for some, but others may want to grab the output and use it in some form. While it is possible to change output through the configuration options it is also possible to monkey patch display and intercept the output for your own use cases. As an example, here is a Display class which forwards all output that is not meant for the screen only to our logging.info method.

# MONKEY PATCH to catch output. This must happen at the start of the code!
import logging

from ansible.utils.display import Display

# Set up our logging
logger = logging.getLogger('transport')
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.formatter = logging.Formatter('%(name)s - %(message)s')
logger.addHandler(handler)

class LogForward(Display):
    """
    Quick hack of a log forwarder
    """

    def display(self, msg, screen_only=None, *args, **kwargs):
        """
        Pass display data to the logger.
        :param msg: The message to log.
        :type msg: str
        :param args: All other non-keyword arguments.
        :type args: list
        :param kwargs: All other keyword arguments.
        :type kwargs: dict
        """
        # Ignore if it is screen only output
        if screen_only:
            return
        logging.getLogger('transport').info(msg)

    # Forward it all to display
    info = display
    warning = display
    error = display
    # Ignore debug
    debug = lambda s, *a, **k: True

# By simply setting display Ansible will slurp it in as the display instance
display = LogForward()
# END MONKEY PATCH. Add code after this line.

Putting It All Together

If you want to use it all together it should look like this:

# MONKEY PATCH to catch output. This must happen at the start of the code!
import logging

from ansible.utils.display import Display

# Set up our logging
logger = logging.getLogger('transport')
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.formatter = logging.Formatter('%(name)s - %(message)s')
logger.addHandler(handler)

class LogForward(Display):
    """
    Quick hack of a log forwarder
    """

    def display(self, msg, screen_only=None, *args, **kwargs):
        """
        Pass display data to the logger.
        :param msg: The message to log.
        :type msg: str
        :param args: All other non-keyword arguments.
        :type args: list
        :param kwargs: All other keyword arguments.
        :type kwargs: dict
        """
        # Ignore if it is screen only output
        if screen_only:
            return
        logging.getLogger('transport').info(msg)

    # Forward it all to display
    info = display
    warning = display
    error = display
    # Ignore debug
    debug = lambda s, *a, **k: True

# By simply setting display Ansible will slurp it in as the display instance
display = LogForward()
# END MONKEY PATCH. Add code after this line.

import os  # Used for expanding paths

from ansible.cli.playbook import PlaybookCLI
from ansible.errors import AnsibleOptionsError, AnsibleParserError

def execute_playbook(playbook, hosts, args=[]):
    """
    :param playbook: Full path to the playbook to execute.
    :type playbook: str
    :param hosts: A host or hosts to target the playbook against.
    :type hosts: str, list, or tuple
    :param args: Other arguments to pass to the run.
    :type args: list
    :returns: The TaskQueueHandler for the run.
    :rtype: ansible.executor.task_queue_manager.TaskQueueManager.
    """
    # Set hosts args up right for the ansible parser. It likes to have trailing ,'s
    if isinstance(hosts, basestring):
        hosts = hosts + ','
    elif hasattr(hosts, '__iter__'):
        hosts = ','.join(hosts) + ','
    else:
        raise AnsibleParserError('Can not parse hosts of type {}'.format(
            type(hosts)))

    # Create the cli object
    cli_args = ['playbook'] + args + ['-i', hosts, os.path.realpath(playbook)]
    logger.info('Executing: {}'.format(' '.join(cli_args)))
    cli = PlaybookCLI(cli_args)
    # Parse args and run it
    try:
        cli.parse()
        # Return the result:
        # 0: Success
        # 1: "Error"
        # 2: Host failed
        # 3: Unreachable
        # 4: Parser Error
        # 5: Options error
        return cli.run()
    except (AnsibleParserError, AnsibleOptionsError) as error:
        logger.error('{}: {}'.format(type(error), error))
        raise error

 

Pros and Cons

Of course nothing is without drawbacks. Here are some negatives with this method:

  • No direct access to “TaskQueueManager“
  • If the CLI changes the code must change
  • Monkey patching …. ewww

But the positives seem to be worth it so far:

  • You don’t have to deal with “TaskQueueManager“ and all of the construction code
  • The CLI doesn’t seem to change often
  • The same commands one would run on the CLI can easily be extrapolated and even run manually

etcdobj: A Minimal etcd Object Mapper for Python

I didn’t have a lot on my agenda Friday. I wanted to review and return emails, do some reading, get some minor hacking on etcdobj done (more on that…), eat more calories then normal in an attempt to screw with my metabolism (nailed it!), catch up with a few coworkers, play some video games, and, apparently, accidentally order an air purifier from Amazon. I succeed in all of it. But on to this etcdobj thing…

While working on Commissaire I started to feel a bit dirty over storing json documents in keys. It’s not uncommon, but it felt like it would be so much better if a document was broken into three layers:

  • Python: Classes/Objects
  • Transport: For saving/retreiving objects
  • etcd: A single or series of keys

By splitting up what normally is json data into a series of keys and two clients change overlapping parts of an object there won’t be a collision or require the client to fail, fetch, update, then try saving again. I searched the Internet for a library that would provide this and came up wanting. It seems that either simple keys/values or shoving json into a key is what most people stick with.

etcdobj is truly minimal. Partly because it’s new, partly because being small should make it easier to build upon or even bundle (it’s got a very permissive license), and partly because I’ve never written an ORM-like library before and don’t want to build to much on what could be a shaky foundation. That’s why I’m hoping this post will encourage some more eyes and help with the code.

Current Example

To create a representation of data a class must subclass EtcdObj and follow a few rules.

  1. __name__ must be provided as it will be the parent in the key path.
  2. Fields are class level variables and must be set to an instance that subclasses etcdobj.Field.
  3. The name of a field is the next layer in the key path and do not need to be the same as the class level variable.
from etcdobj import EtcdObj, fields

class Example(EtcdObj):
    __name__ = 'example' # The parent key
    # Fields all take a name that will be used as their key
    anint = fields.IntField('anint')
    astr = fields.StrField('astr')
    adict = fields.DictField('adict')

Creating a new object and saving it to etcd is pretty easy.

server = Server()

ex = Example(anint=1, astr="hello", adict={"hi": "there"})
ex.anint = 100  # update the value of anint
server.save(ex)
# Would save like so:
# /example/anint = "100"
# /example/astr = "hello"
# /example/adict/hi = "there"

As is retrieving the data.

new_ex = server.read(Example())
# new_ex.anint = 100
# new_ex.astr = "hello"
# new_ex.adict = {"hi": "there"}

Ideas

Some ideas for the future include:

  • Object watching (if data changes on the server it changes in the local instance)
  • Object to json structure
  • Deep DictField value casting/validation
  • Library level logging

Lend a Hand

The code base is currently around 416 lines of code including documentation and license header. If etcdobj sounds like something you’d use come take a look and help make it something better than I can produce all by my lonesome.

From Gevent to CherryPy

I’ve been working on a project for the last few months on GitHub called Commissaire along with some other smart folks. Without getting to deep into what the software is supposed to do, just know it’s a REST service which needs to handle some asynchronous tasks. When prototyping the web service I started utilizing gevent for it’s WSGI server and coroutines but, as it turns out, it didn’t end up being the best fit. This is not a post about gevent sucking because it doesn’t suck. gevent is pretty awesome but it’s not for every use case.

The Problem

One of the asynchronous tasks we do in Commissaire utilizes Ansible. We use the Ansible python API to handle part of host bootstrapping of a new host. Under the covers Ansible uses the multiprocessing module when executing it’s work. Specifically, this occurs when the TaskQueueManager starts its run. Under normal circumstances this is no problem but when gevent is in use it’s monkey patching ends up causing some problems. As noted in the post using monkey.patch_all(thread=False, socket=False) can be a solution. What this ends up doing is patching everything except thread and socket. But even this wasn’t enough for us to get past problems we were facing between multiprocessing, gevent, and Ansible. The closest patch we found was to also disable os, subprocess and a few other things making most of gevents great features unavailable. At this point it seemed pretty obvious gevent was not going to be a good fit.

Looking Elsewhere

There are no lack of options when looking for a Python web application server. Here are the requirements that I figured we would need:

Requirements

  • Importable as a library
  • Supports WSGI
  • Supports TLS
  • Active user base
  • Active development
  • Does not require a reverse proxy
  • Does not require greenlets
  • Supports Python 2 and 3

Based on the name of this post you already know we chose CherryPy. It hit all the requirements and came with a few added benefits. The plugin system which allows for calls to be published over an internal bus let’s us decouple our data saving internals (though couples us with CherryPy as it is doing the abstraction). The server is also already available in many Linux distributions at new enough versions. That’s a big boon hoping to have software easily installed via traditional means.

The runner up was Waitress. Unlike CherryPy which assumes you are developing within the CherryPy web framework, Waitress assumes WSGI. Unfortunately, Waitress requires a reverse proxy for TLS. If it had first class support for TLS we would have probably have picked it.

Going back to a more traditional threading server is definitely not as sexy as utilizing greenlets/coroutines but it has provided a consistent result when paired with a multiprocessing worker process and that is what matters.

Porting Time

Porting to a different library can be an annoying task and can feel like busy work. Porting can be even worse when you liked the library in use in the first place as I did (and still do!) with gevent.

Initial porting of main functionality from gevent to CherryPy took roughly four hours. After that, porting it took about another 6 hours to iron out some rough edges followed by updating unit tests. Really, the unit testing updates ended up being more work, in terms of time, than the actual functionality. A lot of that was our fault in how we use mock, but I digress. That’s really not much time!

So What

So far I’m happy with the results. The application functionality works as expected, the request/response speeds are more than acceptable, and CherryPy as a server has been fun to work with. Assuming no crazy corner cases don’t crop up I don’t see use moving off CherryPy anytime soon.

Flask-Track-Usage 1.1.0 Released

A few years ago the initial Flask-Track-Usage release was announced via my blog. At the time I thought I’d probably be the one user. I’m glad to say I was wrong! Today I’m happy to announce the release of Flask-Track-Usage 1.1.0 which sports a number enhancements and bug fixes.

Unfortunately, some changes are not backwards compatible. However, I believe the backwards incompatible changes make the overall experience better. If you would like to stick with the previous version of Flask-Track-Usage make sure to version pin in your requirements file/section:

flask_track_usage==1.0.1

Version 1.1.0 has made changes requested by the community as well as a few bug fixes. These include:

  • Addition of the X-Forwarded-For header as xforwardedfor in storage. Requested by jamylak.
  • Configurable GeoIP endpoint support. Requested by jamylak.
  • Migration from pymongo.Connection to pymongo.MongoClient.
  • Better SQLStorage metadata handling. Requested by gouthambs.
  • SQLStorage implementation redesign. Requested and implemented by gouthambs.
  • Updated documentation for 1.1.0.
  • Better unittesting.

I’d like to thank Gouthaman Balaraman who has been a huge help authoring the SQLStorage based on the SQLAlchemy ORM and providing feedback and support on Flask-Track-Usage design.

As always, please report bugs and feature requests on the GitHub Issues Page.

Quote

Red Hat Developer Blog: Feeling Developer Pain

The rest of this post describes our journey from initially trying to implement a simple solution to improve the day-to-day lives of developers, through the technical limitations we experienced along the way, and finally arrives at the empathy for our developers we’ve gained from that experience. We’ll wrap up with a note on how Red Hat Software Collections (announced as GA in September) would’ve simplified our development process.

Read the whole post Tim and I wrote over at the Red Hat Develope Blog.

More Editors/IDE’s

In my last post I talked about not being able to find a good Python editor/IDE other than vim. Nothing has really changed since then but there was another editor and IDE that was brought to my attention which I failed to point out. Let’s talk about them!

Editors/IDE’s

Emacs

While talking with Tim Bielawa I was reminded about Emacs since it’s what Tim uses. Emacs has such amazing integration that people sometimes say it’s an operating system itself! I’ve really only ever given Emacs two real shots at being my main editor. The first time was when I was just starting to get into programming and was reading a lot about what other programmers and system administrators used. At first it seemed a lot easier than vi but I ended up running with vi/vim since, at the very least, vi seemed to always be present on every Linux/BSD box I worked with by default. The second attempt was after I promised a few Emacs users at work that I’d give it another fair shot. I said I would use Emacs when I’d normally use vim for a full week knowing that it would force me to learn more about the editor. It wasn’t easy to stop trying to use mode editing but I was able to code without feeling too contained by the editor (Note: the contained feeling was me not knowing the editor well, not Emacs itself. It was really obvious that Emacs is a powerful editor). My biggest gripes from the week long test were around needing to install emacs on systems for use (which is sort of a silly one, I admit) and I felt some of the commands were way to long. I don’t remember the last one off the top of my head but I do remember that one command had a bunch of dashes and was frustrating every time I needed to use it. I think it’s about time I try it again and see if I can overcome my hurdles trying to use it efficiently. Who knows, maybe third time’s the charm!

Cloud9

Cloud9 is really cool. It’s not your grandfather’s IDE by any stretch. As the name implies the application exists out in the cloud (They use Openshift). I love the idea that I can start editing code with full IDE features from any machine I’m currently occupying. I have not tried using Cloud9 with a tablet (with keyboard of course!) yet but if that works then this thing would have rockstar status in my mind. I’ve used it for a few projects for both ease of use and to test out some of the features Cloud9 boasts. Being that it is on Openshift the IDE has it’s own platform letting you install tools and dependencies. There are also some collaboration features which I have not tried out yet mainly because I’m not sure how that works when you are using Github (IE: can they push code under your name or are they restricted?). I would use Cloud9 a lot more if it wasn’t for the Internet. While Cloud9 is pretty responsive in most cases but due to some point in the network connection between “here” and “there” things slow down or stop responding for a second or two. If this was something other than coding I probably wouldn’t care that much … but this is coding. Any hiccups while editing breaks concentration and slows down progress. One other issue I noticed was the lack of preferences across ones instances. Say you have two projects in Cloud9. Each project has it’s own IDE instance. If you want to set preferences for both IDE instances you will have to open each on it’s own and set them. You can not set any kind of global default preferences for all your instances. Hopefully they will add that functionality as I’m pretty positive I’m not the only person who finds that a bit weird. Over time Cloud9 and similar IDE’s will find ways to speed up and add better preference support but until then, in my mind, Cloud9 is straddling the line between full contender and really cool tech preview to keep my eye on.