Programmatic Ansible Middle Ground

 

Forward

About a year ago serversforhackers posted a great article on how to run Ansible programamatically  Since then Ansible has had a major release which introduced changes within the Python API.

Simulating The CLI

Not that long ago Jason  DeTiberus and I were talking about how to use Ansible from within other Python packages. One of the things he said was that it should be possible to reuse the command line code instead of the internal API if you hook into the right place. I finally had some time to take a look and it seems he’s right!

If you took a look at the 2.0 API you’ll see there is a lot more power handed over to you as the developer but with that comes a lot of code. Code that for many will be nearly copy/paste style code directly from command-line interface code. So when there is not a need for the extra power why not just reuse code that already exists?

import os  # Used for expanding paths

from ansible.cli.playbook import PlaybookCLI
from ansible.errors import AnsibleOptionsError, AnsibleParserError

def execute_playbook(playbook, hosts, args=[]):
    """
    :param playbook: Full path to the playbook to execute.
    :type playbook: str
    :param hosts: A host or hosts to target the playbook against.
    :type hosts: str, list, or tuple
    :param args: Other arguments to pass to the run.
    :type args: list
    :returns: The TaskQueueHandler for the run.
    :rtype: ansible.executor.task_queue_manager.TaskQueueManager.
    """
    # Set hosts args up right for the ansible parser. It likes to have trailing ,'s
    if isinstance(hosts, basestring):
        hosts = hosts + ','
    elif hasattr(hosts, '__iter__'):
        hosts = ','.join(hosts) + ','
    else:
        raise AnsibleParserError('Can not parse hosts of type {}'.format(
            type(hosts)))

    # Create the cli object
    cli_args = ['playbook'] + args + ['-i', hosts, os.path.realpath(playbook)]
    print('Executing: {}'.format(' '.join(cli_args)))
    cli = PlaybookCLI(cli_args)
    # Parse args and run it
    try:
        cli.parse()
        # Return the result:
        # 0: Success
        # 1: "Error"
        # 2: Host failed
        # 3: Unreachable
        # 4: Parser Error
        # 5: Options error
        return cli.run()
    except (AnsibleParserError, AnsibleOptionsError) as error:
        print('{}: {}'.format(type(error), error))
        raise error

 

Breaking It Down

The function starts off with some hosts parsing. This is not really needed but it does make the function easier to work with. On the command line Ansible likes to have a comma at the end of hosts passed in. This chunk of code makes sure that if a list or string is given for a host that the resulting host string is properly formatted.

    # Set hosts args up right for the ansible parser. It likes to have trailing ,'s
    if isinstance(hosts, basestring):
        hosts = hosts + ','
    elif hasattr(hosts, '__iter__'):
        hosts = ','.join(hosts) + ','
    else:
        raise AnsibleParserError('Can not parse hosts of type {}'.format(type(hosts)))

The Real Code

This chunk of code is what is actually calling Ansible. It creates the command line argument list, creates a PlaybookCLI instance, has it parsed, and then executes the playbook.

    # Create the cli object
    cli_args = ['playbook'] + args + ['-i', hosts, os.path.realpath(playbook)]
    print('Executing: {}'.format(' '.join(cli_args)))
    cli = PlaybookCLI(cli_args)
    # Parse args and run it
    try:
        cli.parse()
        # Return the result:
        # 0: Success
        # 1: "Error"
        # 2: Host failed
        # 3: Unreachable
        # 4: Parser Error
        # 5: Options error
        return cli.run()
    except (AnsibleParserError, AnsibleOptionsError) as error:
        print('{}: {}'.format(type(error), error))
        raise error

Using The Function

# Execute /tmp/test.yaml with 2 hosts
result = execute_playbook('/tmp/test.yaml', ['192.168.152.100', '192.168.152.101'])

# Execute /tmp/test.yaml with 1 host and add the -v flag
result = execute_playbook('/tmp/test.yaml', '192.168.152.101', ['-v'])

Intercepting The Output

One drawback of using the command-line interface code directly is that the output is expected to go to the user in the standard way. That is to say, it’s sent to the screen and colorized. This will probably be fine for some, but others may want to grab the output and use it in some form. While it is possible to change output through the configuration options it is also possible to monkey patch display and intercept the output for your own use cases. As an example, here is a Display class which forwards all output that is not meant for the screen only to our logging.info method.

# MONKEY PATCH to catch output. This must happen at the start of the code!
import logging

from ansible.utils.display import Display

# Set up our logging
logger = logging.getLogger('transport')
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.formatter = logging.Formatter('%(name)s - %(message)s')
logger.addHandler(handler)

class LogForward(Display):
    """
    Quick hack of a log forwarder
    """

    def display(self, msg, screen_only=None, *args, **kwargs):
        """
        Pass display data to the logger.
        :param msg: The message to log.
        :type msg: str
        :param args: All other non-keyword arguments.
        :type args: list
        :param kwargs: All other keyword arguments.
        :type kwargs: dict
        """
        # Ignore if it is screen only output
        if screen_only:
            return
        logging.getLogger('transport').info(msg)

    # Forward it all to display
    info = display
    warning = display
    error = display
    # Ignore debug
    debug = lambda s, *a, **k: True

# By simply setting display Ansible will slurp it in as the display instance
display = LogForward()
# END MONKEY PATCH. Add code after this line.

Putting It All Together

If you want to use it all together it should look like this:

# MONKEY PATCH to catch output. This must happen at the start of the code!
import logging

from ansible.utils.display import Display

# Set up our logging
logger = logging.getLogger('transport')
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.formatter = logging.Formatter('%(name)s - %(message)s')
logger.addHandler(handler)

class LogForward(Display):
    """
    Quick hack of a log forwarder
    """

    def display(self, msg, screen_only=None, *args, **kwargs):
        """
        Pass display data to the logger.
        :param msg: The message to log.
        :type msg: str
        :param args: All other non-keyword arguments.
        :type args: list
        :param kwargs: All other keyword arguments.
        :type kwargs: dict
        """
        # Ignore if it is screen only output
        if screen_only:
            return
        logging.getLogger('transport').info(msg)

    # Forward it all to display
    info = display
    warning = display
    error = display
    # Ignore debug
    debug = lambda s, *a, **k: True

# By simply setting display Ansible will slurp it in as the display instance
display = LogForward()
# END MONKEY PATCH. Add code after this line.

import os  # Used for expanding paths

from ansible.cli.playbook import PlaybookCLI
from ansible.errors import AnsibleOptionsError, AnsibleParserError

def execute_playbook(playbook, hosts, args=[]):
    """
    :param playbook: Full path to the playbook to execute.
    :type playbook: str
    :param hosts: A host or hosts to target the playbook against.
    :type hosts: str, list, or tuple
    :param args: Other arguments to pass to the run.
    :type args: list
    :returns: The TaskQueueHandler for the run.
    :rtype: ansible.executor.task_queue_manager.TaskQueueManager.
    """
    # Set hosts args up right for the ansible parser. It likes to have trailing ,'s
    if isinstance(hosts, basestring):
        hosts = hosts + ','
    elif hasattr(hosts, '__iter__'):
        hosts = ','.join(hosts) + ','
    else:
        raise AnsibleParserError('Can not parse hosts of type {}'.format(
            type(hosts)))

    # Create the cli object
    cli_args = ['playbook'] + args + ['-i', hosts, os.path.realpath(playbook)]
    logger.info('Executing: {}'.format(' '.join(cli_args)))
    cli = PlaybookCLI(cli_args)
    # Parse args and run it
    try:
        cli.parse()
        # Return the result:
        # 0: Success
        # 1: "Error"
        # 2: Host failed
        # 3: Unreachable
        # 4: Parser Error
        # 5: Options error
        return cli.run()
    except (AnsibleParserError, AnsibleOptionsError) as error:
        logger.error('{}: {}'.format(type(error), error))
        raise error

 

Pros and Cons

Of course nothing is without drawbacks. Here are some negatives with this method:

  • No direct access to “TaskQueueManager“
  • If the CLI changes the code must change
  • Monkey patching …. ewww

But the positives seem to be worth it so far:

  • You don’t have to deal with “TaskQueueManager“ and all of the construction code
  • The CLI doesn’t seem to change often
  • The same commands one would run on the CLI can easily be extrapolated and even run manually
Advertisement

Brave: Because It’s The Best Middle Available

I’m no different than a large portion of web users who are looking to read content and stay safe: I use one or more ad blockers. Is this stealing website content? No. I’d be happy to be served ads is they didn’t have such a bad track record in terms of security and privacy.

Privacy

Many people are still ignorant to what information they are giving up to online advertisers. In a 2013 post by VentureBeat they noted:

Advertisers and the tracking companies they employee are able to gather all sorts of information about you, such as the websites you frequent and what kind of products you’re interested — and even some even scarier stuff like political views, health problems, and personal finances.

Over time the picture you provide to these private companies becomes clearer and clearer. Of course, you may not care if companies know you like chocolate chip cookies but you may not want them to know more personal things or, worse, be able to extrapolate things about you that you haven’t even unknowingly shared … not to mention government use for predictive modeling.

Malvertising

Privacy is important but this is the bigger problem in my opinion. Malvertising has proven a successful vector to infect users machines with malware. If you are interested in a time line of large malvertising events GeoEdge has a nice post. A quick summary of heavy hitters who have inadvertently exposed their readers to threats include The New York Times, eBay, LA Times, Spotify, Answers.com, Huffinton Post, MSN, BBC, AOL and NFL. Of course, there are many more but that list should be enough to get anyone’s attention.

Options

So what are valid options to protect personal privacy and security on the web?

Ignore Internet Content

This is the best option but it’s very unlikely. Everyone loses with this as content providers get nothing from their ads and readers don’t get any news.

Go To “Safe” Sources

Another good option, but about as unlikely as the first. It takes work to find out the sites that are not tracking or injecting third party advertising. It also assumes safe sources are always safe but the web is a constantly evolving place and a site may be totally different upon two visits.

Run an Ad Blocker

This is the most common solution today. It blocks as many ads and third party cookies as it can and generally keeps users safe. It’s not a perfect solution as the content providers miss getting any ad clicks/impressions but the reader gets a much safer (and faster) experience.

Some sites actively block ad blockers. When I come across these sites who nicely ask me to unblock their ads I head over to google and find another source for the same story. I don’t think I’m alone in doing that.

Use (something like) Brave That Shares Ad Revenue

This is a newer thing and the actual reason  I wanted to write this post. Brave seems to be a good middle ground which attempts to keep users safe while still providing money back to content providers. In some ways Brave is acting like a arbitrator to let everyone get something out of the deal. Users get content, creators get money. Yeah Brave (and users) get a cut to but that’s not so bad (though I’d be fine with not getting a cut at all as a user).

Here is the flows for revenue from Brave:

brave_infographic_large

852764

Unfortunately, the NAA  didn’t quite grasp the above idea and has called to Brave to stop. Surprise, at least one of the companies who signed the letter has put users at risk via malvertising on multiple occasions.

Brave has posted a rebuttal in an attempt to help NAA understand the business model and why it’s not illegal. Hopefully logic will triumph over emotion and posturing.

My Hope

My hope is that users will jump on to the idea that Brave provides (whether they use Brave or not) and that the NAA will understand that it is a business model where everyone wins, even their readers.

From Gevent to CherryPy

I’ve been working on a project for the last few months on GitHub called Commissaire along with some other smart folks. Without getting to deep into what the software is supposed to do, just know it’s a REST service which needs to handle some asynchronous tasks. When prototyping the web service I started utilizing gevent for it’s WSGI server and coroutines but, as it turns out, it didn’t end up being the best fit. This is not a post about gevent sucking because it doesn’t suck. gevent is pretty awesome but it’s not for every use case.

The Problem

One of the asynchronous tasks we do in Commissaire utilizes Ansible. We use the Ansible python API to handle part of host bootstrapping of a new host. Under the covers Ansible uses the multiprocessing module when executing it’s work. Specifically, this occurs when the TaskQueueManager starts its run. Under normal circumstances this is no problem but when gevent is in use it’s monkey patching ends up causing some problems. As noted in the post using monkey.patch_all(thread=False, socket=False) can be a solution. What this ends up doing is patching everything except thread and socket. But even this wasn’t enough for us to get past problems we were facing between multiprocessing, gevent, and Ansible. The closest patch we found was to also disable os, subprocess and a few other things making most of gevents great features unavailable. At this point it seemed pretty obvious gevent was not going to be a good fit.

Looking Elsewhere

There are no lack of options when looking for a Python web application server. Here are the requirements that I figured we would need:

Requirements

  • Importable as a library
  • Supports WSGI
  • Supports TLS
  • Active user base
  • Active development
  • Does not require a reverse proxy
  • Does not require greenlets
  • Supports Python 2 and 3

Based on the name of this post you already know we chose CherryPy. It hit all the requirements and came with a few added benefits. The plugin system which allows for calls to be published over an internal bus let’s us decouple our data saving internals (though couples us with CherryPy as it is doing the abstraction). The server is also already available in many Linux distributions at new enough versions. That’s a big boon hoping to have software easily installed via traditional means.

The runner up was Waitress. Unlike CherryPy which assumes you are developing within the CherryPy web framework, Waitress assumes WSGI. Unfortunately, Waitress requires a reverse proxy for TLS. If it had first class support for TLS we would have probably have picked it.

Going back to a more traditional threading server is definitely not as sexy as utilizing greenlets/coroutines but it has provided a consistent result when paired with a multiprocessing worker process and that is what matters.

Porting Time

Porting to a different library can be an annoying task and can feel like busy work. Porting can be even worse when you liked the library in use in the first place as I did (and still do!) with gevent.

Initial porting of main functionality from gevent to CherryPy took roughly four hours. After that, porting it took about another 6 hours to iron out some rough edges followed by updating unit tests. Really, the unit testing updates ended up being more work, in terms of time, than the actual functionality. A lot of that was our fault in how we use mock, but I digress. That’s really not much time!

So What

So far I’m happy with the results. The application functionality works as expected, the request/response speeds are more than acceptable, and CherryPy as a server has been fun to work with. Assuming no crazy corner cases don’t crop up I don’t see use moving off CherryPy anytime soon.

More Editors/IDE’s

In my last post I talked about not being able to find a good Python editor/IDE other than vim. Nothing has really changed since then but there was another editor and IDE that was brought to my attention which I failed to point out. Let’s talk about them!

Editors/IDE’s

Emacs

While talking with Tim Bielawa I was reminded about Emacs since it’s what Tim uses. Emacs has such amazing integration that people sometimes say it’s an operating system itself! I’ve really only ever given Emacs two real shots at being my main editor. The first time was when I was just starting to get into programming and was reading a lot about what other programmers and system administrators used. At first it seemed a lot easier than vi but I ended up running with vi/vim since, at the very least, vi seemed to always be present on every Linux/BSD box I worked with by default. The second attempt was after I promised a few Emacs users at work that I’d give it another fair shot. I said I would use Emacs when I’d normally use vim for a full week knowing that it would force me to learn more about the editor. It wasn’t easy to stop trying to use mode editing but I was able to code without feeling too contained by the editor (Note: the contained feeling was me not knowing the editor well, not Emacs itself. It was really obvious that Emacs is a powerful editor). My biggest gripes from the week long test were around needing to install emacs on systems for use (which is sort of a silly one, I admit) and I felt some of the commands were way to long. I don’t remember the last one off the top of my head but I do remember that one command had a bunch of dashes and was frustrating every time I needed to use it. I think it’s about time I try it again and see if I can overcome my hurdles trying to use it efficiently. Who knows, maybe third time’s the charm!

Cloud9

Cloud9 is really cool. It’s not your grandfather’s IDE by any stretch. As the name implies the application exists out in the cloud (They use Openshift). I love the idea that I can start editing code with full IDE features from any machine I’m currently occupying. I have not tried using Cloud9 with a tablet (with keyboard of course!) yet but if that works then this thing would have rockstar status in my mind. I’ve used it for a few projects for both ease of use and to test out some of the features Cloud9 boasts. Being that it is on Openshift the IDE has it’s own platform letting you install tools and dependencies. There are also some collaboration features which I have not tried out yet mainly because I’m not sure how that works when you are using Github (IE: can they push code under your name or are they restricted?). I would use Cloud9 a lot more if it wasn’t for the Internet. While Cloud9 is pretty responsive in most cases but due to some point in the network connection between “here” and “there” things slow down or stop responding for a second or two. If this was something other than coding I probably wouldn’t care that much … but this is coding. Any hiccups while editing breaks concentration and slows down progress. One other issue I noticed was the lack of preferences across ones instances. Say you have two projects in Cloud9. Each project has it’s own IDE instance. If you want to set preferences for both IDE instances you will have to open each on it’s own and set them. You can not set any kind of global default preferences for all your instances. Hopefully they will add that functionality as I’m pretty positive I’m not the only person who finds that a bit weird. Over time Cloud9 and similar IDE’s will find ways to speed up and add better preference support but until then, in my mind, Cloud9 is straddling the line between full contender and really cool tech preview to keep my eye on.

Why I Chose NewsBlur

Not all that long ago Google Reader closed it doors pushing millions of users off the platform. Many users were frustrated to lose their long time place to get their news not all that different from someone in yesteryear losing their favorite newspaper.  The whole thing was far from ideal but did go to teach users that you can’t expect cloud services to last forever (which is a good wake up call). But in the fall of Google Reader came many possible replacements which added their own spins on how one reads news. Feedly, The Old Reader and NetVibes were a few of the popular replacements. But I settled on NewsBlur and eventually became a paid user.

NewsBlur is mainly written by Samuel Clay (more on why I say mainly later).  He seems like a friendly, hard working fellow. He responds to bug reports and is active in his products community.  While this may seem like common sense just take a few minutes to look at random SaaS products on the Internet. You’ll find many of the developers are hidden behind customer service groups who, at worst, are outsourced and are more of a dead end than a way to get things fixed. Long story short, it seems like Samuel really cares about his product.

It is possible to have a Free account on NewsBlur. While you are limited to a specific amount of feeds many people will find the limits are higher than the feed counts they had in Google Reader. At the time of writing the limit is 64 sites.

There are some social features provided by NewsBlur yet these features are not required nor forced into general workflow. For instance, there is a concept of the BlurBlog which looks like it could be fun. But I tend to read the news and share elsewhere. If I ever decide to use the BlurBlog functionality it’s there. Otherwise I can just use NewsBlur as a fantastic reader.

NewsBlur is Open Source under the MIT license (also known as the Expat License). This gives me peace of mind knowing if Samuel ever decided that he was done with NewsBlur I could export my feeds, setup my own instance, and continue using the product on my own infrastructure. Yeah, it’s not trivial but it’s possible which is a huge advantage given the last reader I used shut down.

No software is without it’s bugs but Samuel does a good job bug squashing. And if you are developer who wants to give a hand you can patch the issue yourself and submit the fix (another win for Open Source). At the time of writing there are 43 development contributors to NewsBlur. This is a much better solution than waiting for a customer service representative to reinterpret your bug submission to a developer so that the fix may be done someday in the future.

If you are still looking for a replacement for Google Reader give NewsBlur a chance even if it’s a second chance as the application seems to be enhanced weekly. If you like it, consider becoming a paid user. Can you can’t say no to Shiloh:

Introducing Flask-Track-Usage

A little while ago one of the guys on a project I work one was asking about how many people were using the projects public web service. My first thought was to go grepping through logs. After all, the requests are right there and pretty consumable with a bit of Unix command line magic. But after a little discussion it became clear that would get old after a while. What about a week from now? How about a month or year? Few people want to go run commands and then manually correlate them. This lead to us looking around for some common solutions. The most obvious one was Google Analytics. To be honest I don’t much care about those systems. While that one may not (or may be) intrusive on users I just don’t feel all that comfortable forcing people to be subjected to a third party of a third party unless there is no other good choice. Luckily, being that the metrics are service related, the javascript/cookie/pixel based transaction wouldn’t have worked very well anyway.

So it was off to look at what others have made with a heavy eye towards Flask based solutions so it matched the same framework we were already using. Flask-Analytics came up in a search. The simple design was something I liked but the extension was more so aimed at using cookies to track users through an application while we want to track overall usage. I figured it was time to roll something ourselves and provide it back out to the community if they could use it as well.

Here it is in all it’s simplistic glory: Flask-Track-Usage. It doesn’t use cookies nor javascript and can store the results into any system which you provide a callable or Storage object. There is also FreeGeoIP integration for those what want to track where users are coming from. The code comes with a MongoDB Storage object for those who want to store the content back into their MongoDB. Want to know a bit more of the technical details? Check out the README or the project page. Patches welcome!

Hello Raspberry Pi

I couldn’t help it. I’ve watched other open source friends rave about playing with the Raspberry Pi but had yet to really jump in on it all. See, I bought a GuruPlug a while back and had kind of a bad experience with it. You know, overheat and shut off. In fairness the manufacturer of the device did provide a hardware fix quite a while later, but I’d already moved on and forgotten why I bought the device in the first place. It took the consistant praise from online friends and one conversation with my friend Andrew to get me to take the plunge.

Yesterday it arrived in a nondescript package. A simple yellow padded envelope. Opening it up I saw two small boxes. Funny thing is the larger of the two boxes is actually the wall plug. But I didn’t have time to do anything other than prepare an SD card with raspbian on it. But today I’ve thrown it on the network (headless) and have an ssh connection in to update the default system.

Now I’m at a bit of a loss or, I should say I’m not sure where to start. I picked up a breadboard, wires and LED’s to do a little playing around. I’m just not sure what I want to do for a longer term project. I’d like to start working with some sensors and pick up more knowledge in that space. I have some more components on the way but it will be a while. Maybe I’ll snag a Arduino while I wait.