etcdobj: A Minimal etcd Object Mapper for Python

I didn’t have a lot on my agenda Friday. I wanted to review and return emails, do some reading, get some minor hacking on etcdobj done (more on that…), eat more calories than normal in an attempt to screw with my metabolism (nailed it!), catch up with a few coworkers, play some video games, and, apparently, accidentally order an air purifier from Amazon. I succeeded in all of it. But on to this etcdobj thing…

While working on Commissaire I started to feel a bit dirty over storing json documents in keys. It’s not uncommon, but it felt like it would be so much better if a document was broken into three layers:

  • Python: Classes/Objects
  • Transport: For saving/retrieving objects
  • etcd: A single or series of keys

By splitting up what is normally JSON data into a series of keys, two clients can change different parts of the same object without a collision and without requiring a client to fail, fetch, update, then try saving again. I searched the Internet for a library that would provide this and came up wanting. It seems that either simple keys/values or shoving JSON into a key is what most people stick with.

etcdobj is truly minimal. Partly because it’s new, partly because being small should make it easier to build upon or even bundle (it’s got a very permissive license), and partly because I’ve never written an ORM-like library before and don’t want to build too much on what could be a shaky foundation. That’s why I’m hoping this post will encourage some more eyes and help with the code.

Current Example

To create a representation of data a class must subclass EtcdObj and follow a few rules.

  1. __name__ must be provided as it will be the parent in the key path.
  2. Fields are class level variables and must be set to an instance that subclasses etcdobj.Field.
  3. The name of a field is the next layer in the key path and does not need to be the same as the class level variable.

from etcdobj import EtcdObj, fields

class Example(EtcdObj):
    __name__ = 'example' # The parent key
    # Fields all take a name that will be used as their key
    anint = fields.IntField('anint')
    astr = fields.StrField('astr')
    adict = fields.DictField('adict')

Creating a new object and saving it to etcd is pretty easy.

server = Server()

ex = Example(anint=1, astr="hello", adict={"hi": "there"})
ex.anint = 100  # update the value of anint
server.save(ex)
# Would save like so:
# /example/anint = "100"
# /example/astr = "hello"
# /example/adict/hi = "there"

As is retrieving the data.

new_ex = server.read(Example())
# new_ex.anint = 100
# new_ex.astr = "hello"
# new_ex.adict = {"hi": "there"}
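
The key layout in the save example above can be illustrated with a few lines of plain Python. This is only a sketch of the idea, not etcdobj's actual implementation: flatten and to_keys are hypothetical helper names, and nothing here talks to etcd.

```python
def flatten(prefix, value):
    """Yield (key_path, string_value) pairs for a possibly nested value."""
    if isinstance(value, dict):
        # Each dict key becomes one more layer in the key path
        for name, child in value.items():
            for pair in flatten(prefix + "/" + str(name), child):
                yield pair
    else:
        # Leaf values are stored as strings
        yield prefix, str(value)


def to_keys(name, data):
    """Map an object's name and field data to a flat {key_path: value} dict."""
    return dict(flatten("/" + name, data))


keys = to_keys("example", {"anint": 100, "astr": "hello", "adict": {"hi": "there"}})
for key in sorted(keys):
    print(key, "=", keys[key])
```

Because every field lands in its own key, a client that only changes anint never has to rewrite astr or anything under adict.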

Ideas

Some ideas for the future include:

  • Object watching (if data changes on the server it changes in the local instance)
  • Object to json structure
  • Deep DictField value casting/validation
  • Library level logging

Lend a Hand

The code base is currently around 416 lines of code including documentation and license header. If etcdobj sounds like something you’d use come take a look and help make it something better than I can produce all by my lonesome.

Brave: Because It’s The Best Middle Available

I’m no different than a large portion of web users who are looking to read content and stay safe: I use one or more ad blockers. Is this stealing website content? No. I’d be happy to be served ads if they didn’t have such a bad track record in terms of security and privacy.

Privacy

Many people are still ignorant of what information they are giving up to online advertisers. In a 2013 post by VentureBeat they noted:

Advertisers and the tracking companies they employee are able to gather all sorts of information about you, such as the websites you frequent and what kind of products you’re interested — and even some even scarier stuff like political views, health problems, and personal finances.

Over time the picture you provide to these private companies becomes clearer and clearer. Of course, you may not care if companies know you like chocolate chip cookies but you may not want them to know more personal things or, worse, be able to extrapolate things about you that you haven’t even knowingly shared … not to mention government use for predictive modeling.

Malvertising

Privacy is important but this is the bigger problem in my opinion. Malvertising has proven a successful vector to infect users’ machines with malware. If you are interested in a timeline of large malvertising events GeoEdge has a nice post. A quick summary of heavy hitters who have inadvertently exposed their readers to threats includes The New York Times, eBay, LA Times, Spotify, Answers.com, Huffington Post, MSN, BBC, AOL and NFL. Of course, there are many more but that list should be enough to get anyone’s attention.

Options

So what are valid options to protect personal privacy and security on the web?

Ignore Internet Content

This is the best option but it’s very unlikely. Everyone loses with this as content providers get nothing from their ads and readers don’t get any news.

Go To “Safe” Sources

Another good option, but about as unlikely as the first. It takes work to find out the sites that are not tracking or injecting third party advertising. It also assumes safe sources are always safe but the web is a constantly evolving place and a site may be totally different upon two visits.

Run an Ad Blocker

This is the most common solution today. It blocks as many ads and third party cookies as it can and generally keeps users safe. It’s not a perfect solution as the content providers miss getting any ad clicks/impressions but the reader gets a much safer (and faster) experience.

Some sites actively block ad blockers. When I come across sites that nicely ask me to unblock their ads I head over to Google and find another source for the same story. I don’t think I’m alone in doing that.

Use (something like) Brave That Shares Ad Revenue

This is a newer thing and the actual reason I wanted to write this post. Brave seems to be a good middle ground which attempts to keep users safe while still providing money back to content providers. In some ways Brave is acting like an arbitrator to let everyone get something out of the deal. Users get content, creators get money. Yeah, Brave (and users) get a cut too but that’s not so bad (though I’d be fine with not getting a cut at all as a user).

Here are the revenue flows from Brave:

[Infographic: Brave revenue flows]

Unfortunately, the NAA didn’t quite grasp the above idea and has called on Brave to stop. Surprise, at least one of the companies who signed the letter has put users at risk via malvertising on multiple occasions.

Brave has posted a rebuttal in an attempt to help NAA understand the business model and why it’s not illegal. Hopefully logic will triumph over emotion and posturing.

My Hope

My hope is that users will jump on to the idea that Brave provides (whether they use Brave or not) and that the NAA will understand that it is a business model where everyone wins, even their readers.

From Gevent to CherryPy

I’ve been working on a project for the last few months on GitHub called Commissaire along with some other smart folks. Without getting too deep into what the software is supposed to do, just know it’s a REST service which needs to handle some asynchronous tasks. When prototyping the web service I started utilizing gevent for its WSGI server and coroutines but, as it turns out, it didn’t end up being the best fit. This is not a post about gevent sucking because it doesn’t suck. gevent is pretty awesome but it’s not for every use case.

The Problem

One of the asynchronous tasks we do in Commissaire utilizes Ansible. We use the Ansible Python API to handle part of bootstrapping a new host. Under the covers Ansible uses the multiprocessing module when executing its work. Specifically, this occurs when the TaskQueueManager starts its run. Under normal circumstances this is no problem but when gevent is in use its monkey patching ends up causing some problems. As noted in the post using monkey.patch_all(thread=False, socket=False) can be a solution. What this ends up doing is patching everything except thread and socket. But even this wasn’t enough for us to get past problems we were facing between multiprocessing, gevent, and Ansible. The closest patch we found was to also disable os, subprocess and a few other things, making most of gevent’s great features unavailable. At this point it seemed pretty obvious gevent was not going to be a good fit.

Looking Elsewhere

There is no lack of options when looking for a Python web application server. Here are the requirements that I figured we would need:

Requirements

  • Importable as a library
  • Supports WSGI
  • Supports TLS
  • Active user base
  • Active development
  • Does not require a reverse proxy
  • Does not require greenlets
  • Supports Python 2 and 3

Based on the name of this post you already know we chose CherryPy. It hit all the requirements and came with a few added benefits. The plugin system, which allows for calls to be published over an internal bus, lets us decouple our data saving internals (though it couples us with CherryPy as it is doing the abstraction). The server is also already available in many Linux distributions at new enough versions. That’s a big boon when hoping to have software easily installed via traditional means.
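
To show what publishing calls over a bus buys you, here is a tiny publish/subscribe sketch in plain Python. It is not CherryPy's code and not how Commissaire is written: Bus, MemoryStore and the channel names are all made up for illustration, and a real storage plugin would talk to something like etcd instead of a dict.

```python
from collections import defaultdict


class Bus(object):
    """Tiny publish/subscribe bus: channel name -> list of callbacks."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, channel, callback):
        self._listeners[channel].append(callback)

    def publish(self, channel, *args, **kwargs):
        # Call every listener on the channel and collect the results
        return [cb(*args, **kwargs) for cb in self._listeners[channel]]


class MemoryStore(object):
    """Hypothetical storage plugin that keeps everything in a dict."""

    def __init__(self, bus):
        self.data = {}
        bus.subscribe("store-save", self.save)
        bus.subscribe("store-get", self.get)

    def save(self, key, value):
        self.data[key] = value
        return key

    def get(self, key):
        return self.data.get(key)


bus = Bus()
store = MemoryStore(bus)

# Request handlers only know the channel names, not the storage class
bus.publish("store-save", "/hosts/10.0.0.1", "active")
print(bus.publish("store-get", "/hosts/10.0.0.1"))
```

Swapping MemoryStore for another class that subscribes to the same channels would not require touching the code that publishes.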

The runner up was Waitress. Unlike CherryPy which assumes you are developing within the CherryPy web framework, Waitress assumes WSGI. Unfortunately, Waitress requires a reverse proxy for TLS. If it had first class support for TLS we probably would have picked it.

Going back to a more traditional threading server is definitely not as sexy as utilizing greenlets/coroutines but it has provided a consistent result when paired with a multiprocessing worker process and that is what matters.
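
For anyone who wants to picture that pairing, here is a minimal standard library sketch of a threaded WSGI server handing jobs to a separate worker process. It is an illustration under my own assumptions, not Commissaire's code and not CherryPy: bootstrap, worker, make_app and main are hypothetical names, and the "job" is just the request path.

```python
import multiprocessing
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIServer, make_server


class ThreadedWSGIServer(ThreadingMixIn, WSGIServer):
    """Plain threaded WSGI server: one thread per request, no greenlets."""
    daemon_threads = True


def bootstrap(host):
    """Placeholder for the slow task (stands in for the real work)."""
    print("bootstrapping " + host)


def worker(jobs, handle=bootstrap):
    """Runs in its own process: handle jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        handle(job)


def make_app(jobs):
    """Build a WSGI app that queues a job per request and returns at once."""
    def app(environ, start_response):
        host = environ.get("PATH_INFO", "/").strip("/") or "unknown"
        jobs.put(host)
        start_response("202 Accepted", [("Content-Type", "text/plain")])
        return [("queued " + host + "\n").encode("utf-8")]
    return app


def main(host="127.0.0.1", port=8000):
    """Start the worker process, then serve until interrupted."""
    jobs = multiprocessing.Queue()
    proc = multiprocessing.Process(target=worker, args=(jobs,))
    proc.start()
    server = make_server(host, port, make_app(jobs),
                         server_class=ThreadedWSGIServer)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        jobs.put(None)  # tell the worker to stop
        proc.join()
        server.server_close()
```

Calling main() starts the server. A request to /somehost returns right away with a 202 while the worker process picks the job up from the queue.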

Porting Time

Porting to a different library can be an annoying task and can feel like busy work. Porting can be even worse when you liked the library in use in the first place as I did (and still do!) with gevent.

Initial porting of main functionality from gevent to CherryPy took roughly four hours. After that, it took about another six hours to iron out some rough edges, followed by updating unit tests. Really, the unit testing updates ended up being more work, in terms of time, than the actual functionality. A lot of that was our fault in how we use mock, but I digress. That’s really not much time!

So What

So far I’m happy with the results. The application functionality works as expected, the request/response speeds are more than acceptable, and CherryPy as a server has been fun to work with. Assuming no crazy corner cases crop up I don’t see us moving off CherryPy anytime soon.

I Wish Fossil Would Take Off

Fossil is the coolest distributed SCM you are not using. No seriously. It boasts features not found in any of the common distributed SCMs used by nearly every developer today.

About 5 or 6 years ago I started getting a little frustrated with Git. The main complaint I kept coming back to over and over was that, to use Git effectively, one needed to use GitHub, Trac, or some other way to add an interface with issues and information. There was also the problem of getting many CVS/SVN folks comfortable with Git terminology which fueled my recommendation of Mercurial, but I digress. It was around this time that a friend of mine and I started looking at a way we could include issues within a Git repository. At that time we looked at projects like BugsEverywhere which provide a separate tool to track bugs within the repository. We gave it a go for a little while but eventually fell away from it as, at the time, it really felt like a second class citizen in the Git toolchain. We spent a little time developing our own solution but then gave up realizing that Git was so tied to the GitHub way.

Around this time one of us found Fossil and started to play around with it. I was blown away at how it took care of code, issues, wiki, tracking, and code hosting. You essentially get a distributed version of Trac for every clone. All the data comes along and you are able to update documentation, code, issues, etc., all as part of a fossil push.

As of the time of writing Fossil boasts (from the main page):

  1. Integrated Bug Tracking, Wiki, and Technotes
  2. Built-In Web Interface
  3. Self-Contained
  4. Simple Networking
  5. CGI/SCGI Enabled
  6. Autosync
  7. Robust & Reliable
  8. Free and Open-Source

I touched a little bit on 1 and 2, but 3 is also a pretty cool feature. If you do an install of Git you really are installing a bit more than you may realize. For example, Fedora’s Git package requires:

  1. asciidoc
  2. desktop-file-utils
  3. emacs
  4. expat-devel
  5. gettext
  6. libcurl-devel
  7. libgnome-keyring-devel
  8. openssl-devel
  9. pcre-devel
  10. perl(Error)
  11. perl(ExtUtils::MakeMaker)
  12. pkgconfig(bash-completion)
  13. python
  14. rpmlib(CompressedFileNames) <= 3.0.4-1
  15. rpmlib(FileDigests) <= 4.6.0-1
  16. systemd
  17. xmlto
  18. zlib-devel >= 1.2

In other words you need a specific editor, 2 languages available on the system, a specific init system, and a part of GNOME. Plain Git directly from source requires less, but still more than one would think. Fossil notes its dependencies as:

Fossil needs to be linked against zlib. If the HTTPS option is enabled, then it will also need to link against the appropriate SSL implementation. And, of course, Fossil needs to link against the standard C library. No other libraries or external dependences are used.

Philosophy

Fossil and Git have very different philosophies. The most interesting point to me when reading up on the differences was this:

Git puts a lot of emphasis on maintaining a “clean” check-in history. Extraneous and experimental branches by individual developers often never make it into the main repository. And branches are often rebased before being pushed, to make it appear as if development had been linear. Git strives to record what the development of a project should have looked like had there been no mistakes.

Fossil, in contrast, puts more emphasis on recording exactly what happened, including all of the messy errors, dead-ends, experimental branches, and so forth. One might argue that this makes the history of a Fossil project “messy”. But another point of view is that this makes the history “accurate”. In actual practice, the superior reporting tools available in Fossil mean that the added “mess” is not a factor.

One commentator has mused that Git records history according to the victors, whereas Fossil records history as it actually happened.

While pretty, (nearly) linear history is a simple read, it is rarely actually true.

[Image: githistory]

Using Fossil

There is a pretty decent quick start to get one started. At first run through it feels clunky. For instance, when doing a checkout you have to open the repository with fossil open but then again people felt (and some still feel) that git add $FILES, git commit, git push $PLACE $BRANCH feels wrong. I think that with enough time one can be just as comfortable with fossil’s commands and flow as they would be with git.

Truth Be Told

My biggest want for Fossil to take off is to be able to work offline and merge bugs/issues and documentation without forcing everyone to adopt third party tools to integrate with an SCM. I also would like to keep my hands on my keyboard rather than logging into GitHub to review stuff (yeah, I know there are keyboard shortcuts …). Anyway, here is hoping more people will give Fossil a try!

Flask-Track-Usage 1.1.0 Released

A few years ago the initial Flask-Track-Usage release was announced via my blog. At the time I thought I’d probably be the one user. I’m glad to say I was wrong! Today I’m happy to announce the release of Flask-Track-Usage 1.1.0 which sports a number of enhancements and bug fixes.

Unfortunately, some changes are not backwards compatible. However, I believe the backwards incompatible changes make the overall experience better. If you would like to stick with the previous version of Flask-Track-Usage make sure to version pin in your requirements file/section:

flask_track_usage==1.0.1

Version 1.1.0 has made changes requested by the community as well as a few bug fixes. These include:

  • Addition of the X-Forwarded-For header as xforwardedfor in storage. Requested by jamylak.
  • Configurable GeoIP endpoint support. Requested by jamylak.
  • Migration from pymongo.Connection to pymongo.MongoClient.
  • Better SQLStorage metadata handling. Requested by gouthambs.
  • SQLStorage implementation redesign. Requested and implemented by gouthambs.
  • Updated documentation for 1.1.0.
  • Better unittesting.

I’d like to thank Gouthaman Balaraman who has been a huge help authoring the SQLStorage based on the SQLAlchemy ORM and providing feedback and support on Flask-Track-Usage design.

As always, please report bugs and feature requests on the GitHub Issues Page.

How I’m Cold Brewing Coffee

As long as I can remember I’ve enjoyed writing code but in the last few years I have really started to enjoy creating physical things on my own as well. My foray into making beverages started with beer and its ingredients. While the first attempt didn’t go so well I was able to learn from it and make something I enjoyed. Then came making yogurt. Not exactly a drink but a good component in making them. The next jump was into Kombucha which has turned out to be way easier than I could have imagined (so easy I didn’t write a thing about it!). So let’s talk cold brew coffee.

While pulling the hops out of a collaborative brew with my awesome teammates Andrew made a comment about starting to make cold brew coffee. Andrew’s process included a French press which is something I don’t have. However, his description rattled in my brain for a few days before it took root and I decided to give it a go for myself.

I don’t have this.

I searched the Internet for some information on how to make cold brew coffee with few to no special appliances. As expected the Internet delivered a ton of information, much of it being the same steps rattled off over and over with a smattering of contradictory data.

I do have these.

One thing everyone seemed to agree on is that you need to have beans that you enjoy. If you don’t like the coffee the beans make then try another origin and/or roaster. Armed with a pint mason jar, a plastic lid, plastic spoon, good medium roasted coffee beans, good dark roasted coffee beans, a grinder, funnel and a filter I went to work.

What You Need

Obviously there are many different ways to make cold brew coffee. The following are what I use to make what I consider pretty tasty stuff. To replicate the process I use you will need the following items:

Step 1

Grind 2/3 cup of beans. You can do 1/3 cup medium decaf and 1/3 cup dark or do a full 2/3 cup of one and blend to taste later. For me the combination of the two provides a rich and smooth taste I really enjoy.

I’ve read people recommending anything from a fine grind to a slightly coarse one. Since I am using one of the cheaper blade grinders I aim for around the same grind I’d use for a Moka Pot.

Step 2

Dump those grinds into a pint mason jar!

Step 3

Pour in 3 cups of filtered water and stir with a plastic (or non-metal) spoon.

Step 4

Put the lid on the jar, place it in a location with little or no sunlight and walk away.

Step 5

Show back up about 10-12 hours later. Take another pint mason jar and place the funnel with filter inside of it on top of the new mason jar. Open the original jar and slowly pour the contents through the funnel and filter. Don’t let it overflow!

Step 6

Place the lid on the new mason jar filled with filtered cold brewed coffee and place it in the fridge.

Step 7

Enjoy carefully! This is a concentrate (though not as concentrated as some recipes) so you’ll need to find your own level of flavor with water/milk dilution (or drink a smaller amount).

So..

I’ve been thinking about what I could do next with the process. I could keg it and push it with nitro (like dry stouts) or find out what kind of flavor additives work well with the flavors in cold brew.

As I noted, this is not the only way to make cold brew but I really enjoy the result. If you have any ideas to make it even better let me know!

I Have To Make Things

We are all consumers of things. These things are everything from food to software based services. We are trained to want more things, use more things and find that one thing that will finally make us not want any other things but it never ends up working that way. Over the last 10 years I figured out that making something is way more fulfilling.

Years ago I figured out that I enjoy writing code. Specifically FLOSS. While I always heard FLOSS was great because you “scratch your own itch” I found myself looking at what others were looking for and trying to figure out a way to get it done. That didn’t keep me from coming up with my own ideas, but I found trying to implement someone’s idea of things was more of a challenge, like a puzzle. Imprinting my own way of thinking in code is “easy”, but trying to wrap my mind around someone else’s way of tackling problems isn’t so straightforward.

I believe my want to create has also shaded my view on things like tablets as laptop replacements. I can’t produce things I consider valuable with a tablet (with the exception of adding a keyboard and having an ssh client). I can produce communications and consume but that doesn’t cut it. I want tools to create that let me produce well crafted results I can feel satisfied with.

Over the last few years I’ve found my want to create things does not stop at producing software. I’ve picked up brewing which has really been a challenge I’ve enjoyed. I still have so far to go but with each attempt I find things I could do better and improve my results. I’ve also picked up more baking which I had done a bit of before. For some reason making bread is a very relaxing process for me.

Over the weekend I found myself with nothing on my plate to make. I felt bored, frustrated and found myself grasping at things to do. For instance, I started to rewrite some code in a different language just to do it. Of course this didn’t actually make me feel any better as it didn’t really serve any real purpose. I turned to cooking some food to eat later in the week which helped, but didn’t really do it for me. All this reminded me that I am one of those folks who has to actually make things. I need to create things. Maybe it’s a way of expression or maybe it just proves I have “value” (Hi Tim) but no matter what I need to make things.

In My Mind It’s Time For Go

I have been, and continue to be, a fan of the Python programming language. It’s clean looking, portable, quick to write, has tons of libraries and a great community. However, I, like many other software developers, don’t think that one must be married to a language. Go has received a lot of attention as of late and it finally makes sense to me as to why.

I’ve gone through, and added to my tool belt, many languages before Python became my go-to star. The first was Perl. Being so versatile and having a large community of both professional and hobby developers made it an easy early choice for me. The biggest issue for me with Perl was I started to learn how much I liked simple, easy to read code which follows a coding standard. PHP became my fall back web language. It was so simple to write an application it was almost dumb not to use it at the time. No language is perfect and I found many developers at the time didn’t know how to write safe PHP or easy to follow code. I slowly drifted away from PHP as a mainstay. C, while fast and being, well, C, never became a go-to mainly because of the time to get things done.

Java just never really did it for me. University tried to shove it down my throat and it didn’t work. Funny thing is I really tried to like it. I gave it multiple chances but would always walk away feeling better with my hands untied from the JVM. I also found myself loathing IDEs due to people’s insistence that a Java developer needs to use one to be productive. Like many cool kids I became a Java hater for a while and did everything I could to keep friends away from it. But enough of this, let’s talk about Go.

Go has been on my radar since its initial announcement. If I remember right my first thoughts were not positive ones. Everyone and their brother seemed to have their own languages coming out or, at the very least, a DSL which would be the next big thing. I decided to stay back for a bit and see what would happen with the language rather than diving in or slamming the door. Since then there has been a good amount of libraries created for the language, some pretty interesting users such as Docker and Dropbox, and this post which sums up why Go is a good option when considering Node.js.

What I’ve found is that no tutorial, video nor explanation from a Go fan could convince me to actually see why Go could be so great. After all, the idea behind Go’s OO support sounds half hearted. So one night I did a Google search for what features make people fawn over Go and channels and goroutines came up quite a lot. The next step was to write a Hello World like application that would utilize both features and see how I felt. This is what I came up with:

package main

import "fmt"


// Coms turns into a communication instance. in/out are channels.
type Coms struct {
    in    chan string
    out   chan string
}

// Closes both in and out channels in a Coms instance.
func (c Coms) Close() error {
    fmt.Println("Closing channels")
    close(c.in)
    close(c.out)
    return nil
}
// -------------------


// Main function which sends ping and pong back in two goroutines
// using a Coms instance.
func main() {
    // Creating an "object"
    coms := Coms{make(chan string), make(chan string)}
    // At the end of the function run coms.Close.
    defer coms.Close()

    // First goroutine which ping's out and responds
    // back to pong's with a ping.
    go func() {
        coms.in <- "ping"
        for {
            data := <- coms.out
            fmt.Println(data)
            coms.in <- "ping"
        }
    }()

    // Second goroutine which pong's in response to ping's.
    go func() {
        for {
            data := <- coms.in
            fmt.Println(data)
            coms.out <- "pong"
        }
    }()

    // To exit the application hit enter
    var input string
    fmt.Scanln(&input)
}

 

The result of this code was:

$ go run pingpong.go
ping
pong
ping
pong
ping
....
⏎
Closing channels
$

 

It’s a simple program but it pulled me in. The OO style was not nearly as clunky as I thought it would be and the goroutines were so simple to use I was almost shocked. I also got first hand feel for a non-intrusive strongly typed system. It felt almost like a dynamic language. For things that need speed I felt hooked!
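
For Python folks wondering how this compares, roughly the same ping/pong exchange can be sketched with threads and queue.Queue standing in for goroutines and channels. This is my own approximation for comparison, not a translation of Go's semantics: queues are buffered where the channels above are not, and the rounds parameter is an assumption I added so the program stops on its own instead of waiting for enter.

```python
import queue
import threading


def pinger(inbox, outbox, rounds, log):
    """Send a ping, then wait for the pong that answers it."""
    for _ in range(rounds):
        outbox.put("ping")
        log.append(inbox.get())  # blocks until a pong arrives


def ponger(inbox, outbox, rounds, log):
    """Answer every ping with a pong."""
    for _ in range(rounds):
        log.append(inbox.get())  # blocks until a ping arrives
        outbox.put("pong")


def run(rounds=3):
    """Run both sides in their own threads and return what they received."""
    to_ponger, to_pinger = queue.Queue(), queue.Queue()
    log = []
    threads = [
        threading.Thread(target=pinger, args=(to_pinger, to_ponger, rounds, log)),
        threading.Thread(target=ponger, args=(to_ponger, to_pinger, rounds, log)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log


print("\n".join(run()))
```

It works, but compare the amount of ceremony to the go keyword and the <- operator in the Go version.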

Will I continue to use Python? Most surely! But for things that need to be super fast I think Go will be my default. If someone told me to greenfield a SaaS service tier I’d probably lobby for using Go while keeping the web tier Python or something(s) similar. uWSGI server anyone?

If you are still on the fence with Go take a quick look at its feature set and take the 20 minutes to write a simple application utilizing them. Examples are nice but there is nothing better than trying the syntax and structures yourself. Writing some code should tell you if Go is for you.

Red Hat Developer Blog: Git Bonsai, or Keeping Your Branches Well Pruned

Code repositories are the final resting place for code, acting as equal parts bank vault, museum, and graveyard. Unlike a vault, content is almost always added faster than it is removed, but much like a graveyard, there is a definite miasma when a repository becomes too full. That smell is developer frustration, as they have to search through dozens, or eventually, hundreds of branches to find the one they want.

We’ve had sporadic cases where branches did not get merged into masters (and sometimes fixes were overwritten in later releases) and have wasted collectively hundreds of developer hours on “which branch is this in?” exchanges.

Sam and I talk about a simple yet helpful git tool to squash bad branches over at the Red Hat Developer Blog.

Slack Isn’t New, It’s New

New tech tools show up daily. You’ve probably heard about Slack already as it’s being talked about a lot, but just in case you haven’t I’ll give you the tl;dr: Slack integrates your development and infrastructure notifications, chat and documents from different providers into one chat-like interface. It’s pretty much the same as HipChat. All in all Slack is a pretty cool and sleek system which allows for easy chatting within a group. But as I tested the service I kept having a feeling of deja vu. This cool new service feels familiar to me. Then it hit me, I have seen this before, haven’t I?

For a long time many engineers, especially in the Free/open arena, have utilized IRC as a way of communication while working. It’s efficient, simple, client agnostic and supports chat rooms as well as private messages. Slack’s main chat interface is very similar to IRC. It’s a chat room with a list of people present and the ability to send private messages. Just like IRC people post messages to the chat room and everyone in the room is able to read them. In a way it’s a little funny to think about chat rooms being “new” but, then again, people have been using instant messaging and SMS style message systems for so long that the concept of the chat room may seem fresh. So the chat interface is similar to IRC, but what about the integrations? Aren’t they new?

With Slack the integrations are set up via the web interface. Each integration can send information to a channel with an icon and message. Obviously IRC does not have this functionality directly, but IRC bots do! Many developers set up bots like Supybot with integration with their external development tools. Announcements of new builds, code pushes, deployments, support requests, etc. show up in channel from the bot. While it’s not as flashy as Slack the same basic integration idea occurs.

Don’t get me wrong, the point of this post isn’t to say that Slack is dumb or simply a copy of something “better”. The point of this post is that, while Slack isn’t really something brand new, it is quite cool. There is a reason developers and ops folks have been setting up things similar to Slack in their own chat rooms for years! The ability to see the development process actually flow can be pretty exciting and empowering. Those who have not been, or will not be, able to set up their own integrations have an option to use Slack as a pre-baked set up which, depending on team/company, may be more user friendly for the less technical minded. And letting the non-technical see how much is happening day to day can open their eyes to just how much a team is getting done. Probably way more than they realize.