Mikołaj Koziarkiewicz

Intro

Have you ever:

  • regretted that you only remember some bare trivia from a course or a book that you studied several months ago? Or maybe even stuff from university/college that you wish you’d recall now?

  • constantly annoyed yourself by looking up some implementation detail, API definition or another fact that you need every couple of weeks, but that juuuust manages to slip your mind the next time you require it?

  • felt like learning new stuff (perhaps to progress beyond your current job) is a Sisyphean task that doesn’t net you anything?

If so, then this series of articles is for you.

The Perils of Ad-Hoc Learning

The nice thing about living now is having slightly more free time for your basic non-survival needs. The absolutely scary thing, in turn, is that the amount of information you have access to, and can gain, is simply overwhelming. Understandably, this becomes even worse for IT professionals, what with a new library/framework coming out every week [1].

Of course, a lot of this information is fire-and-forget. Another portion you use day to day and, by this virtue, remember without any problems.

What’s left can be divided into two broad categories.

One is knowledge that you use at irregular, but relatively frequent intervals. An example would be an API quirk that you keep re-reading about every couple of weeks. Irritating, isn’t it? Also wasteful.

The other contains "core" knowledge - stuff that you’re not necessarily using directly, but nevertheless benefit from being able to recall readily. Forgetting this kind of information is much more insidious - you just end up doing stuff less effectively; or, perhaps one day you suddenly realize that you completely forgot all the things you wanted to learn from the course you did half a year ago.

But, but, the Internet!

A commonly raised counterargument against investing time in recall methods boils down to:

Why bother? Everything is on the Internet anyway, if I don’t remember something, I’ll just look it up!

And sure, you can do that. But how much does it cost you?

Here’s an example: think of the last small fact or info tidbit that you needed and had to search for. Now, take your favorite stopwatch app [2], and time yourself finding the answer on the Internet.

I’d be willing to bet that for most of you, the total time between switching away from this article, and returning to it after searching will be slightly north of 10 seconds.

That’s not a lot, right? But those search bits add up during the day, and you need to factor in the context switch costs [3].

Now try to recall the answer again, from memory. You, quite likely, still can, and it probably took you less than a second to remember. That is an order-of-magnitude difference. Making a somewhat strained analogy, it’s quite similar to the difference between reading from disk and reading from RAM.
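To see how those seconds add up, here is a back-of-the-envelope sketch. The 10 seconds per search and 1 second per recall come from the experiment above; the number of lookups per day and the context-switch penalty are made-up values for illustration only, so plug in your own.

```python
def daily_lookup_cost(lookups_per_day, seconds_per_lookup, context_switch_seconds=0.0):
    """Total seconds per day spent on getting hold of facts."""
    return lookups_per_day * (seconds_per_lookup + context_switch_seconds)


# ASSUMPTIONS: 20 lookups a day and a 30-second context-switch penalty
# are hypothetical numbers, not measurements.
searching = daily_lookup_cost(lookups_per_day=20, seconds_per_lookup=10,
                              context_switch_seconds=30)
recalling = daily_lookup_cost(lookups_per_day=20, seconds_per_lookup=1)

print(f"searching: {searching / 60:.1f} min/day")
print(f"recalling: {recalling / 60:.1f} min/day")
```

The exact figures matter less than the shape of the result: the context-switch term is paid on every search, and never on a recall.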

Of course, you’re still spending time by putting additional effort into knowledge retention - but that’s dedicated time, as opposed to hacking up your work routine into disjointed bits.

Spaced Repetition - a solution

Obviously, no one is going to sit around every day working on remembering their ever-growing knowledge by rote.

But there are shortcuts to do something very similar much more efficiently.

The concept of Spaced Repetition offers one such shortcut. There are many implementations exploiting the idea, but they all boil down to taking advantage of particularities of the human brain in order to achieve effective recall of various facts and concepts, with relatively minimal effort.

An SR-based approach

One such implementation is Anki, a flashcard-based spaced repetition system (SRS). Its documentation is available on the project’s website, and clients exist for both desktop and mobile platforms.

The tl;dr version of the process of working with flashcards and SRS looks like this:

  • you create a flashcard. In the simplest version it’s a note with two faces: the prompt for what you want to learn, and the thing you want to learn. Here’s a couple of examples.

Figure: A selection of Anki cards: normal text, source code, image, LaTeX formulas.

  • after several initial repetitions, Anki prompts you with the cards at increasing time intervals: initially a couple of days, then a couple of weeks, months, and so on. At each repetition, when viewing the answer, you are prompted to choose one of the following options:

    • "Again" which means you forgot about the entry, and need to reset the learning process,

    • "Good" meaning you have good recollection, i.e. the vanilla option,

    • "Hard" implying that you’ve sorta learned it, but aren’t quite confident, leading to a shorter time to repetition,

    • "Easy" that tells the system to extend the time interval to a greater extent than with "Good".

As you have probably figured out, the biggest benefit of this system is that the time intervals are managed automatically, and you get automatic reminders to repeat your cards (which is especially useful if you install the mobile client).
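To make the idea of "managed intervals" a bit more tangible, here is a heavily simplified sketch of how the four answers could drive the next interval. It is not Anki’s actual algorithm: the Card class, the review function and every constant below (starting ease of 2.5, the 1.2 and 1.3 multipliers, the ease adjustments) are assumptions chosen only to illustrate the general shape of such a scheduler.

```python
from dataclasses import dataclass


@dataclass
class Card:
    interval_days: float = 1.0  # days until the card is shown again
    ease: float = 2.5           # growth factor applied on a "good" answer


def review(card: Card, answer: str) -> Card:
    """Return the card's new scheduling state after one review.

    All constants are illustrative assumptions, not Anki's real values.
    """
    if answer == "again":
        # Forgotten: reset the interval and make the card a bit "harder".
        return Card(1.0, max(1.3, card.ease - 0.2))
    if answer == "hard":
        # Shaky recall: grow the interval only slightly.
        return Card(card.interval_days * 1.2, max(1.3, card.ease - 0.15))
    if answer == "good":
        # The vanilla option: grow the interval by the ease factor.
        return Card(card.interval_days * card.ease, card.ease)
    if answer == "easy":
        # Grow the interval more than "good" would.
        return Card(card.interval_days * card.ease * 1.3, card.ease + 0.15)
    raise ValueError(f"unknown answer: {answer!r}")


card = Card()
for _ in range(4):
    card = review(card, "good")
    print(f"next review in {card.interval_days:.1f} days")
```

Answering "good" a few times in a row makes the interval grow geometrically, which is exactly why the daily workload stays manageable even as the deck gets bigger.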

Local Flavor

Anki’s repetition selection algorithm is a derivative of SM2, which was invented in Poland.

Coming Up

In the second episode of this series, we will talk about a learning scheme developed empirically by Yours Truly, that takes advantage of Anki when acquiring technical (and other) knowledge.


1. and a JavaScript implementation following up in 5 days.
2. I expect very few people still have actual stopwatches, digital or analog.
3. Also, let’s not kid ourselves, you probably took this as a challenge and did the search faster than you normally would in this kind of situation.
Mikołaj Koziarkiewicz

This is a set of loosely-related remarks and observations, accrued due to recent use of Ansible in several work and personal projects (and on the latter note, I made a thing).

The entry is mostly intended as a writing exercise, so if you’re evaluating Ansible at the moment, please keep this in mind.

A word of Introduction

Ansible, like Chef, Puppet and Salt, is configuration management software. Despite my best intentions to reach the widest audience possible, providing a comprehensive overview of the field would require creating a whole series of blog posts.

Therefore, wanting to convey the following statements in reasonable time, I’m forced to limit the target readership to those who possess basic knowledge of the subject. In other words, if you are unfamiliar with the subject, please do take your time to browse the included links. Sorry!

Ansible - the good, the bad, the ugly

Context

All those remarks are made by someone who is strictly more a dev than an op, so the things I’m writing about here may or may not apply to your situation as much as they did to mine.

Also, the entry was written while going through the Trough of Disillusionment, halfway to the Slope of Enlightenment, and therefore may sound more negative than intended.

Good: Workerless setup

The absolutely wonderful thing about Ansible is its agentless architecture - you don’t need to set up anything on the serviced machines, other than a valid SSH connection to a user (usually with possible sudo access). No special nodes requiring additional setup (like in Chef), nothing of the sort.

This in fact makes it very convenient to bootstrap setups for CI and the like, or even a quasi Inversion-of-Control setup, using the ansible-pull utility.

Such a feature may seem like a small thing, but it reduces the error rate during the "metagame" of setting up your servers.

Good: Easy to understand syntax

Ansible uses YAML with embedded Jinja2 for its configuration definitions, and most of the basic stuff can be expressed that way. Getting the majority of desirable output defined is pretty straightforward, once you learn the basics.

Bad: …​that’s sometimes not as intuitive as it should be

A big stumbling block I’ve encountered is correctly specifying the conditions in "when" blocks (which say when to execute a task) and similar ones. Truth be told, even after viewing the parsing source code for the "playbooks", I’m still not entirely confident about what is and isn’t allowed.

I think this is due to the fact that the Jinja2-based syntax sits pretty squarely in the Uncanny Valley for someone with an off-and-on Python background. In effect, you end up writing those conditions like in Python, which works…​ about 90% of the time. The remaining 10% will piss you off to no end.[1]

Good: Has a strong focus on idempotency

Pretty much a given for all modern configuration management software, but nevertheless I could excitedly rave and rant about that at you the entire day - it’s awesome that Ansible specifically focuses on what the state should be, rather than what tasks to do.
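If the "state, not tasks" distinction sounds abstract, here is a minimal sketch of an idempotent "make sure this line is in a file" operation. It is a toy written for this post, not Ansible’s implementation; the ensure_line function, the file name and the example line are all hypothetical.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        # The desired state already holds, so do nothing.
        return False
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


# Running the same operation twice: only the first run changes anything.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "hosts"  # hypothetical file name
    first_run = ensure_line(target, "127.0.0.1 example.local")
    second_run = ensure_line(target, "127.0.0.1 example.local")
    final_content = target.read_text()

print(first_run, second_run)
```

The caller describes what should be true and gets told whether anything had to change, which is also what makes repeated runs safe.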

Good: Trivial to customize

To recap, here’s how config management works in Ansible:

  • the unit of work is a task, meant to encapsulate a single "end-state" quantum, i.e. something that should be ensured to be fulfilled once this task is done.

  • tasks use modules, which do the actual grunt work and can be implemented in most languages (a lot of them use Python, unsurprisingly, since Ansible itself is written in it). There exists a cornucopia of Ansible built-in modules, from user management through ensuring a given line is in a file, to EC2 instance setup.

  • tasks can be grouped into roles, which can also contain common variables, custom modules etc.

  • task and role mixes are codified into playbooks, which describe what your actual configuration should look like. You "run" the playbooks on your "inventory"[2] to achieve that desired state.
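For those who think better in code, the hierarchy above can be modelled roughly as follows. This is only a mental model: the class names, fields and the all_tasks helper are my own assumptions and do not mirror Ansible’s internal classes; "user" and "lineinfile" are used as example module names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    name: str    # human-readable description of the desired end state
    module: str  # name of the module doing the grunt work
    args: Dict[str, str] = field(default_factory=dict)


@dataclass
class Role:
    name: str
    tasks: List[Task] = field(default_factory=list)
    variables: Dict[str, str] = field(default_factory=dict)


@dataclass
class Playbook:
    hosts: str  # which part of the inventory to target
    roles: List[Role] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)

    def all_tasks(self) -> List[Task]:
        """Flatten the playbook: role tasks first, then its own tasks."""
        return [t for role in self.roles for t in role.tasks] + self.tasks


# Hypothetical example data.
base_role = Role("base", tasks=[
    Task("deploy user exists", "user", {"name": "deploy", "state": "present"}),
])
playbook = Playbook(
    hosts="webservers",
    roles=[base_role],
    tasks=[Task("app host is resolvable", "lineinfile",
                {"path": "/etc/hosts", "line": "127.0.0.1 example.local"})],
)

for task in playbook.all_tasks():
    print(f"{task.module}: {task.name}")
```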

Creating roles

By the way, if you’ve dug through the role documentation and wondered how to create the role scaffolding automatically, here’s how you can do that:

ansible-galaxy init <rolename>

This is of course mentioned only in the Ansible Galaxy documentation, later on.

You’ll notice that such a structure allows you to modularize your configuration management logic as you see fit. That includes creating several "internal" modules for complex actions and sticking them into your roles.[3]

Bad: …​but no code reuse for Python modules

Funnily enough, due to how Ansible is structured, you cannot have "common" code files for Python-based modules.

That sucks, but is probably an edge case for most stuff.

Good: Good introductory documentation

The docs [4] do a very nice job of showing you the ropes. They are very much example-based, and subsequent steps build upon previous knowledge in a logical way.

Ugly: …​with no formal sections

I fail to see, in the documentation, a "Big Picture" overview of how the various components of an Ansible definition are structured.

Apparently, looking at the latest Stack Overflow poll, I’m in the minority when it comes to education and this may skew my perception, but I would give my right spleen for an EBNF or similarly-formatted formal specification of the Playbook syntax.

Good: A globally shared role repository

Ansible has something called Ansible Galaxy, which is its way of sharing common configuration functionality.[5] It’s pretty easy to use if you just want to find something (the thing I made is also there).

Ugly: …​very nascent in the current state

However, especially if you display perfectionist tendencies [6], you will spend quite a bit of time examining the roles for the functionality you require.

One problem is that, while a rating system exists, it’s severely underused, and you have to fish for the well-made roles.

Another is the presence of very clearly clashing philosophies when designing roles. It’s best demonstrated when comparing Ansible roles from Jeff Geerling and the Stouts group.

The former offer definitely correctly functioning roles, but make a metric f-ton of assumptions about how a given piece of software is going to be set up and operated. The latter, meanwhile, while not being entirely correct idempotency-wise, allow for very diverse configuration variants.

To be honest, I came to prefer the Stouts roles for their ability to set up as I want them over the geerlingguy ones, despite Mr. Geerling’s foray into book writing on Ansible[7].

Overall

The general picture that I’ve painted hopefully shows a framework with a number of nits that you can pick, but built on solid foundations nevertheless. Those solid foundations will provide a payoff as the framework grows and matures, eliminating the smaller problems along the way. Be aware of the shortcomings, but rest assured that I recommend you check out Ansible for your configuration management needs.


1. Of course, for someone working primarily in Python, this may not be a problem, due to their probable contact with and prior use of Jinja2.
2. A list of target servers with some labelling and variables.
3. Normally, modules are quite a "big thing", able to be shared stand-alone, but sometimes the convenience of writing in an actual Turing-complete programming language is too great to miss.
4. …​which admittedly took a nearly-non-trivial amount of time for me to fish out of the sales-pitch-filled landing page…​.
5. If you’ve never used configuration management software, think of it as the equivalent of the Docker Hub Repository.
6. like Yours Truly
7. Can’t comment on the book’s quality, but the roles show good craftsmanship.