The missing link of Ops tools

It’s like we went from horse to spaceship, skipping everything in between.


Let’s say you are managing your system in AWS. Amazon provides you with API to do that. What are your options for consuming that API?

Option 1: CLI or library for API access

AWS CLI let’s us access the API from the command line and bash scripts. Python/Ruby/Node.js and other languages can access the API using appropriate libraries.

Option 2: Declarative tools

You declare how the system should look like, the tool figures out dependencies and performs any API calls that are needed to achieve the declared state.

Problem with using CLI or API libraries

Accessing API using CLI or libraries is fine for one off tasks. In many cases, automation is needed and we would like to prepare scripts. Ideally, these scripts would be idempotent (can be run multiple times, converging to the desired state and not ruining it). We then quickly discover how clunky these scripts are:

# Script "original"
if resource_a exists then
  if resource_a_property_p != desired_resource_a_property_p then
    set resource_a_property_p to desired_resource_a_property_p
  if resource_a_property_q != desired_resource_a_property_q then
  # resource_a does not exist
  create resource_a
  set resource_a_property_p to desired_resource_a_property_p
# more chunks like the above

It’s easy to see why you wouldn’t want to write and maintain a script such as above.

How the problem was solved

What happened next: jump to “Option 2”, declarative tools such as CloudFormation, Terraform, etc.


Other possible solution that never happened

If you have developed any code, you probably know what refactoring is: making the code more readable, deduplicate shared code, factoring out common patterns, etc… without changing the meaning of the code. The script above is an obvious candidate for refactoring, which would be improving “Option 1” (CLI or a library for API access) above, but that never happened.

All the ifs should have been moved to a library and the script could be transformed to something like this:

# Script "refactored"
create_or_update(resource_a, {
  property_p = desired_resource_a_property_p
  property_q = desired_resource_a_property_q
# more chunks like the above

One might say that the “refactored” script looks pretty much like input file of the declarative tools mentioned above. Yes, it does look similar; there is a huge difference though.

Declarative tools vs declarative primitives library

By “declarative primitives library” I mean a programming language library that provides idempotent functions to create/update/delete resources. In our cases these resource are VPCs, load balancers, security groups, instances, etc…

Differences of declarative tools vs declarative primitives library

  1. Declarative tools (at least some of them) do provide dependency resolution so they can sort out in which order the resources should be created/destroyed.
  2. Complexity. The complexity of mentioned tools can not be ignored; it’s much higher than one of  declarative primitives library. Complexity means bugs and higher maintenance costs. Complexity should be considered a negative factor when picking a tool.
  3. Some declarative tools track created resources so they can easily be destroyed, which is convenient. Note that on the other hand this brings more complexity to the tool as there must be yet another chunk of code to manage the state.
  4. Interacting with existing resources. Between awkward to impossible with declarative tools; easy with correctly built declarative primitives library. Example: delete all unused load balancers (unused means no attached instances): AWS::Elb().reject(X.Instances).delete()
  5. Control. Customizing behaviour of your script that uses declarative primitives library is straightforward. It’s possible but harder with declarative tools. Trivial if in a programming language can look like count = "${length(var.public_subnets) > 0 ? 1 : 0}" (approved Terraform VPC module).
  6. Ease of onboarding has declarative tools as a clear winner – you don’t have to program and don’t even need to know a programming language, but you can get stuck without knowing it:
  7. Getting stuck. If your declarative tool does not support a property or a resource that you need, you might need to learn a new programming language because the DSL used by your tool is not the programming language of the tool itself (Terraform, Puppet, Ansible). When using declarative primitives library on the other hand you can always either extend it when/if you wish (preferable) or make your own easy workaround.
  8. Having one central place where potentially all resources are described as text (Please! don’t call that code, format is not a code!). It should be easier done with declarative tools. In practice, I think it depends more on your processes and how you work.

As you can see, it’s not black and white, so I would expect both solutions be available so that we, Ops, could choose according to our use case and our skills.

My suggestion

I don’t only suggest to have something between a horse and a spaceship; I work on a car. As part of the Next Generation Shell (a shell and a programming language for ops tasks) I work on declarative primitives library. Right now it covers some parts of AWS. Please have a look. Ideally, join the project.

Next Generation Shell –


Do you agree that the jump between API and declarative tools was too big? Do you think that the middle ground, declarative primitives approach, would be useful in some cases? Comment here or on Reddit.

Have a nice day!

Why I have no favorite programming language

TL;DR – because for me there is no good programming language.

I’m doing mostly systems engineering tasks. I manage resources in Cloud and on Linux machines mostly. I can almost hear your neurons firing half a dozen names of programming languages. I do realize that they are used by many people for systems engineering tasks:

  • Go
  • Python
  • Ruby
  • Perl
  • bash

The purpose of this post is not to diminish the value of these languages; the purpose is to share why I don’t want to use any of the languages above when I write one of my systems-engineering-task scripts. My hope is that if my points resonate with you, the reader, you might want to help spread the word about or even help with my suggested solution described towards the end.


So let’s go over and see why I don’t pick one of the languages:

Why not language X?

All languages

  • Missing smart handling of exit codes of external processes. Example in bash: if test -f my_file (file is not there, exit code 1) vs if test --f my_file (syntax error, exit code 2). If you don’t spot the syntax error with your eyes, everything behaves as if the file does not exist.
  • Missing declarative primitives libraries (for Cloud resources and local resources such as files and users). Correction: maybe found one, in Perl – (R)?ex ; unfortunately it’s not clear from the documentation how close it is to my ideas.

All languages except bash

  • Inconvenient/verbose work with files and processes. Yes, there are libraries for that but there is no syntax for that, which would be much more convenient. Never seen something that could compare to my_process > my_file or echo my_flag > my_file .


  • Compiled
  • Error handling is a must. When I write a small script, it’s more important for me for it to be concise than to handle all possible failures; in many cases I prefer an exception over twice-the-size script. I do understand how mandatory and explicit error handling can be a good thing for larger programs or programs with greater stability requirements.
  • Dependencies problem seem to be unresolved issue


  • Functional programming is second level citizen. In particular list/dictionary comprehension is the Pythonic way while I prefer map and filter. Yes, that’s probably one of the features that makes Python easier to learn and suggested first language. Not everything that’s optimized for beginners must be good for more experienced users. It’s OK.
  • Mixed feelings about array[slice:syntax] . It’s helpful but slice:syntax is only applicable inside [ ] , in other places you must use slice(...) function to create the same slice object

Ruby and Perl

  • The Sigils syntax does not resonate with me.


I can’t put my finger on something specific but Ruby does not feel right for me.


  • Contexts and automatic flattening of lists in some cases make the language more complicated than it should.
  • Object orientation is an afterthought.
  • Functions that return success status. I prefer exceptions. Not the default behaviour in Perl but an afterthought: autodie.
  • Overall syntax feeling (strictly matter of personal taste).


Note that bash was created in a world that was vastly different from the world today: different needs, tasks, languages to take inspiration from.

  • Missing data structures (flat arrays and hashes is not nearly enough). jq is a workaround, not a solution in my eyes.
  • Awkward error handling with default of ignoring the errors completely (proved to be bad idea)
  • Expansion of undefined variable to empty string (proved to be bad idea)
  • -e ,  -u and other action at a distance options.
  • Unchecked facts but my feelings:
    • When bash was created, it was not assumed that bash will be used for any complex scripting.
    • bash was never “designed” as a language, started with simple commands execution and other features were just bolted on as time goes by while complete redesign and rewrite were off the table, presumably for compatibility.
  • Syntax
  • No widely used libraries (except few for init scripts) and no central code repository to search for modules (Correct me if I’m wrong here. I haven’t heard of these).

My suggested solution

I would like to fill the gap. We have systems-engineering-tasks oriented language: bash. We have quite a few modern programming languages. What we don’t have is a language that is both modern and systems-engineering-tasks oriented. That’s exactly what I’m working on: Next Generation Shell. NGS is a fully fledged programming language with domain specific syntax and features. NGS tries to avoid the issues listed above.

Expected questions and my answers

People work with existing languages and tools. Why do you need something else?

  • I assume I have lower bullshit tolerance than many others. Some people might consider it to be normal to build more and more workarounds (especially around anemic DSLs) where I say “fuck this tool, I would already finish the task without it (preferably using appropriate scripting language)”. I don’t blame other people for understandable desire to work with “standard” tools. I think it’s not worth it when the solutions become too convoluted.
  • I am technically able to write a new programming language that solves my problems better than other languages.

Another programming language? Really? We have plenty already.

  • I would like to remind you that most of the programming languages were born out of dissatisfaction with existing ones.
  • Do you assume that we are at global maximum regarding the languages that we have and no better language can be made?


Would you use NGS? Which features it must have? What’s the best way to ease the adoption? Please comment here, on Reddit (/r/bash , /r/ProgrammingLanguages) or on Hacker News.

Update: following feedback roughly of the form “Yes, I get that but many Ops tasks are done using configuration management tools and tools like CloudFormation and Terraform. How NGS compares to these tools” – there will be a blog post comparing NGS to the mentioned tools. Stay tuned!

Have a nice day!