Fundamental flaws of bash and its alternatives


Quoting Steve Bourne from  An in-depth interview with Steve Bourne, creator of the Bourne shell, or sh:

I think you are going to see, as new environments are developed with new capabilities, scripting capabilities developed around them to make it easy to make them work.

Cloud happened since then. Unfortunately, I don’t see any shells that are an adequate response to it. Such a shell should, at the very least, have data structures. No, not like bash; I mean real data structures, nested, able to represent a JSON response from an API, and with a sane syntax.

In many cases bash is the best tool for the job. Still, I do think that current shells are not a good fit for the problems I’m solving today. It’s as if time has frozen for shells while everything else advanced for decades.

As a systems engineer, I feel that no adequate shell or programming language exists for me to use.


Bash was designed decades ago. Almost nothing of what we expect from a modern programming language is there, and somehow I get the impression that it’s not expected from a shell. Years have changed many things around us, but bash is not one of them; it has changed very little.

It looks like it was designed to be an interactive shell, while the language was a bit of an afterthought. In practice it’s used not just as an interactive shell but as a programming language too.

What’s wrong with bash?

Sometimes I’m told that there is nothing wrong with Bash and another shell is not needed.

Even if we assume that nothing is wrong with Bash, there is nothing wrong with assembler or C either. Yet we have Ruby, Python, Java, Tcl, Perl and others. Productivity and other concerns might have something to do with that, I guess.

… except that there are so many things wrong with bash: syntax, error handling, completion, prompt, inadequate language, pitfalls, lack of data structures, and so on.

While jq is trying to compensate for the lack of data structures, can you imagine any “normal” programming language that would outsource the handling of its data? It’s insane.
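For contrast, here is what that outsourcing looks like in practice. A minimal sketch (the JSON document is made up): to read one nested field, bash has to pipe text through jq, because it has no structure that could hold the parsed response:

```shell
# bash cannot hold the parsed response; every lookup shells out to jq.
json='{"user":{"name":"alice","roles":["admin","dev"]}}'
name=$(printf '%s' "$json" | jq -r '.user.name')   # -r: raw output, no quotes
echo "$name"   # prints: alice
```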

Silently executing the rest of your code after an error by default. I don’t think this requires any further comments.
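For the record, a minimal illustration of that default (run in throwaway subshells so nothing real fails):

```shell
#!/bin/bash
# Default behavior: execution continues after the failed command.
out1=$(bash -c 'false; echo "kept going"')
echo "$out1"    # prints: kept going

# With "set -e" the script aborts at the first failing command instead.
out2=$(bash -c 'set -e; false; echo "kept going"' || echo "aborted")
echo "$out2"    # prints: aborted
```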

Do you really think that bash is the global maximum and we can’t do better decades later?

Over the years there were several attempts to make a better alternative to bash.

Project      Focus on shell and UX   Powerful programming language
Bash         No                      No
Csh          No                      No
Fish shell   Yes                     No
Plumbum      No                      Yes (Python)
RC shell     No                      No
sh           No                      Yes (Python)
Tclsh        No                      Yes (Tcl)
Zsh          Yes                     No

You can take a look at a more comprehensive comparison on Wikipedia.

Flaw in the alternatives

A shell or a programming language? All the alternatives I’ve seen to this day focus either on being a good interactive shell with a good UX or on using a powerful language. As you can see, there is no “yes, yes” row in the table above, and I’m not aware of any such project. Even if you find one, I bet it will have one of the problems I mention below.

Focusing on the language

The projects above that focus on the language build on existing languages. This is understandable but wrong. A shell language was, and should remain, a domain-specific language. If it’s not, common tasks become either too verbose or unnecessarily complex to express.

Some projects (not from the list above) choose a bash-compatible domain-specific language. I cannot categorize these projects as “focused on the language” because I don’t think one can build a good language on top of bash-compatible syntax. In addition, these projects did nothing significant to make their language powerful.

Focusing on the interactive experience

Every project I have seen that focuses on the shell and its UX neglects the language, using something inadequate instead of a real, full language.

What’s not done

I haven’t seen any domain-specific language developed instead of what we have now. I mean a language designed from ground up to be used as a shell language, not just a domain-specific language that happened to be an easy-to-implement layer on top of an existing language.

Real solution

Provide both: a good interactive experience and a good domain-specific language (not bash-compatible).

List of features I think should be included in a good shell:

Currently I’m using bash for what it’s good for and Python for the rest. A good shell would eliminate the need to use two separate tools.

The benefits of using a good shell over using one of the current shells plus a scripting language are:

Development process

With a good shell, you could start with a few commands and gradually grow your script. Today, when you start with a few commands, you either rewrite everything later in some scripting language or end up with a big bash/zsh/… script written in an underpowered language, which usually looks pretty bad.


Same libraries available for both interactive and scripting tasks.

Error handling and interoperability

Having one language for your tasks greatly simplifies both the integration between pieces of your code and error handling.

Help needed

Please help to develop a better shell. I mean not an easy-to-implement one but a good one: a shell that would make people productive and be a joy to use. Contribute some code or tell your fellow developers about this project.

I’m using Linux. I’m not using Windows and hope I never will have to. I don’t really know what’s going on there with shells, and anyhow it is not very relevant to my world. I did take a brief look at PowerShell and it appears to have some good ideas.

Small intro to threads, race conditions and locking


Programming with threads has some pitfalls. This post deals with the basic problem with concurrent code execution – race conditions.


Code – part of a program

Threads – code executing in parallel (concurrently) with other code

Race condition (in our context) – several threads running in parallel and their overall result depends on the scheduling of the threads.

The problem

Below is a very small naive code that demonstrates the problem.

We have a global variable i which is incremented by several concurrently running threads. The code for incrementing i is i = i + 1 . Multiple threads executing this code have a race condition. One would think that 100 threads doing such an increment 100 times each would result in i being 10000 at the end.

Incorrect code


i = 0
N = 100
THREADS = 100
THREADS.ptimes(F() {
	for(j;N) i = i + 1
})
echo(i)

Incorrect code explanation

The code is in the NGS language but the logic would be the same across many languages that have the same concurrency model: C, Java, etc.

Side note (skip freely):

This example does not apply to some languages that guarantee atomic variable increments; there would be no problem there. But if we change the computation to something more complex, even in those languages we are back to code similar to the above.

ptimes – “parallel times” function – runs the given code in parallel threads.

N.ptimes(code) – runs N threads executing code.

F() { ... } – literal function

for(j;N) – loops with values of j from zero to N (not including N). Sugar for for(j=0; j<N; j=j+1).

Code summary: in 100 parallel threads do i = i + 1 100 times in each thread.

Incorrect code output

The scheduling of the threads is done by the operating system. Since the scheduling is out of our control, the result is unpredictable when a race condition is present.

The output on my system is somewhere between 7900 and 8400, different each time I run this code, not the 10000 you might expect.

So why does the result of i = i + 1 depend on the scheduling of the threads? Let’s examine the two following scheduling alternatives:

Incorrect code threads scheduling alternatives

Scheduling alternative A:

Thread 1 runs all the code (i = i + 1), then thread 2 runs all the code. No problem there. i would be incremented by 2.

Scheduling alternative B:

Thread 1 runs i + 1 : it fetches the value of i and adds one to it. Thread 2 also runs i + 1. Thread 1 saves the computed value of i + 1 to i. Then thread 2 does the same. The problem is that the fetched value of i was the same for both threads. Summary: both threads fetched the same value of i, incremented it and stored it. The total increment of i is one.

Since schedulings A and B have different outcomes, it’s a race condition: the output is unpredictable and sometimes incorrect. The incorrect code above is deliberately written so that the output is mostly incorrect.

Fixing the code

Add l = Lock() at the beginning of the code.

Replace i = i + 1 with l.acquire(F() i = i + 1).

Code explanation

Lock() – creates a new lock

l.acquire(code) – acquires the lock l, runs the code and releases the lock.


Locks provide a way to make sure that only one thread executes the given section of a code. In our case, using the lock allows only “Scheduling alternative A”.

When several threads try to acquire a lock simultaneously, only one will succeed and then enter the code. When this thread finishes executing the given code it releases the lock. After the lock is released, it can be acquired by another thread.
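The same acquire/run/release idea can be sketched in shell (an illustration, not NGS): flock(1) from util-linux serializes the read-modify-write on a counter file, so no increments are lost:

```shell
#!/bin/bash
# 10 background jobs, each incrementing a shared counter 10 times.
counter=$(mktemp)
echo 0 > "$counter"
for t in $(seq 10); do
    (
        for j in $(seq 10); do
            # flock holds an exclusive lock on the file around the
            # whole fetch-increment-store, like l.acquire() above.
            flock "$counter" bash -c 'echo $(( $(cat "$1") + 1 )) > "$1"' _ "$counter"
        done
    ) &
done
wait
result=$(cat "$counter")
rm -f "$counter"
echo "$result"   # prints: 100
```

Dropping flock from the inner command reproduces the lost-update race described above: the total usually comes out below 100.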

Any locking mechanism must be based on an atomic hardware operation such as test-and-set or compare-and-swap. Trying to come up with your own lock implementation based on plain code will not work; you will only be shifting the race condition from one place to another.


Watch out for race conditions; they are a common problem when using threads. Use locks to avoid them. For best performance, the code run while holding a lock should be as small as possible.

See also:

  1. Concurrent data structures
  2. Message Passing Concurrency / Message passing
  3. Event Loop Concurrency

Full incorrect and correct code:

Tips for beginning systems and software engineers


From time to time I’m toying with the idea of giving a lecture to newcomers in the IT industry (systems or software engineers). Here are some of the points I would include in it:

Human factor


Software bugs are inevitable. Just accept it. Thinking that you will get any significant piece of code right the first time is statistically wrong. The correct assumption is that bugs, of various kinds, will be there. The answer to this problem is a correct development process. The process should include:

  1. Automated tests – minimize the chance of bugs getting into the production environment. There will be bugs not covered by tests, and the production environment usually differs in some ways from the testing environment, so some bugs will make their way to production; this is why monitoring is needed. Automated tests help combat the following problems:
    1. Slower development because one becomes hesitant to make changes and has to test manually.
    2. More bugs as manual tests are not as good.
    3. Bad code as it will less likely be refactored.
  2. Gradual deployment of new versions.
  3. Monitoring – detect errors/bugs in running software. Goes well with gradual deployment.
  4. Metrics – detect performance issues and errors in running code.
  5. Automated healing / rollback – tricky technique, use it carefully. Example: the system detects that the last deployed version of the application causes increased latency (or error rate) and automatically rolls back to a previous known-good version.

The aim of a correct process is to minimize the number of bugs that reach the production environment and to minimize the duration and impact of those that do.

Ready-made solution pitfall

You will read articles and blog posts stating that problem P has S as its best solution. Your circumstances may seem similar to what other people deal with, but in reality every situation is unique. Think twice before deciding that problem P is exactly the problem you are solving and whether solution S is applicable and best for your situation.

Related: see Hypes section below.

Ease vs simplicity confusion

Simple: a line of code that does something very small and specific.

Easy: a line of code that does a lot by calling a framework function causing thousands of lines of code to be executed.

Hypes

The IT industry is very hype-prone. Companies develop products for the industry: the bigger the budget, the more hype will be created around such a product in order to sell it or drive adoption. Even when a company provides you with a free product, it may very well still profit in some indirect way, so there is still a commercial interest. Remember that such hype has nothing to do with how good the product is, and especially with how well it fits as a solution to the specific problem at hand. Note that advertisements usually do not include a “not for” section.

Detecting a product hype: it’s “cool”, it’s all over the news sites and blogs, and the impression is that everyone already uses it and you are a laggard who does not understand how cool it is.

Detecting a product you might want to use: friends who use the product tell you it’s good or the best alternative, and/or it matches the criteria you thought of before searching for the product.

Quoting Avishai Ish-Shalom:

Here’s how I recognise hypsters and hyped technology:
“it solves everything! it has no disadvantages!”
Here’s how I recognise experience and serious technology:
“it’s good for X but can also do Y to a degree, but beware of T, Z”

See also: Prove your tool is the right choice


Be a skeptic

When considering doing a task, assume the following (unrelated to each other) points:

  1. Your task will take more time and effort than you estimated.
  2. The resulting code will make the system much more complex and will be a nightmare to maintain.
  3. The feature will rarely or never be used. That is often the case.

Professionals usually avoid doing tasks which are “nice to have” and are not really required. Being a skeptic usually has almost no penalty when you are wrong but can save a lot of time and effort when you are right.


Code duplication

Don’t duplicate code (copy+paste):

  1. Duplicated code means more maintenance.
  2. Duplicated code may some day be updated in one place and not the other, causing bugs.
  3. You will be looked down upon by senior developers as this is a big no-no.
  4. This rule has rare exceptions, use your judgment. If not sure, don’t duplicate.

Code reuse

Do it. When using any existing code, whether it’s a function, program, utility, library or framework, take the time to read the documentation first. Understand its functionality, limitations and architecture. It is a very good investment of your time. Alternatively, you can learn this the hard way, after several times discovering that the particular code is not what you need. Such a mismatch usually requires either a workaround or picking another library, function or tool.

Code style

Pick one style and stick to it. When a project has several styling conventions for some reason, the local convention wins. Example: all files use tabs except the file you are editing, which uses spaces; in this case use spaces. Inconsistent code style causes higher maintenance costs and developer frustration.

System complexity and code complexity

Strive for simplicity. More complex systems:

  1. Are harder to maintain
  2. Are harder to add new code to
  3. Have more bugs
  4. Have bugs that are harder to find

Complex and complicated are different things. A system or code might need to be complex when solving a complex problem. Complication is an unnecessary byproduct of bad software architecture, bad design or bad implementation. The more professional the programmer, the simpler and cleaner the produced code.


Automation

The benefits of automation are:

  1. Avoiding boring repetitive tasks
  2. Minimizing chances of mistakes (common in manual tasks)
  3. Describing the task for your future self and others
  4. Allowing others to do the task easily
  5. Increased productivity

Important defaults

Here are important defaults that can save you many tears. Deviating from them requires justification.

  1. UTF-8 for character encoding
  2. For servers: UTC time zone

Knowledge and proficiency

Theoretical knowledge

A must:

  1. Big O notation.
  2. Common data structures and how they are implemented.
  3. Common algorithms and their time and space complexity.
  4. What a CPU does, which opcodes exist for a CPU of your choice.
  5. Compilers and compilation.
  6. Interpreters.
  7. What a kernel does, and its main system calls.

Practical knowledge

A must:

  1. Learn at least 3 very different programming languages. I’d suggest C, Lisp and Python. Chances are you will not find a Lisp job, but it will surely make you a better programmer.
  2. How the Internet works: IP, UDP, TCP, routing, BGP, NAT, DNS, HTTP, HTTPS, SMTP, load balancing, browsers. Pick one of the above and read an RFC that describes it.
  3. Cloud concepts

Nice to have:

  1. HTML / Javascript (reading the spec would be very nice)
  2. Character encodings and UTF-8 in particular


Become proficient with your tools if you want to be professional (read: work smarter and faster, and be paid more). Imagine that you edit text for just a few hours a day as part of your job. Invest a few days in learning more editor commands to become 10 or 20 percent more productive; this pays off quickly. Continue investing your time smartly by learning the bits that will help you the most.

Programming languages are among the tools you will be using. One of the best investments of your time is a deeper understanding of your language and its tools.

Never stop learning! Among other benefits, this should improve your learning skills. This is one of the important skills you need to do your job.

See also: discussion about this post on Hacker News.

That was my opinion. Would you add any other points?

Bashing bash – undefined variables

This is the second post in the “bashing bash” series. The aim of the series is to highlight problems in bash and convince systems and software engineers to help build a better alternative.

The problem

What does the following command do?

rm -rf $mydir/

It looks like the author wants to delete the $mydir directory and everything in it. In reality it may do unexpected things because of the missing quotes. The rant about quotes is in the previous post; this post is about yet another issue.

The correct commands should be:

set -u
rm -rf "$mydir/"

The important thing here is set -u . Without it, when $mydir is undefined for some reason, such as a bug in the code preceding the rm command, there is a chance of bricking the machine: an undefined variable expands to an empty string, so the command is silently expanded to

rm -rf /


While more experienced engineers usually put set -eu at the beginning of a script, the need to remember this declaration is a big trap for everyone else.
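To see the difference without risking anything, compare the two expansions below (only echo is used; $mydir is deliberately left undefined):

```shell
#!/bin/bash
# Without set -u: the undefined $mydir expands to an empty string.
out1=$(bash -c 'echo "rm -rf $mydir/"')
echo "$out1"    # prints: rm -rf /

# With set -u: bash aborts with "mydir: unbound variable" instead.
out2=$(bash -c 'set -u; echo "rm -rf $mydir/"' 2>/dev/null || echo "caught: unbound variable")
echo "$out2"    # prints: caught: unbound variable
```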

Side note: you could ask why the original command has a trailing slash. The trailing slash is common and is used to signify a directory. While to the best of my knowledge rm works the same without the slash, some commands are actually more correct with it. For example, cp myfile mydir/ copies the file into the directory if it exists and fails if it doesn’t. On the other hand, cp myfile mydir behaves the same if the directory exists but creates a file named mydir if there is no such directory or file, which was not intended. Other commands, such as rsync, also behave differently with and without the slash. So it is common to use the slash.

See also: the set builtin – bash options

The suggested solution

In NGS, any use of an undefined variable is an exception.

ngs -e 'echo(a)'

It’s going to look prettier in the future, but even in the current implementation you have all the information about what happened:

========= Uncaught exception of type 'GlobalNotFound' =========
====== Exception of type 'GlobalNotFound' ======
=== [ dump name ] ===
* string(len=1) a
=== [ dump index ] ===
* int 321
=== [ backtrace ] ===
[Frame #0] /etc/ngs/bootstrap.ngs:156:1 - 156:10 [in <anonymous>]
[Frame #1] /etc/ngs/bootstrap.ngs:152:17 - 152:29 [in bootstrap]
[Frame #2] <command line -e switch>:1:8 - 1:9 [in <anonymous>]

While bash options probably have a historical justification, a new language should not have such a mechanism. It complicates things a lot; in addition, the programmer must always be aware of which options are currently in effect.

Please help build a better alternative

Go to the NGS project and contribute some code.

Bashing bash – variable substitution

This is the first post in the “bashing bash” series, which highlights problems in bash and aims to convince systems and software engineers to help build a better alternative.

Bash is a very powerful and useful tool that does a better job than many other shells and programming languages when used for its intended tasks. Still, it’s hard to believe that, decades later, we cannot do better when writing software.

The problem

What does the following command do?

cp $src_file $dst_file

One might think it copies the given file to the specified destination. Looking at the code, we can say that was the intention. What would actually happen? It cannot be known from the line above: $src_file and $dst_file each expand to anywhere from zero to N arguments, so unexpected things could happen. The correct command would be

cp "$src_file" "$dst_file"

Forgetting the quotes, or leaving them out on the assumption that $src_file and $dst_file will always contain one bash word each (expanding to exactly one argument), is dangerous.

Quoting everything, on the other hand, makes the code cluttered.
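The danger is easy to demonstrate with a value containing a space (a minimal sketch; the file name is made up). set -- places the expansion into the positional parameters so we can count the resulting arguments:

```shell
#!/bin/bash
src_file="my file.txt"

set -- $src_file       # unquoted: word splitting produces two arguments
unquoted=$#
echo "$unquoted"       # prints: 2

set -- "$src_file"     # quoted: exactly one argument, always
quoted=$#
echo "$quoted"         # prints: 1
```

With cp, the unquoted version would be seen as two separate source/destination words instead of one file name.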

The suggested solution

In NGS, $var expands to exactly one argument, similar to "$var" in bash. The new syntax, $*var, consistent with similar syntax in other parts of NGS, expands to zero to N arguments.

Please help build a better alternative

Go to the NGS project and contribute some code.

Israeli Banks Web Security Mini Survey – 2016



I have used the Qualys HTTPS checker tool to survey Israeli banks and a few reference sites. The main points are summarized in the table below.

I did no “hacking” nor “cracking” nor break-in attempts.

I am not a security specialist. I just have some basic understanding of security.

The list of banks is from the Banking in Israel article on Wikipedia.

Comparison points

  1. SSL3 – insecure, old protocol; should not have been used since June 2015.
  2. RC4 – unsupported by recent versions of major browsers since January 2016 because it is considered insecure. Deprecation started in 2015.
  3. SHA256 certificate – as opposed to a deprecated SHA1 certificate.
  4. TLS 1.2 – the recommended version of TLS, introduced in 2008; plenty of time to implement, one would think… The most important point in my opinion (and in Qualys’, judging by the ratings).
  5. Forward secrecy – protocols supporting forward secrecy protect your current sessions, which are probably recorded by the NSA and others, from being decrypted later if the server is compromised. A site gets “yes” if some of the protocols one can use to connect to it support forward secrecy.
  6. Qualys overall rating

Note that the presence of SSL3 or RC4 is not a problem for up-to-date browsers, which simply will not use them. It only enables insecure connections for older browsers (where, in some cases, the alternative would be no connection at all).


Web site                            SSL3 (bad)  RC4 (bad)  SHA256 certificate  TLS 1.2  Forward secrecy  Qualys rating
Hapoalim                            no          no         yes                 no       no               C
Leumi                               no          no         yes                 no       no               C
Discount                            no          no         yes                 yes      no               A-
Mizrahi Tfahot                      no          no         yes                 yes      partial          A-
First International Bank of Israel  no          yes        no                  yes      no               C
Gmail                               yes         yes        no                  yes      yes              B
Yahoo mail                          no          no         yes                 yes      yes              A
Facebook                            no          yes        yes                 yes      yes              B
Bank of America                     no          no         yes                 yes      no               A-

Opinion / Rant

Banks that do not support TLS 1.2 should shut down their web sites, their heads of security along with their bosses should commit seppuku, and the banks should be closed. Do you think banking information security is less important than email or Facebook? Maybe it’s a manifestation of the “duopoly of Hapoalim and Leumi”?

Banks that do not support forward secrecy – it’s about damn time!

When one of my clients asked me to improve HTTPS security (when it became important), the rating went from C to A in about half a day of work across several Nginx and ELB endpoints. Yes, a bank has more complex security and more variety in client types, but it also has a security team, not one part-time operations guy. The security situation is outrageous.

Most JQ you will ever need

I’ve been using jq (think sed/awk/grep for JSON) for a while now. From time to time, people ask me how to do this or that using jq. Most of the questions can be distilled into these two cases:

Show tabular data

Given an array of objects, show it in a tabular, human-readable format.

Sample jq input: output of aws cloudformation describe-stacks:

{
    "Stacks": [
        {
            "DisableRollback": false,
            "StackStatus": "CREATE_COMPLETE",
            "Tags": [],
            "CreationTime": "2016-05-10T14:16:06.573Z",
            "StackName": "CENSORED",
            "Description": "CENSORED",
            "NotificationARNs": [],
            "StackId": "arn:aws:cloudformation:CENSORED",
            "Parameters": [

To show this in a human-readable form you could pipe it to

jq -r '.Stacks[] | "\(.StackName) \(.StackStatus) \(.CreationTime) \(.Tags)"'


  1. -r makes the output “raw”. Such output does not include enclosing quotes around each of the strings we generate.
  2. .Stacks[] | – for each element of the Stacks array, evaluate and output the expression that follows the vertical bar (an inexact explanation, but it fits this case). When evaluating the expression to the right, the context is set to one element of the array at a time.
  3. "..." – a string.
  4. Inside the string, \(.StackName) – the .StackName attribute of the current element of the Stacks array.

The output columns will not be visually aligned. I suggest solving the alignment issue by introducing a special column-separator character such as % and then using the column command to align the columns visually. Full suggested solution:

aws cloudformation describe-stacks | \
jq -r '.Stacks[] | "\(.StackName)%\(.StackStatus)%\(.CreationTime)%\(.Tags)"' | \
column -t -s '%'

Note: I don’t have a good solution for the case when more than one or two tags are present. The output will not look very good.

More AWS-specific JSON formatting is in my .bashrc_aws.

Do something with each object

Given an array of objects, iterate over it, accessing fields of each element.

Continuing the example above, assume you want to process each stack somehow.

aws cloudformation describe-stacks | \
jq -rc '.Stacks[]' | while IFS='' read -r stack; do
    name=$(echo "$stack" | jq -r .StackName)
    status=$(echo "$stack" | jq -r .StackStatus)
    echo "+ $name ($status)"
    # do your processing here
done

The only new thing here is the -c switch, for “compact”: it causes each resulting JSON document representing a stack to be output on one line instead of several.

UPDATE: AWS tags handling

I was dealing with instances that had several tags and it was very annoying, so using the jq manual and Google I arrived at this:

aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | "\(.InstanceId)%\(if .Tags and ([.Tags[] | select ( .Key == "Name" )] != []) then .Tags[] | select ( .Key == "Name" ) | .Value else "-" end)%\(.KeyName)%\(if .PublicIpAddress then .PublicIpAddress else "-" end)%\(.LaunchTime)%\(.InstanceType)%\(.State.Name)%\(if .Tags then [.Tags[] | select( .Key != "Name") |"\(.Key)=\(.Value)"] | join(",") else "-" end)"' | sed 's/.000Z//g' | column -t -s '%'

Explanation of the new parts:

  1. The if COND_EXPR then VAL1 else VAL2 end construct returns VAL1 or VAL2 depending on whether COND_EXPR evaluates to true or false. Works as expected.
  2. Why is if .Tags needed? I want to display something (-) when there are no tags, to keep the columns aligned. Even if you didn’t want to display anything special there, you would need this if .Tags anyway. Why? Thank AWS for the convoluted logic around tags! OK, you made tags (which should be a key-value map) a list, but then, if the list is empty, it’s suddenly not a list anymore, it’s null! I guess this comes from Java developers trying to save some memory… and causing great suffering for the users. The .Tags[] expression fails if .Tags is null.
  3. [...] builds an array.
  4. select(COND_EXPR) filters the elements so that only those for which COND_EXPR evaluates to true are present in its output.
  5. join("STR") – predictably joins the elements of the array using the given separator STR.
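The same building blocks in isolation, on a hand-made document (the tag values are made up):

```shell
# Render every tag except Name as Key=Value, joined by commas:
tags_json='{"Tags":[{"Key":"Name","Value":"web1"},{"Key":"env","Value":"prd"}]}'
out=$(printf '%s' "$tags_json" | jq -r \
  'if .Tags then [.Tags[] | select(.Key != "Name") | "\(.Key)=\(.Value)"] | join(",") else "-" end')
echo "$out"   # prints: env=prd
```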

I think that all of my instances that have tags have the Name tag. You would need to adjust the code above if that’s not the case on your side. If I ever hit this problem and fix it, I’ll post an update here.

Handling large or complex JSONs

When your JSON is large, it is sometimes difficult to understand its structure. The tool show-struct shows which jq paths exist in the given JSON and summarizes the data present at each path.

For the example above, the output of aws cloudformation describe-stacks piped through show-struct is

.Stacks -- (Array of 10 elements)
.Stacks[].CreationTime -- 2016-04-18T08:55:34.734Z .. 2016-05-10T14:16:06.573Z (10 unique values)
.Stacks[].Description -- CENSORED1 .. CENSORED2 (7 unique values)
.Stacks[].DisableRollback -- False
.Stacks[].LastUpdatedTime -- 2016-04-24T14:08:22.559Z .. 2016-05-10T10:09:08.779Z (7 unique values)
.Stacks[].NotificationARNs -- (Array of 0 elements)
.Stacks[].Outputs -- (Array of 1 elements)
.Stacks[].Outputs[].Description -- URL of the primary instance
.Stacks[].Outputs[].OutputKey -- CENSORED3 .. CENSORED4 (2 unique values)
.Stacks[].Outputs[].OutputValue -- CENSORED5 .. CENSORED6 (2 unique values)
.Stacks[].Parameters -- (Array of 3 elements) .. (Array of 5 elements) (2 unique values)
.Stacks[].Parameters[].ParameterKey -- AMI .. VpcId (11 unique values)
.Stacks[].Parameters[].ParameterValue --  .. CENSORED7 (13 unique values)
.Stacks[].StackId -- arn:aws:cloudformation:CENSORED8
.Stacks[].StackName -- CENSORED9 .. CENSORED10 (10 unique values)
.Stacks[].StackStatus -- CREATE_COMPLETE .. UPDATE_COMPLETE (2 unique values)
.Stacks[].Tags -- (Array of 0 elements)

Extras from a friend

AMIs with joined EBS snapshots list and tags

Demonstrates AWS tags values and join for normalization.

aws ec2 describe-images --owner self --output json | jq -r '.Images[] | "\(if .Tags and ([.Tags[] | select ( .Key == "origin" )] != []) then .Tags[] | select ( .Key == "origin" ) | .Value else "-" end)%\(.ImageId)%\(.Name)%\(if .Tags and ([.Tags[] | select ( .Key == "stamp" )] != []) then .Tags[] | select ( .Key == "stamp" ) | .Value else "-" end)%\(.State)%\(.BlockDeviceMappings | map(select(.Ebs.SnapshotId).Ebs.SnapshotId) | join(",") | if .=="" then "-" else . end)"' | (echo "Source%AMI%Desc%Stamp%State%Snaps"; cat;) | column -s % -t


Source   AMI           Desc                    Stamp       State      Snaps
serv001  ami-11111111  serv001_20171031054504  1509428704  available  snap-1111111111111111
-        ami-22222222  prd-db002-20170911      -           available  snap-2222222222222222
app333   ami-33333333  app333_20171031054504   1509428704  available  snap-4444444444444444,snap-7777777777777777

Load-balancers with de-normalized instances

Demonstrates vars for de-normalization


aws elb describe-load-balancers --output json | jq -r '.LoadBalancerDescriptions[] | . as $l | .Instances[] as $i | [$l.LoadBalancerName] + [$i.InstanceId] | @csv' | sed 's#"##g; s#,#\t#g;' | (echo -e "LB\tInstance"; cat;) | column -t


LB            Instance
wordpress-lb  i-11111111
app-lb        i-11111111
webapp-prd    i-33333333
webapp-prd    i-99999999

External links

  1. StackOverflow – How to convert arbitrary simple JSON to CSV using jq?

Hope this helps.

Please let me know in the comments if there is another common use case that should be added here.

AWS CLI inhuman output and what I suggest

The problem

If you use AWS you probably use AWS CLI. This tool gives you control over your resources from the command line. AWS CLI can output the results of your commands in several formats. None of the formats seem to be useful for humans.

AWS CLI can output the data in the following formats:

  1. JSON, the default – good for processing with jq or programming languages, definitely not good for a human looking at a list of more than a few instances.
  2. text – good for processing with command line tools such as awk, very hard to read for a human.
  3. table – WTF? Probably meant for humans. Cluttered and difficult for a human to process. See the screenshot:
AWS CLI. List of instances. --output table. Is it for humans?

You can make it look better, but what you see above is in no way a reasonable default, and making it look prettier should not be as unpleasant as the procedure described in the user guide towards the end of the page.

My takes on solving this

Take 1 – old script

I have a script which uses an outdated Ruby AWS client that does not work with eu-central regions because it’s old. It was originally built for EC2 Classic, so it fails to show VPC security groups.


If the newer-regions and VPC security groups issues were solved, this would be a fine tool for human output.

Take 2 – JQ

While the old script is not fixed, I have this temporary solution: a jq transformation of the AWS CLI output JSON (from my .bashrc_aws). The output is one line per instance.

DESC() {
    aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | "\(.InstanceId)%\(.KeyName)%\(.PublicIpAddress)%\(.LaunchTime)%\(.InstanceType)%\(.State.Name)%\(.Tags)"' | sed 's/.000Z//g' | column -t -s '%'
}

It works well when there are no tags or only a small number of tags per instance. With a few tags on an instance, the line becomes too long and wraps. I reformatted the sample output here for your convenience; the original is one logical line that wraps and takes almost two lines in the terminal window.
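One way to keep the Tags column narrow (a sketch, not something from my .bashrc_aws) is to collapse the tag objects into key=value pairs before printing. The instance fragment below is made up:

```shell
# A made-up instance fragment with two tags.
inst='{"InstanceId":"i-11111111",
  "Tags":[{"Key":"Name","Value":"web01"},{"Key":"env","Value":"prd"}]}'

# Collapse [{"Key":...,"Value":...},...] into "Name=web01,env=prd",
# which is much narrower than the default rendering of the array.
tagline=$(echo "$inst" \
  | jq -r '"\(.InstanceId)%\(.Tags | map("\(.Key)=\(.Value)") | join(","))"')
echo "$tagline"   # i-11111111%Name=web01,env=prd
```

The same map/join expression can replace the `\(.Tags)` part of the DESC function above.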


Take 3 – ec2din.ngs

This is one of the demo scripts in the NGS project that I’m working on.

The column after state (“running”) is the key name.

ec2din.ngs – human output

It would be better if the actual available screen width were taken into account, so there would be no need to arbitrarily trim the security groups’ names and the tags’ keys and values (the ... in the screenshot above).

How I think it should be solved

NGS (the new shell I’m working on) implements the Table type, which is used to produce the layout you see in the “ec2din.ngs – human output” screenshot. Table is part of the NGS standard library. As you can see, it’s already useful in its current state. There is plenty of room for improvement, and the plan is to continue improving Table beyond simply “useful”.

This is the right direction with the following advantages:

  1. The script that needs to output a table (for example ec2din.ngs) does not deal with its layout at all, making it smaller and more focused on the data.
  2. The script can set defaults but does not control which columns are actually displayed and in which order. The Table type handles that using another NGS facility: config, which currently gets the configuration from environment variables but should be able to read it from files and maybe other sources in the future.
  3. Output of tabular data is a very common task so factoring it out to a library has all the advantages of code deduplication.

I would also like the shell to automatically detect, heuristically or otherwise, which data comes out of the programs that you run and display it accordingly. Most of the data that is manipulated in shells is tabular and could be displayed appropriately, roughly as ec2din.ngs displays it. Maybe ec2din.ngs will become unnecessary one day.

In my dream, the output is semantic so that the shell knows what’s on the screen and allows you to navigate for example to i-xxxxx part, press a key and choose which operation you would like to perform on the instance. When you pick an operation, the shell constructs a command for such operation and executes it exactly as if you typed it in (so it’s in history and all other CLI advantages over GUI apply).

Common Sense Movement – where is it?


Let’s look at systems and software engineering.

We hear about all kinds of “cool” methodologies, ideas, frameworks: DevOps, Scrum, Agile, Kanban, Continuous Integration, Continuous Delivery, Puppet, Chef, Ansible … and combinations of these.

Why do you hear about all this cool stuff all the time? Why are there hypes all the time? Simple: money. Another hype is another opportunity to sell you something [at a higher price]: products, support, certifications, guidance. You don’t necessarily need these things, but they definitely would like to sell you all of it.

I’m starting Common Sense Movement

This humorous thought crossed my mind yesterday. You know, just to balance the situation a bit. Then I realized why you don’t hear much about the shiny cool “common sense” methodology: money.

You can’t sell a “common sense” certificate. That’s because if someone really has some common sense, he/she is not going to pay you for this crap, and the only certificate you could sell would be “Failed Common Sense Practitioner”. Actually, I doubt you can sell any crap to common sense practitioners at all. That’s a problem. So, there is no Common Sense Movement for you 😦

I really would like the hype waves to stop so we could just do our work in peace. Meanwhile, when you are approached by a client or a boss with yet another “We need tool X” or “We need technology Y”, you should really answer with “Prove that you need it and that it’s the optimal use of our money and time compared to the alternatives”, no matter how new, shiny, or cool that thing is.

Prove your tool is the right choice

I see two common approaches at work when it comes to choosing tools.

Photo of different tools
How do you choose?


I have heard about this new shiny cool framework/library/tool, let’s use it!

It might be a good approach when you want to learn something in your spare time; just don’t do this at work.


We have this problem and found a few alternative solutions (frameworks/libraries/tools) that might solve it. We will investigate and see if any of them fit our needs.

Note that depending on your situation, it might be the case when none of the existing tools will be a good fit for your project.

It might be tempting to look at existing solutions in a positive light, thinking “our problem is not unique, there are tools solving it already”. The problem might appear to be similar to the problems that these tools solve, but remember to look at and focus on what’s unique to your circumstances.

Suggested approach

You should prove that the points below are wrong. Assume that the tool you are evaluating:

  1. Does not cover your needs.
  2. Has crappy/new/unstable code.
  3. Has no community so getting answers to your questions or help is a nightmare.
  4. Has none or very costly official support.
  5. Has a steep learning curve.
  6. Will be harder to maintain than any other alternative or your own code.
  7. Will need modification, and modifying it will be tough.
  8. Provides so little that it’s not worth another external dependency.
  9. Will cause lock-in with very high switching costs. (You can affect this one to some degree with your design).
  10. Has a big company behind it which
    1. had a huge investment and only cares about things like ROI and market share, not about you, even if it did at some point in the past.
    2. has more interesting markets than you and optimizes the product for these markets, making it not a good fit for your situation.
    3. generates hype making it all look rosy even when the product has grave flaws.
    4. makes it look like everybody uses this tool.
    5. makes it feel like the next best thing.
  11. Has a big alliance behind it with many interests of different partners with zero intersection with your interests.
  12. Most places that use this tool suffer immensely but either don’t talk about it, or you don’t hear about it for other reasons. Dig deeper to find such information.

Calling some friends and asking them about the tool is very good but note that your situation is different so most of the points listed above still stand.

I guess that when you start from a positive assumption when evaluating a tool, you might fall into the confirmation bias trap. Beware!