“Use Dumb Shell, don’t Reinvent the Wheel”

Opening Rant

You don’t hear one developer saying “Just use Notepad” to a colleague with argumentation that goes roughly like this:

Why are you using this horrible Visual Studio Code? It has built-in debugger! No!

JetBrains IDEs? No! They do too much! They are so into the code!

Vim? Emacs? Not pure enough! Who needs that stupid syntax highlighting?

Keep text editing pure! Any semantic understanding by the text editor is undesirable, other programs should handle that. You don’t want to complicate the text editor.

Developers are not saying that because user experience and productivity matter. Yet, “Use Dumb Shell” is considered to be an acceptable opinion. Is that so common that people fall on their heads so hard (alternatively, did not give it any thought)? WTF?

The solution (shell) should be as simple as possible but not simpler than possible. Current shells are simpler than required by good user experience. Wrong trade-off. Keeping something simple is important but not more important than the outcomes.

Source: https://www.flickr.com/photos/toddle_email_newsletters/15413603567/
Image is a link to http://www.workcompass.com/

Additional food for thought:

  1. Why use a car when bicycle is so much simpler?
  2. Why use electricity when fire is so much simpler?
  3. Why have water in your house when a wells are so much simpler?

Background

I was doing consulting. The usual suspects: AWS, bash, Python, Puppet, Chef. Got to Terraform later. I had and I am still having subpar experiences with these tools. Anything I wanted to do, was overly burdensome, complicated and full of pitfalls.

Since I can’t attempt to fix everything, I picked the worst offender and started working on the alternative programming language and shell combo. The motivating opinion is that Ops have no good programming language nor adequate shell.

The absence of good programming language for ops was covered in another post. In this post I will cover some of the things that are wrong with the interactive shell.

The Shell

The dominant player is bash. It didn’t change much for decades: you type commands and get a dump of text on your screen. Most of the alternatives are essentially the same in this regard, for decades.

Is this because of the brilliant design? I would ask: which design? This? Quoting:

I wrote quite complex shell scripts and my first suggestion is “don’t”. The reason is that is fairly easy to make a small mistake that hinders your script, or even make it dangerous.

The “Dumb Shell” Approach

In this post I would like to address common thought that I hear from people regarding Next Generation Shell, a new programming language and a shell that I’m working on. Note that other shells which are more advanced than POSIX shells also get this. Quoting @cup from lobste.rs:

Wouldnt it be better to just have a dumb shell, that can call programs to do heavy lifting (read: programming languages). This way you have a “division of labor”. Shell works best for launching executables, and programming languages work best for handling data structures and algorithms.

No, it would not. I refuse to accept under-powered tools.

Dumbness is Fundamental Flaw

The “dumb shell” has no semantic understanding and doesn’t care about programs’ inputs nor outputs. Let’s see how it plays out.

Today, “Understanding” of programs’ inputs is covered by completion. Completion was added because “dumb shell” had horrible user experience. It’s slightly better now when the shell “understands” programs’ input to some degree. To some people completion is a scope creep. I think of it as better user experience and productivity gain.

“Understanding” of programs’ outputs? We are not there yet. It also seems that interacting with objects on the screen is too novel of an idea for the shell. Considering how much time this idea is out there: WTF?

Let’s see how this “dumbness” manifests as bad user experience even at the very basic, “intended” functionality:

Programs’ Output – Size

Do you know of any real world scenario when a human supposed to go over 10K lines on the screen? I mean just sit there and read it. Let me know. I’ve never seen such use case.

The shell is dumb, the shell “does not intervene” in programs’ outputs. Sounds good until you get unlimited number of lines dumped on your screen.

“Should have used less” you think later. Right. What if you forgot? The buffer is now filled with useless output and you can’t see outputs of previous programs. Are you being punished? No, just nobody cared about the UX. Alternatively, “it would be to complicated to implement”.

Programs’ Output – All Mixed

  1. Want to know what’s on your screen is stdout and what is stderr? Well… you can’t. Your shell is dumb, it doesn’t deal with things like that.
  2. Want to know from which program the output came from? Nope. Some programs cope with that to some degree by prepending their name to error messages: ls xxx gives you ls: xxx: No such file or directory. What a wonderful strategy! Keep the shell dumb and push the burden to all the programs.
  3. You can’t type because some background job is continuing to dump text on the screen where you are trying to work? Too bad, should have used redirection because guess what … you shell doesn’t handle that either… and you can’t add redirection after the program is running; again not shell’s business.

Programs’ Output – Semantic Understanding

You just typed aws ec2 describe instances --filters ... and now you have some output.

You now see on your screen instance you would like to stop. The ID of the instance is right in front of your face. Now you type aws ec2 stop-instances --instance-ids. You would like to append the instance ID that you see on the screen. Nope. Your shell doesn’t do that. Too dumb. Select with the mouse and paste, because f*ck you!

Side note: amazing AWS engineers did not include any human readable output format so you get JSON dumped on your screen (or any other format which is still non-human-compatible).

Let’s imagine for a moment that the command output had some semantic meaning to the shell.

  1. The shell would display the output as a table.
  2. The table would be interactive (interactive output, what a heresy!) and one could navigate with arrow keys and have a shortcut for copy/paste the current cell value to the command line (for completion).
  3. You could interact with the objects in the table with the mouse (very new concept, another heresy for the shell).
  4. How about instead of typing aws ec2 stop-instances --instance-ids you navigate to the correct line, press enter, choose “stop” from the menu and the command is constructed for you? aws ec2 stop-instances --instance-ids i-123... amazing, ha? Well, your shell can’t do that.

Meaning, do you speak it mo***er?

How about after performing operations using the UI you would get as per your choice one of the below snippets which would re-create the operation:

  1. CLI commands
  2. CloudFormation tempalte
  3. Terraform “code”

Solution: UI for the Shell

Suppose I agree for a second, what do you suggest?

https://github.com/ngs-lang/ngs/wiki/UI-Design

I personally don’t see how the described features could be implemented as external programs, keeping the shell “dumb”.

We Can Do Better Today

The reality has changed. What was once amazing is subpar by today’s standards. The world outside of the shell moved forward while the shell stayed almost the same. Brilliant design? Brilliant what?

Let’s move this industry together from the stone age of bash shell to the bronze age of something a bit less subpar – Next Generation Shell.

Closing Rant

Imaginary UNIX people:

We wanted to separate things because they are semantically different so we split the things into stdout and stderr. Well… stderr was is actually for everything that is not stdout.

One bit of metadata (stdout vs stderr) for semantic meaning of the output should be enough for everyone forever. Well… at least it’s simple for us to implement.


Update: discussion on lobste.rs

JQ is a symptom

jq is a great tool. It does what bash can not – work with structured data. I use it. I would like not to use it.

In my opinion, working with structured data is such a basic thing that it makes much more sense to be handled by the language itself. I want my shell to be capable and I strongly disagree with the view that a shell “is not supposed to do that”. Shell is supposed to do whatever is needed to make my life easier. Handling structured data is one of these things.

If “shell is not supposed to do that”, by that logic, bash is not supposed to do anything except for running external commands and routing the data between them. Doesn’t it seem odd that bash does have builtin string manipulation then? Maybe bash shouldn’t have added associative arrays in version 4? … or arrays in version 2? How about if and while ? Maybe bash shouldn’t have them either?

woman-698943_640

jq is a symptom that bash can’t handle today’s reality: structured data. The world is increasingly more about APIs. APIs consume and return structured data. I do work with APIs from shell. Don’t you guys use AWS CLI or any other API that returns JSON?

The reality has changed. bash hasn’t. I’m working on bash alternative. Please help me with it. Or at least spread the word.

If you don’t like my project, join Elvish . Elvish is another shell that supports structured data.


Happy coding! Hope it’s not in bash.

Terraform 0.12 language looks bad

I was hoping that smart guys vs bad situation will have another outcome but Terraform language for version 0.12 looks bad… as languages of Puppet and Ansible.

I’m not saying that people that made Puppet and Ansible are not smart. It’s that we could learn from the mistakes they made… unless we don’t consider those being mistakes.

Puppet and Ansible went through very similar difficult situation. They have limited themselves to a declarative format and then they tried to accommodate the real life. Terraform has this situation right now.

The situation is:

  • Declarative format being used
  • People need something more powerful, like a programming language because … real life where conditionals, loops and data transformations make much more sense than working around declarative languages limitations.

Interestingly enough, they all did not switch to a proper programming language. Maybe because that would be at least partially admitting that the product should have been a library in the first place?

Terraform is actually in very crappy situation because even if they decide to expose everything as a library as the main interface, I don’t see people start using Go for “infrastructure as code”. Not as smooth as Ruby or Python anyway.

Happy coding, everyone!

Update (2018-07-21):

On a bit more positive note, the new splat operator looks like an improvement.

Update (2018-07-27):

Terraform looks even more like a “normal” language with Conditional Operator Improvements and null value. The conditional operator fixes previous oddities that it had.

Update (2018-08-02):

Terraform got type system. Looks powerful. Just need to see that Terraform does not evolve to Scala 🙂

Update (2018-08-11):

New template syntax brings more raw power. Looks good.

Update (2018-08-26):

  • HCL to JSON one-to-one mapping. When I read “having a clean 1:1 mapping between HCL and JSON, and ensuring every feature of HCL is supported in JSON” I immediately thought that there must be converting tools then… and was not disappointed 🙂 “In future versions of Terraform, we will also support native tooling to convert HCL to JSON and JSON to HCL cleanly (including comments)”
  • “Comments in JSON” – nice!

 

Terraform becomes a programming language

Declarative languages failure

Approach that in my eyes failed, again and again, is to start with your own declarative language and then with time grow the language. (SQL being among notable exceptions)

Puppet is the best example. map and each, added in Puppet 4.0.0 are, in my opinion, just two in a sea of evidence that the envisioned simple format has failed to handle the needs of the real world.

Ansible’s loop looks bad as the whole idea of making top levels of programs in YAML based syntax (and the rest in Python).

In my opinion, it makes more sense to create a language first and then libraries for it, not a library and then a language around it.

My hope for Terraform

I think Terraform guys are smart. Among other things, it manifests in implementing data sources. Data sources make Terraform much more flexible. I think it’s very clever.

Terraform, which started declarative, are now inventing their own programming language. They are going the way of Puppet and Ansible. I hope they can do better, in this awkward situation: there are quite a lot of constraints on the programming language because of the existing syntax and semantics.

Happy coding, everyone!

 

No, not everybody uses X

How likely are you to give the wrong answer when everybody in the group gives the wrong answer? More likely than without the group. That’s what Asch conformity experiments have proven.

Marketing people know that. So when you get the impression  that “everybody uses X”, please be aware that it can be intentional and maybe does not match the reality. It can be just a trick. Bloggers for example, have incentives to write about X (consultants that can make money if you adopt X).

analytics-3291738_640
Makers of X must continue to grow no matter what

I don’t want to accuse any specific firm or product in this post but I suspect the “coolest”, the most advertised and the most pushed down our throats products. There is no guarantee makers of X are interested in your success. They are surely interested in their own growth and success.

Hope this post increases your chances to survive next marketing attack just by being aware of yet another deceitful marketing technique.

 

How fucked is AWS CLI/API

In the 21st century, we can do significantly better than this crap AWS CLI. We just need to think from the users’ perspective for a change and not just dump on users whatever we were using internally or seen in a nightmare and then implemented.

thinking-3082823_640
Think from the users’ perspective

I’m working on a solution which is described towards the end of the post. It’s not ready yet but the (working btw) examples will convince the reader that “significantly better” is very possible… I hope.

Background

I’m building AWS library and a command line tool. Since I’m doing it for a shell-like language (NGS), the AWS library that I’m building uses AWS CLI. Frustration and anger are prominent feelings when working with AWS CLI. I will just list here some of the reasons behind that feeling.

furious-2514031_640
AWS CLI/API – WTF were they thinking?

Overall impression

Separate teams worked on different services and were not talking to each other. Then the something like this happened: “OK, we have these APIs here, let’s expose them. User experience? No, we don’t have time for that / we don’t care”.

Tags

Representing a map (key-value pairs) as an array ( "Tags": [ { "Key": "some tag name", "Value": "some tag value" }, ... ] ), as AWS CLI does is insane… but only when you think about the user (developer in this case) experience.

shocked-2681488_640
AWS Tags – no, not in this form!

Reservations

When listing EC2 instances using aws ec2 describe-instances, the data is organized as list of Reservations and each Reservation has list of instances. I’m using AWS for quite a few years and I never needed to do anything with Reservations but I did spent some time unwrapping again and again the interesting data (list of instances) from the unwanted layer of madness. It really feels like “OK, Internally we have Reservations and that’s how our API works, let’s just expose it as it is”.

annoyed-3126442_640
Who da fuck ever used Reservations?

Results data structure

AWS CLI usually (of course not always!) returns a map/hash at the top level of the result. The most interesting data would be called for example LoadBalancerDescriptions, or Vpcs, or Subnets, or … ensuring difficulty making generic tooling around AWS CLI. Thanks! Do you even use your own AWS CLI, Amazon?

Inconsistencies

beer-2370783_640
Kind of the same but not really

Security Groups

These are of course special. Think of aws ec2 create-security-group ...

  1. --group-name and --description are mandatory command line arguments. For now, I haven’t seen any other resource creation that requires both name and description.
  2. The command returns something like { "GroupId": "sg-903004f8" } which is unlike many other commands which return a hash with the properties of the newly created resource, not just the ID.

Elastic Load Balancer

Oh, I like this one. It’s unlike any other.

  1. The unique key by which you find a load balance is name, unlike other resources which use ids.
  2. Tags on load balancers work differently. When you list load balancers, you don’t get the tags with the list like you have when listing instances, subnets, vpcs, etc. You need to issue additional command: aws elb describe-tags --load-balancer-names ...
  3. aws elb describe-load-balancers – items in the returned list have field VPCId while other places usually name it VpcId .

Target groups

aws elbv2 deregister-targets --target-group-arn ...

Yes, this particular command and it’s “target group” friends use ARN, not a name (as ELB) and not an ID (like most EC2 resources).

DHCP options (sets)

(Haven’t used but considered so looked at documentation and here is what I found)

Example: aws ec2 create-dhcp-options --dhcp-configuration "Key=domain-name-servers,Values=10.2.5.1,10.2.5.2"

Yes, the create syntax is unlike other commands and looks like filter syntax. Instead of say --name-servers ns1 ns2 ... switch you have "Key=domain-name-servers,Values=" WTF?

Route 53

aws route53 list-hosted-zones does not return the tags. Here is an output of the command:

{
 "HostedZones": [
 {
 "Id": "/hostedzone/Z2V8OM9UJRMOVJ",
 "Name": "test1.com.",
 "CallerReference": "test1.com",
 "Config": {
 "PrivateZone": false
 },
 "ResourceRecordSetCount": 2
 },
 {
 "Id": "/hostedzone/Z3BM21F7GYXS7Y",
 "Name": "test2.com.",
 "CallerReference": "test2.com",
 "Config": {
 "PrivateZone": false
 },
 "ResourceRecordSetCount": 2
 }
 ]
}

Wanna get the tags? F*ck you! Here is the command: aws route53 list-tags-for-resources --resource-type hostedzone --resource-ids Z2V8OM9UJRMOVJ Z3BM21F7GYXS7Y . Get it? You are supposed to get the Id field from the list generated by list-hosted-zones, split it by slash and then use the last part as resource ids. Tagging zones also uses the rightmost part of id: aws route53 change-tags-for-resource --resource-type hostedzone --resource-id Z2V8OM9UJRMOVJ --add-tags Key=t1,Value=v11

… but that apparently was not enough differentiation 🙂 Check this out: aws elb add-tags --load-balancer-names NAME --tags TAGS vs aws route53 change-tags-for-resource --resource-type hostedzone --resource-id ID --add-tags TAGS and aws elb remove-tags --load-balancer-names NAME --tags TAGS vs aws route53 change-tags-for-resource --resource-type hostedzone --resource-id ID --remove-tag-keys TAGS . Trick question: on how many dimensions this is different? So wrong on so many levels 🙂

Here is another one: when you fetch records from a zone you use the full id, /hostedzone/Z2V8OM9UJRMOVJ , not Z2V8OM9UJRMOVJaws route53 list-resource-record-sets --hosted-zone-id /hostedzone/Z2V8OM9UJRMOVJ

 

(many more to come)

Expected comments and answers

feedback-2044700_640
Internet discussions are the best 😉

Just use Terraform

I might some day. For the cases I have in mind, for now I prefer a tool that:

  1. Is conceptually simpler (improved scripting, not something completely different)
  2. Doesn’t require a state file (I don’t want to explain to a client with two servers where the state file is and what it does)
  3. Can be easily used for ad-hoc listing and manipulation of the resources, even when these resources were not created/managed by the tool.
  4. Can stop and start instances (impossible with Terraform last time I checked a few months ago)

Just use CloudFormation

I don’t find it convenient. Also, see Terraform points above.

By the way, the phrases of the form “Just use X” are not appreciated.

You just shit on what other people do to push your tools

Yes, I reserve the right to shit on things. Here I mean specifically AWS CLI. Freedom of speech, you know… especially when:

  1. The product (AWS API) is technically subpar
  2. The creator of the product does not lack resources
  3. The product is used widely, wasting huge amounts of time of the users

If I’m investing my time to write my own tool purely out of frustration, I will definitely tell why I think the product is crap.

Is that just a rant or you have alternatives?

I’m working on it. The command line tool is called na (Ngs Aws). It’s a thin wrapper around declarative primitives library for AWS.

excited-3126450_640
There might be a solution to this madness!!!

The command line tool that I’m working on is not anywhere ready but here is a gist of what you can do with it:

# list vpcs
na vpc

# Find the default VPC
na vpc IsDefault:true

# List security groups of all vpcs which have tag "Name" "v1".
na vpc Name=v1 sg

# Idempotent operations, in this case deletion.
# No error when this group does not exist.
# Delete "test-group-name" in the given vpc(s).
na vpc Name=v1 sg GroupName:test-group-name DEL

# List all volumes which are attached to stopped instances
na i State:stopped vol

# Delete all volumes that are not attached to instances
na vol State:available DEL

# Stop all instances which are tagged as
# "role" "ocsp-proxy" and "env" "dev".
na i role=ocsp-proxy env=dev SET State:stopped
  1. Na never dumps huge amounts of data to your terminal. As a human, you will probably not be able to process it so when number of items (rows in a table actually) is above certain configurable threshold, you will see something like this:
    $ na vol 
    === Digest of 78 rows ===
    ...

    It will show how many unique values there are in each column, min and max value for each column. Thinking about displaying top 5 (or so) top and bottom values for each column.

  2. Na has concept of “related” resources so when you write something like na i State:stopped vol , it knows that the volumes you want to show are related to the instances that you mentioned. In this particular case, it means volumes attached to the instances.
  3. Note the consistency of what you see in output and arguments to CLI. If something is called “State” in the data structure returned by AWS CLI, it will be called “State” in the query, not “state” (“–state”).

I will be updating this post as things come along.

Just use X

feedback-2044700_640

Typical conversation on the Internet.

I’m having this situation, I’m trying to do blah using blah and it doesn’t work for me because blah. How do I proceed?

Invariably, one of the answers is

Just use X

And to that I would like to answer right now:

Go fuck yourself!

Your answer shows lack of thought and arrogance. In addition, chances are that X is a blogosphere-hyped tool or product. Here are some suggestions for next time, instead of “Just use X”:

  1. Is there a reason you are not using X? I had similar situation, tried to achieve what you are trying to achieve and had positive experience.
  2. I haven’t tried myself, but I heard about X which I think should solve your problem, you should probably take a look if you haven’t yet.

Hope this helps both sides of the discussion.


You are welcome to link to this post when you get “Just use X” response.