Why Next Generation Shell?

Background

I’m a systems engineer. The job that I’m doing is also called system, SRE, DevOps, production engineer, etc. I define my job as everything between “It works on my machine” of a developer and real life. Common tasks are setting up and maintaining cloud-based infrastructure: networking, compute, databases and other services. Other common tasks are setting up, configuring and maintaining everything inside a VM: disks+mounts, packages, configuration files, users, services. Additional aspects include monitoring, logging and graphing.

The problem

If we take specifically systems engineering tasks such as running a VM instance in a cloud, installing and running programs on a server and modifying configuration files, typical scripting (when no special tools are used) is done in either bash or Python/Ruby/Perl/Go.

Bash

The advantage of bash is that bash is domain specific. It is convenient for running external programs and files manipulation.

# Count lines in all *.c files in this directory and below
wc -l $(find . -name '*.c')

# Make sure my_file has the line my_content
echo my_content >my_file

# Run a process and capture the output
out=$(my_process)

The disadvantage of bash is horrible syntax, pitfalls everywhere, awkward error handling and many features one would expect from a programming language (such as data structures, named functions parameters, etc) are missing.

# Inconsistent, awkward syntax,
# result of keeping backwards compatibility
if something;    then ... fi
while something; do ... done

# Can remove / if MY_DIR is not defined
# unless in "set -u" mode
rm -rf "$MY_DIR/"

# Removes files "a" and "b" instead of "a b"
myfile="a b"
rm $myfile

# Silently ignores the error unless in "set -e" mode
my_script

# Function parameters can't be named, they are
# in $1, $2, ... or in $@ and $*
myfunc() {
  FILE="$1"
  OPTION_TO_ENABLE="$2"
  ...
}

Leave bash alone, it was not intended for programming, don’t do anything in bash, just use external programs for everything.

What do you observe? Is it or is it not used as a programming language in real life?

General-Purpose programming languages

Python/Ruby/Perl/Go are general-purpose programming languages.

The advantage of general-purpose programming languages is in their power, better syntax, ability to handle arbitrary data structures.

orig = [1,2,3]
doubled = [x*2 for x in orig]

The disadvantage of general-purpose programming languages is that they are not and can not be as convenient for systems engineering tasks because they are not focusing on this particular aspect of programming (in contrast to bash and other shells for example).

# Write whole file - too verbose
f = open('myfile', 'w+')
f.write('mycontent')
f.close()

# Run a process and capture the output
# https://docs.python.org/3.5/library/subprocess.html
proc = subprocess.Popen(...)
try:
    outs, errs = proc.communicate(timeout=15)
except TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()

Summary

My conclusion is that there is no handy language for systems engineering tasks. On one hand there is bash that is domain specific but is not a good programming language and does not cover today’s needs, on the other hand there are general-purpose programming languages which do not specialize on this kinds of tasks.

You can use Puppet, Chef, Ansible, Terraform, CloudFormation, Capistrano and many other tools for common systems engineering tasks. What if your task is not covered by existing tools? Maybe one-off? Maybe a case where using one of the existing tools is not an optimal solution? You would like to write a script, right? In that case, your life sucks because scripting sucks. That’s because there is no convenient language and libraries to get systems engineering tasks done with minimal friction and effort.

Solution

I suggest, creating a new programming language (with a shell) which is domain specific, as bash, and which incorporates important features of general-purpose programming languages: data structures, exceptions, types, multiple dispatch.

My way of looking at it: imagine that bash was created today, taking into account today’s reality and things that became clear with time. Some of them are:

  • The shell is used as a programming language.
  • A system is usually a set of VMs and APIs, not a single machine.
  • Most APIs return JSON so data structures are needed as multiple jq calls are not convenient.
  • Silently ignoring errors proved to be bad strategy (hence set -e switch which tries to solve the problem).
  • Silently substituting undefined variables with empty strings proved to be bad strategy (hence set -u switch).
  • Expanding $x into multiple arguments proved to be error prone.
  • Syntax matters.
  • History entries without context have limited usefulness (cd $DIR for example: what was the current directory before cd and what was in $DIR ?)
  • UX
    • Spitting lots of text to a terminal is useless as it can not be processed by a human.
    • Feedback is important.
      • Exit code should be displayed by default.
      • An effort should be made to display status and progress of a process.
      • Ideally, something like pv should be integrated into the shell.

I’m not only suggesting the solution I’ve just described. I’m working on it. Please give it a try and/or join to help developing it: NGS – Next Generation Shell.

NGS LOGO

# Make sure my_file has the line my_content
echo my_content >my_file

# Run a process and capture the output
out=`my_process`

# Get process handle (used to access output, exit code, killing)
p=$(my_process)

# Get process output and parse it, getting structured data
amis=``aws ec2 describe-images --owner self``
echo(amis.len()) # number of amis, not lines in output

# Functional programming support
orig = [1,2,3]
doubled = orig.map(X*2)

# Function parameters can be named, have default values, etc
F myfunc(a,b=1,*args,**kwargs) {
  ...
}

# Create AWS VPC and Gateway (idempotent)
NGS_BUILD_CIDR = '192.168.120.0/24'
NGS_BUILD_TAGS = {'Name': 'ngs-build'}
vpc = AWS::Vpc(NGS_BUILD_TAGS).converge(CidrBlock=NGS_BUILD_CIDR, Tags=NGS_BUILD_TAGS)
gw  = AWS::Igw(Attachments=[{'VpcId': vpc}]).converge(Tags=NGS_BUILD_TAGS)

I don’t think scripting is the right approach.

It really depends on the task, constraints, your approach and available alternative solutions. I expect that situations needing scripting will be with us for a while.

Another programming language? Really? Why the world needs yet another programming language?

I agree that creating a new language needs justification because the effort that goes into creating a language and learning a language is considerable. Productivity gains of using the new language must outweigh the effort of learning and switching.

NGS creation is justified in exactly the same way as many other languages were justified: dissatisfaction with all existing programming languages when trying to solve specific problem or a set of similar problems. In case of NGS the dissatisfaction is specifically how existing programming languages address the systems engineering tasks niche. NGS addresses this particular niche with a unique combination of features and trade offs. Productivity of using NGS comes from best match between the tool and the problems being solved.

Yet another shell? We have plenty already but they all have serious adoption problems.

NGS will be implementing ideas which are not present in other shells. Hopefully, the advantages will be worthy enough to justify switching.

I’ll be just fine with bash/Python/Ruby/Perl/Go

You will. The decision to learn and use a new language depends on your circumstances: how many systems engineering tasks you are doing, how much you suffer, how much easier the tasks will become with NGS, how easily this can be done in your company / on your project and whether you are willing to take the risk.

You could just write a shell based on Ruby or Python or whatever, leveraging all the time and effort invested in existing language.

I could and I didn’t. Someone else did it for Python and for Scala (take a look, these are interesting projects).

  • I don’t think it’s the right solution to stretch existing language to become something else.
  • NGS has features that can not be implemented in a straightforward way as a library: special syntaxes for common tasks, multiple dispatch.

One could just write a library for Python or Ruby or whatever happens to be his/her favorite programming language, leveraging all the time and effort already invested in existing language.

In order to be similar to NGS, one would not only have to build a library but also change language syntax. I personally know only two languages that can do that: Lisp (using reader macros) and Perl6 (using grammar facility). These are general-purpose programming languages. Turning them into something NGS-like will be a significant effort, which I don’t think is justified.

PowerShell appears to be similar to what you describe here.

Note that I have very limited experience with PowerShell. The only aspect I definitely like is consistent usage of the $ sigil.

  • It’s probably a matter of taste and what you are accustomed to but I like NGS’ syntax more. PowerShell is pretty verbose.
  • DSC appears to be focused on resources inside a server/VM. NGS plans similar functionality. Meanwhile, NGS uses this approach in the AWS library: vpc = AWS::Vpc(NGS_BUILD_TAGS).converge(CidrBlock=NGS_BUILD_CIDR, Tags=NGS_BUILD_TAGS)

There are libraries for Python that make systems engineering tasks easier.

Right, sh for example. Such solution can’t be used as shell, it just improves the experience of calling external program from Python.


Was this post convincing? Anything is missing to convince you personally? Let me know!

Have a nice day!