Why I have no favorite programming language

TL;DR – because for me there is no good programming language.

I do mostly systems engineering tasks: managing resources in the cloud and on Linux machines. I can almost hear your neurons firing half a dozen names of programming languages. I do realize that they are used by many people for systems engineering tasks:

  • Go
  • Python
  • Ruby
  • Perl
  • bash

The purpose of this post is not to diminish the value of these languages; the purpose is to share why I don’t want to use any of the languages above when I write one of my systems-engineering-task scripts. My hope is that if my points resonate with you, the reader, you might want to spread the word about, or even help with, my suggested solution described towards the end.


So let’s go over the languages and see why I don’t pick any of them:

Why not language X?

All languages

  • Missing smart handling of exit codes of external processes. Example in bash: if test -f my_file (file is not there, exit code 1) vs if test --f my_file (syntax error, exit code 2). If you don’t spot the syntax error with your eyes, everything behaves as if the file does not exist (see the short demonstration after this list).
  • Missing libraries of declarative primitives (for cloud resources and for local resources such as files and users). Correction: I may have found one, in Perl – (R)?ex ; unfortunately it’s not clear from the documentation how close it is to what I have in mind.
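
A short demonstration of the exit-code point (assuming bash’s built-in test or GNU coreutils; the error message printed by the second command is omitted, since its exact wording differs between implementations):

> test -f my_file; echo $?     # file is not there: a “no” answer
1
> test --f my_file; echo $?    # syntax error in the expression, not a “no”
2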

All languages except bash

  • Inconvenient/verbose work with files and processes. Yes, there are libraries for that, but there is no dedicated syntax, which would be much more convenient. I have never seen anything that compares to my_process > my_file or echo my_flag > my_file (see the sketch below).
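
For comparison, a rough Python sketch of the two bash one-liners above, using the standard subprocess module (my_process and my_file are placeholders):

import subprocess

# bash: my_process > my_file
with open('my_file', 'w') as f:
    subprocess.run(['my_process'], stdout=f, check=True)

# bash: echo my_flag > my_file
with open('my_file', 'w') as f:
    f.write('my_flag\n')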

Go

  • Compiled
  • Error handling is a must. When I write a small script, conciseness is more important to me than handling every possible failure; in many cases I prefer an exception over a script twice the size. I do understand how mandatory and explicit error handling can be a good thing for larger programs or programs with greater stability requirements.
  • The dependencies problem seems to be an unresolved issue.

Python

  • Functional programming is a second-class citizen. In particular, list/dictionary comprehensions are the Pythonic way, while I prefer map and filter. Yes, that’s probably one of the features that makes Python easier to learn and a suggested first language. Not everything that’s optimized for beginners must be good for more experienced users. That’s OK.
  • Mixed feelings about array[slice:syntax] . It’s helpful, but the slice syntax is only applicable inside [ ] ; in other places you must use the slice(...) function to create the same slice object (see the short session after this list).
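
A short interactive session illustrating the slice point:

> python
>>> a = [10, 20, 30, 40, 50]
>>> a[1:4]
[20, 30, 40]
>>> s = slice(1, 4)   # outside [ ] the same slice must be built with slice(...)
>>> a[s]
[20, 30, 40]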

Ruby and Perl

  • The sigil syntax does not resonate with me.

Ruby

I can’t put my finger on anything specific, but Ruby just doesn’t feel right to me.

Perl

  • Contexts and automatic flattening of lists in some cases make the language more complicated than it should be.
  • Object orientation is an afterthought.
  • Functions that return a success status. I prefer exceptions. Exceptions are not the default behaviour in Perl but an afterthought: autodie.
  • The overall syntax feeling (strictly a matter of personal taste).

bash

Note that bash was created in a world that was vastly different from the world of today: different needs, different tasks, different languages to take inspiration from.

  • Missing data structures (flat arrays and hashes are not nearly enough). jq is a workaround, not a solution in my eyes (a small illustration follows this list).
  • Awkward error handling with a default of ignoring errors completely (proved to be a bad idea)
  • Expansion of undefined variables to empty strings (proved to be a bad idea)
  • -e , -u and other action-at-a-distance options.
  • Unchecked facts, just my impressions:
    • When bash was created, it was not assumed that bash would be used for any complex scripting.
    • bash was never “designed” as a language; it started with simple command execution, and other features were bolted on as time went by, while a complete redesign and rewrite were off the table, presumably for compatibility reasons.
  • Syntax
  • No widely used libraries (except a few for init scripts) and no central code repository to search for modules (correct me if I’m wrong here; I haven’t heard of any).
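
A small illustration of the data-structures point (assuming bash 4+ and jq installed; the JSON-in-a-string trick is the kind of workaround I mean):

> declare -A server=([name]=web1 [tags]='{"env":"prod","role":"web"}')
> echo "${server[name]}"
web1
> echo "${server[tags]}" | jq -r '.env'    # nested data has to live in a string and be re-parsed
prod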

My suggested solution

I would like to fill the gap. We have a systems-engineering-tasks-oriented language: bash. We have quite a few modern programming languages. What we don’t have is a language that is both modern and systems-engineering-tasks oriented. That’s exactly what I’m working on: Next Generation Shell. NGS is a fully fledged programming language with domain-specific syntax and features. NGS tries to avoid the issues listed above.

Expected questions and my answers

People work with existing languages and tools. Why do you need something else?

  • I assume I have a lower bullshit tolerance than many others. Some people might consider it normal to build more and more workarounds (especially around anemic DSLs), whereas I say “fuck this tool, I would already have finished the task without it (preferably using an appropriate scripting language)”. I don’t blame other people for the understandable desire to work with “standard” tools; I just think it’s not worth it when the solutions become too convoluted.
  • I am technically able to write a new programming language that solves my problems better than other languages.

Another programming language? Really? We have plenty already.

  • I would like to remind you that most of the programming languages were born out of dissatisfaction with existing ones.
  • Do you assume that we are at a global maximum regarding the languages we have and that no better language can be made?

Feedback

Would you use NGS? Which features must it have? What’s the best way to ease adoption? Please comment here, on Reddit (/r/bash , /r/ProgrammingLanguages) or on Hacker News.


Update: following feedback roughly of the form “Yes, I get that, but many Ops tasks are done using configuration management tools and tools like CloudFormation and Terraform. How does NGS compare to these tools?” – there will be a blog post comparing NGS to the mentioned tools. Stay tuned!


Have a nice day!

NGS unique features – exit code handling


How do other languages treat exit codes?

Most languages that I know do not care about exit codes of processes they run. Some languages do care … but not enough.

Update / Clarification / TL;DR

  1. Only NGS can, out of the box, throw exceptions based on fine-grained inspection of the exit codes of the processes it runs. For example, exit code 1 of test will not throw an exception, while exit code 1 of cat will throw an exception by default. This allows writing correct scripts that have no explicit exit-code checking and are therefore smaller (meaning better maintainability).
  2. This behaviour is highly customizable.
  3. In NGS, it is OK to write if $(test -f myfile) ... else ... , which will throw an exception if the exit code of test is 2 (a test expression syntax error or the like), while in bash and others, for example, you should explicitly check and handle exit code 2, because a simple if cannot cover the three possible exit codes of test (zero for yes, one for no, two for error). Yes, if /usr/bin/test ...; then ...; fi in bash is incorrect! By the way, have you seen scripts that actually check for the three possible exit codes of test? I haven’t.
  4. When the -e switch is used, bash can exit (somewhat similar to an uncaught exception) when the exit code of a process it runs is not zero. This is neither fine-grained nor customizable.
  5. I do know that exit codes are accessible in other languages when they run a process. Other languages do not act on exit codes, with the exception of bash with the -e switch. In NGS, exit codes are translated to exceptions in a fine-grained way.
  6. I am aware that $? in the examples below shows the exit code of the language’s own process, not of the process that the language runs. I’m contrasting this with bash (-e) and NGS behaviour (an exception causes NGS to exit with a non-zero exit code).

Let’s run the “test” binary with incorrect arguments.

Perl

> perl -e '`test a b c`; print "OK\n"'; echo $?
test: ‘b’: binary operator expected
OK
0

Ruby

> ruby -e '`test a b c`; puts "OK"'; echo $?
test: ‘b’: binary operator expected
OK
0

Python

> python
>>> import subprocess
>>> subprocess.check_output(['test', 'a', 'b', 'c'])
... subprocess.CalledProcessError ... returned non-zero exit status 2
>>> subprocess.check_output(['test', '-f', 'no-such-file'])
... subprocess.CalledProcessError: ... returned non-zero exit status 1

bash

> bash -c '`/usr/bin/test a b c`; echo OK'; echo $?
/usr/bin/test: ‘b’: binary operator expected
OK
0

> bash -e -c '`/usr/bin/test a b c`; echo OK'; echo $?
/usr/bin/test: ‘b’: binary operator expected
2

I used /usr/bin/test in the bash examples to make them comparable, avoiding bash’s built-in test.

Perl and Ruby, for example, do not see any problem with the failing process.

Bash does not care by default, but has the -e switch to make a non-zero exit code fatal, returning the bad exit code when exiting from bash.

Python can differentiate zero and non-zero exit codes.

So, the best we can do is distinguish between zero and non-zero exit codes? That’s just not good enough. test, for example, can return 0 for a “true” result, 1 for a “false” result and 2 for an exceptional situation. Let’s look at this bash code with an intentional syntax error in the test invocation:

if /usr/bin/test --f myfile;then
  echo OK
else
  echo File does not exist
fi

The output is

/usr/bin/test: missing argument after ‘myfile’
File does not exist

Note that the -e switch wouldn’t help here. Whatever follows if is allowed to fail (it would be impossible to do anything if -e affected if and while conditions).
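
To see this, run the same snippet with -e; the output is unchanged:

> bash -e -c 'if /usr/bin/test --f myfile; then echo OK; else echo File does not exist; fi'
/usr/bin/test: missing argument after ‘myfile’
File does not exist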

How does NGS treat exit codes?

> ngs -e '$(test a b c); echo("OK")'; echo $?
test: ‘b’: binary operator expected
... Exception of type ProcessFail ...
200

> ngs -e '$(nofail test a b c); echo("OK")'; echo $?
test: ‘b’: binary operator expected
OK
0

> ngs -e '$(test -f no-such-file); echo("OK")'; echo $?
OK
0

> ngs -e '$(test -d .); echo("OK")'; echo $?
OK
0

NGS has easily configurable behaviour regarding how to treat exit codes of processes. The built-in behaviour knows about the false, test, fuser and ping commands. For unknown processes, a non-zero exit code is an exception.

If you use a command that returns a non-zero exit code as part of its normal operation, you can use the nofail prefix as in the example above, customize NGS behaviour regarding the exit code of your process or, even better, make a pull request adding it to the stdlib.

How easy is it to customize exit-code checking for your own command? Here is the code from the stdlib that defines the current behaviour. Decide for yourself (skipping nofail, as it’s not something an average user is typically expected to touch).

F finished_ok(p:Process) p.exit_code == 0

F finished_ok(p:Process) {
    guard p.executable.path == '/bin/false'
    p.exit_code == 1
}

F finished_ok(p:Process) {
    guard p.executable.path in ['/usr/bin/test', '/bin/fuser', '/bin/ping']
    p.exit_code in [0, 1]
}

Let’s get back to the bash if test ... example and rewrite it in NGS:

if $(test --f myfile)
    echo("OK")
else
    echo("File does not exist")

… and run it …

... Exception of type ProcessFail ...

For if purposes, a zero exit code is true and any non-zero exit code is false. Again, this is customizable. Such exit-code treatment allows the if $(test ...) NGS example above to function properly, somewhat similarly to bash but with exceptions when needed.

NGS’ behaviour makes much more sense to me. I hope it makes sense to you too.

Update: Reddit discussion.


Have a nice weekend!

Bashing bash – unexceptional

This is the third post in the “bashing bash” series. The aim of the series is to highlight problems in bash and convince systems and software engineers to help build a better alternative.

The problem

What is the output of the following script?

#!/bin/bash

set -eu

echo Here
false
echo Not here

Right, the output is

Here

Imagine that instead of false you have a lot of code. Since there are no exceptions, you have no idea where the error occurred.


Solutions using bash:

  1. Use set -x to trace the code.
  2. Add echo something here and there to know between which two echo‘s the error occurred.
  3. Catch the error using trap and print the line number, as suggested on StackOverflow (a sketch follows this list). Writing this additional catching snippet at the top of every script is not really convenient.
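
Here is a rough version of option 3 (a typical StackOverflow-style snippet; variants differ in the details):

#!/bin/bash
set -eu

# Report the line of the failing command before the script exits.
err_report() {
    echo "Error on line $1" >&2
}
trap 'err_report $LINENO' ERR

echo Here
false          # the trap now reports this line's number
echo Not here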

This problem of unclear error location is unimaginable in any normal programming language.

“There is no problem, just don’t do it”

Bash was not intended to be a “normal” programming language. Some people say it’s an abuse of bash to use it as such. Looking at code written in bash, I can tell it really is an abuse in many cases.

The reality, though, is that bash is still (ab)used for programming. In some cases bash has positive aspects which outweigh the benefits of using other languages. In other cases a program starts as a small bash script and is simply never rewritten in another language after it grows.

I suggest making a better shell rather than convincing people not to abuse Bash. People will keep on doing what they are doing. Let’s make their lives easier by providing them with a better shell.

The suggested solution

Use NGS. In NGS, any failed process throws an exception. Let’s take a look at the script below:

#!/usr/bin/env ngs
echo Here
false
echo Not here

What’s the output?

Here
Not here

WAT?

Well, actually false returning an exit code of 1 is not an exception, it’s normal. If any command returning a non-zero exit code caused an exception, you wouldn’t be able to write, for example, if $(test -e myfile) do_something .

A failed process is a process that returns an unexpected exit code. Here is the part of the stdlib that defines what is a failure and what is not:

F finished_ok(p:Process) p.exit_code == 0

F finished_ok(p:Process) {
    guard p.executable.path == '/bin/false'
    p.exit_code == 1
}

F finished_ok(p:Process) {
    guard p.executable.path == '/usr/bin/test'
    p.exit_code in [0, 1]
}

Such definitions also mean that you can easily extend NGS to work properly with any other command, simply by adding another finished_ok function. (Or add it to the stdlib if it’s a common command, so everyone benefits.)
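
For example, a hypothetical addition for grep, which exits with 1 when no lines match (this follows the same pattern as the definitions above; the path and whether grep is already covered in the stdlib are my assumptions):

F finished_ok(p:Process) {
    guard p.executable.path == '/bin/grep'
    p.exit_code in [0, 1]   # 0: matches found, 1: no matches; both are normal
}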

So where are the exceptions?

We’ll have to modify the code to get an unexpected exit code. Example:

#!/usr/bin/env ngs
echo Here
ls nosuchfile
echo Not here

Output:

Here
ls: cannot access 'nosuchfile': No such file or directory
========= Uncaught exception of type 'ProcessFailed' =========
====== Exception of type 'ProcessFailed' ======
=== [ backtrace ] ===
[Frame #0] /etc/ngs/bootstrap.ngs:158:1 - 158:10 [in <anonymous>]
[Frame #1] /etc/ngs/bootstrap.ngs:154:17 - 154:29 [in bootstrap]
[Frame #2] ./2.ngs:3:4 - 3:14 [in <anonymous>]
[Frame #3] /usr/share/ngs/stdlib.ngs:1116:11 - 1116:15 [in $()]
[Frame #4] /usr/share/ngs/stdlib.ngs:1050:29 - 1050:42 [in wait]
[Frame #5] /usr/share/ngs/stdlib.ngs:1006:7 - 1006:20 [in ProcessFailed]
=== [ dump process ] ===
(a lot of not very well formatted output with info about the process)

Please help build a better alternative

Go to https://github.com/ilyash/ngs/ and contribute some code.

Bashing bash – undefined variables

This is the second post in the “bashing bash” series. The aim of the series is to highlight problems in bash and convince systems and software engineers to help build a better alternative.

The problem

What does the following command do?

rm -rf $mydir/

It looks like the author would like to delete the $mydir directory and everything in it. Actually, it may do unexpected things because of missing quotes. The rant about quotes is in the previous post; this post is about yet another issue.

The correct commands should be:

set -u
...
rm -rf "$mydir/"

The important thing here is set -u . Without it, when $mydir is undefined for some reason, such as a bug in the code preceding the rm command, there is a chance of bricking the machine, because an undefined variable expands to an empty string, so the command is silently expanded to

rm -rf /


While more experienced engineers will usually use set -eu at the beginning of the script, omitting this declaration is a big trap for others.

Side note: you could ask why the original command has a trailing slash. The trailing slash is common and is used to signify a directory. While to the best of my knowledge rm behaves the same without the slash, some commands are actually more correct with a trailing slash. For example, cp myfile mydir/ would copy the file into the directory if it exists and would cause an error if it doesn’t. On the other hand, cp myfile mydir would behave the same if the directory exists but would create a mydir file if there is no such directory or file, which was not intended. Other commands such as rsync also behave differently with and without the slash. So it is common to use the slash.
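
A quick illustration of the cp behaviour described above (a hypothetical session; myfile is assumed to exist):

> mkdir mydir
> cp myfile mydir/    # copies myfile into the existing directory
> rm -rf mydir
> cp myfile mydir/    # fails with an error: there is no such directory
> cp myfile mydir     # succeeds, silently creating a regular file named mydir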

See also: http://www.tldp.org/LDP/abs/html/options.html – bash options

The suggested solution

In NGS, any use of an undefined variable is an exception.

ngs -e 'echo(a)'

It’s going to look prettier, but even in the current implementation you have all the information about what happened:

========= Uncaught exception of type 'GlobalNotFound' =========
====== Exception of type 'GlobalNotFound' ======
=== [ dump name ] ===
* string(len=1) a
=== [ dump index ] ===
* int 321
=== [ backtrace ] ===
[Frame #0] /etc/ngs/bootstrap.ngs:156:1 - 156:10 [in <anonymous>]
[Frame #1] /etc/ngs/bootstrap.ngs:152:17 - 152:29 [in bootstrap]
[Frame #2] <command line -e switch>:1:8 - 1:9 [in <anonymous>]
...

While bash options probably have a historical justification, a new language should not have such a mechanism. It complicates things a lot. In addition, the programmer must always be aware of which options are currently in effect.

Please help build a better alternative

Go to https://github.com/ilyash/ngs/ and contribute some code.

Bashing bash – variable substitution

This is the first post in the “bashing bash” series, which aims to highlight problems in bash and convince systems and software engineers to help build a better alternative.

Bash is a very powerful and useful tool, doing a better job than many other shells and programming languages when used for the intended tasks. Still, it’s hard to believe that a shell written decades later could not do better.

The problem

What does the following command do?

cp $src_file $dst_file

One might think it copies the given file to the specified destination. Looking at the code, we can say that was the intention. What would actually happen? It cannot be known from the line above. $src_file and $dst_file each expand to anywhere from zero to N arguments, so unexpected things can happen. The correct command would be

cp "$src_file" "$dst_file"

Forgetting the quotes, or leaving them out on the assumption that $src_file and $dst_file will always contain one bash word and expand to exactly one argument each, is dangerous.
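
For instance, a value containing a space already breaks the unquoted version (a hypothetical session):

> touch 'my report.txt'; mkdir backup
> src_file='my report.txt'
> dst_file=backup/
> cp $src_file $dst_file        # actually runs: cp my report.txt backup/
cp: cannot stat 'my': No such file or directory
cp: cannot stat 'report.txt': No such file or directory
> cp "$src_file" "$dst_file"    # copies the single intended file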

Quoting everything all the time makes the code cluttered.

The suggested solution

In NGS, $var expands to exactly one argument, similar to "$var" in bash. The new syntax $*var , consistent with similar syntax in other parts of NGS, would expand to zero to N arguments.
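
A sketch of how that reads (src_file, dst_file and extra_args are hypothetical variables, the last one holding a list):

cp $src_file $dst_file                  # each variable expands to exactly one argument
cp $*extra_args $src_file $dst_file     # $* expands a list to zero or more arguments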

Please help build a better alternative

Go to https://github.com/ilyash/ngs/ and contribute some code.