NGS unique features – Hash methods I wish I had in other languages

NGS is a language and a shell that I am building for systems administration tasks. Enough of the language is implemented to enable writing some useful scripts. The shell is not there yet.

Some of the Hash methods in NGS

The methods for working with Hash I have not seen all at once in other languages are:

  1. filterk – filter Hash by key (produces Hash)
  2. filterv – filter Hash by value (produces Hash)
  3. mapk – map Hash keys (produces Hash)
  4. mapv – map Hash values (produces Hash)
  5. mapkv – map Hash keys and values (produces Hash as opposed to map which produces an array)
  6. without – filters out specific key

How these are actually used? Following is an excerpt from the pollute method (function), which is a part of the AWS module. It uses several of the Hash methods I mentioned, making the method a good example. pollute method (as in “pollute global namespace”) enables using Vpc variable for example instead of AWS::Vpc and so on. I would like to have this behaviour for small quick-and-dirty scripts but not as default so it’s in a method that one can optionally call.

F pollute(do_warn=true) {

    vars =
        _exports.filterk(/^AMI_OWNER/) +
        _exports.filterv(Type).without('Res').without('ResDef') +
        ...

    if do_warn {
        warn("Polluting ...: ${vars.keys().join(', ')}")
    }

    vars.mapk(resolve_global_variable).each(set_global_variable)
}

Let’s go over the code above step by step:

F pollute(do_warn=true) { ... } defines the pollute method with optional parameter do_warn that has default value of true.

_exports is a Hash containing all of the AWS’ module variables and functions, similar to NodeJS module.exports but members are added automatically rather than explicitly. Only the methods and variables that do not start with _ (underscore) are added. One can modify _exports in any way before the end of the module. I will write more in detail about require() and modules in NGS in another post.

filterk(/^AMI_OWNER/) filters all the variables that match the given RegExp

filterv(Type) filters all the variables that are of type Type. These are AWS types’ definitions, such as Vpc, Subnet or Instance.

without('...') filters out the types I don’t like to override.

+ between and after _exports.filterk(...) and _exports.filterv(...) joins the hashes.

mapk translates variables’ names into their index (using resolve_global_variable)

each runs set_global_variable with variable index and the value to set

Hash methods in other languages

I am aware that some of the methods above are present in other languages or libraries. Some examples:

  1. Ruby has mapv (transform_values) method.
  2. Rails has mapk (transform_keys) and mapv.
  3. Perl 6 can modify values in a convenient manner:for %answers.values -> $v is rw { $v += 10 };.

What I have not seen is a language which has all the methods above out of the box. I have a feeling that arrays get all the fame methods while hashes (dictionaries) often get less attention in other languages.

Why NGS has all these methods?

NGS is aiming to be convenient for systems administration tasks. More often than not these tasks include data manipulation. NGS has many functions (methods) for data manipulation, including the ones listed in this post.

Update: reddit discussion


Have a nice day!

NGS unique features – exit code handling

smilies-1607163_640

How other languages treat exit codes?

Most languages that I know do not care about exit codes of processes they run. Some languages do care … but not enough.

Update / Clarification / TL;DR

  1. Only NGS can throw exceptions based on fine grained inspection of exit codes of processes it runs out of the box. For example, exit code 1 of test will not throw an exception while exit code 1 of cat will throw an exception by default. This allows to write correct scripts which do not have explicit exit codes checking and therefore are smaller (meaning better maintainability).
  2. This behaviour is highly customizable.
  3. In NGS, it is OK to write if $(test -f myfile) ... else ... which will throw an exception if exit code of test is 2 (test expression syntax error or alike) while for example in bash and others you should explicitly check and handle exit code 2 because simple if can not cover three possible exit codes of test (zero for yes,  one for no, two for error). Yes, if /usr/bin/test ...; then ...; fi in bash is incorrect! By the way, did you see scripts that actually do check for three possible exit codes of test? I haven’t.
  4. When -e switch is used, bash can exit (somewhat similar to uncaught exception) when exit code of a process that it runs is not zero. This is not fine grained and not customizable.
  5. I do know that exit codes are accessible in other languages when they run a process. Other languages do not act on exit codes with the exception of bash with -e switch. In NGS exit codes are translated to exceptions in a fine grained way.
  6. I am aware that $? in the examples below show the exit code of the language process, not the process that the language runs. I’m contrasting this to bash (-e) and NGS behaviour (exception exits with non-zero exit code from NGS).

Let’s run “test” binary with incorrect arguments.

Perl

> perl -e '`test a b c`; print "OK\n"'; echo $?
test: ‘b’: binary operator expected
OK
0

Ruby

> ruby -e '`test a b c`; puts "OK"'; echo $?
test: ‘b’: binary operator expected
OK
0

Python

> python
>>> import subprocess
>>> subprocess.check_output(['test', 'a', 'b', 'c'])
... subprocess.CalledProcessError ... returned non-zero exit status 2
>>> subprocess.check_output(['test', '-f', 'no-such-file'])
... subprocess.CalledProcessError: ... returned non-zero exit status 1

bash

> bash -c '`/usr/bin/test a b c`; echo OK'; echo $?
/usr/bin/test: ‘b’: binary operator expected
OK
0

> bash -e -c '`/usr/bin/test a b c`; echo OK'; echo $?
/usr/bin/test: ‘b’: binary operator expected
2

Used /usr/bin/test for bash to make examples comparable by not using built-in test in bash.

Perl and Ruby for example, do not see any problem with failing process.

Bash does not care by default but has -e switch to make non-zero exit code fatal, returning the bad exit code when exiting from bash.

Python can differentiate zero and non-zero exit codes.

So, the best we can do is distinguish zero and non-zero exit codes? That’s just not good enough. test for example can return 0 for “true” result, 1 for “false” result and 2 for exceptional situation. Let’s look at this bash code with intentional syntax error in “test”:

if /usr/bin/test --f myfile;then
  echo OK
else
  echo File does not exist
fi

The output is

/usr/bin/test: missing argument after ‘myfile’
File does not exist

Note that -e switch wouldn’t help here. Whatever follows if is allowed to fail (it would be impossible to do anything if -e would affect if and while conditions)

How NGS treats exit codes?

> ngs -e '$(test a b c); echo("OK")'; echo $?
test: ‘b’: binary operator expected
... Exception of type ProcessFail ...
200

> ngs -e '$(nofail test a b c); echo("OK")'; echo $?
test: ‘b’: binary operator expected
OK
0

> ngs -e '$(test -f no-such-file); echo("OK")'; echo $?
OK
0

> ngs -e '$(test -d .); echo("OK")'; echo $?
OK
0

NGS has easily configurable behaviour regarding how to treat exit codes of processes. Built-in behaviour knows about false, test, fuser and ping commands. For unknown processes, non-zero exit code is an exception.

If you use a command that returns non-zero exit code as part of its normal operation you can use nofail prefix as in the example above or customize NGS behaviour regarding the exit code of your process or even better, make a pull request adding it to stdlib.

How easy is to customize exit code checking for your own command? Here is the code from stdlib that defines current behaviour. You decide for yourself (skipping nofail as it’s not something typical an average user is expected to do).

F finished_ok(p:Process) p.exit_code == 0

F finished_ok(p:Process) {
    guard p.executable.path == '/bin/false'
    p.exit_code == 1
}

F finished_ok(p:Process) {
    guard p.executable.path in ['/usr/bin/test', '/bin/fuser', '/bin/ping']
    p.exit_code in [0, 1]
}

Let’s get back to the bash if test ... example and rewrite the it in NGS:

if $(test --f myfile)
    echo("OK")
else
    echo("File does not exist")

… and run it …

... Exception of type ProcessFail ...

For if purposes, zero exit code is true and any non-zero exit code is false. Again, customizable. Such exit code treatment allows the if ... test ... NGS example above to function properly, somewhat similar to bash but with exceptions when needed.

NGS’ behaviour makes much more sense for me. I hope it makes sense for you.

Update: Reddit discussion.


Have a nice weekend!

NGS unique features – execute and parse

I am developing a shell and a language called NGS. I keep repeating it’s domain specific. What are the unique features that make NGS most suitable for today’s system administration tasks (a.k.a “Operations” or hype-compatible word “DevOps”)?

This post is first in series that show what makes NGS unique.

one-979261_640

Execute and parse operator

Execute-and-parse operator … executes a command and parses it’s output. This one proved to be central in working with AWS API. Citing ec2din.ngs demo script:

``aws ec2 describe-instances $*filters``

The expression above returns a data structure. The command is run, the output is captured and then fed to parse() method. Whatever the parse() method returns is the result of the ``exec-and-parse syntax`` expression above.

Built-in parsing

By default, NGS parses any JSON output when running a command using ``exec-and-parse`` syntax. (TODO: parse YAML too)

In case with AWS CLI commands additional processing takes place to make the data structure coming out of exec-and-parse operator more useful:

  1. The top level of AWS responses is usually a hash that has one key which has an array as value: {"LoadBalancerDescriptions": [NGS, returns, this] } . While I can guess few reasons for such format, I find it much more useful to have an array as a result of running an AWS CLI command and that’s what NGS returns if you run ``aws ...`` commands.
  2. Specifically for aws ec2 describe-instances I’ve removed the annoyance of having Reservations list with instances as sub-lists. NGS returns flat instances list. Sorry, Amazon, this is much more productive.

Customizable parsing

What if you have your own special command with it’s own special output format?

The parsing is customizable via defining your own parse(s:Str, hints:Hash) method implementation. That means you can define how your command is parsed.

 No parsing

Don’t want parsed data? No problem, stick with the `command` syntax instead of ``command``. In case you need original data structure you can use `command`.decode_json() for example.

Why exec-and-parse is an operator?

Why adding an exec_parse() function would not be sufficient?

  1. Execute-and-parse is common operation in system tasks so it should be short. NGS has taken the pragmatic approach: the more common the operation, the shorter the syntax.
  2. Execute-and-parse should look similar `execute-and-capture-output` syntax which already existed when I was adding execute-and-parse.
  3. Making it an operator allows the command to be executed to be written in “commands syntax” (a bit bash-like) which is a better fit.

“I can add this as a function to any language!”

Sure but:

  1. Your chances of getting same brevity are not very good
  2. Making exec-and-parse as flexible as in NGS in other languages would be an additional effort
  3. ``some-command arg1 arg2`` – would it be exec_parse(['some-command', 'arg1', 'arg2']) ? How do you solve the syntax of the passed command? The array syntax does not look good here. Not many languages will allow you to have special syntax for commands to be passed to exec_parse().

If your language is not domain-specific for system tasks, adding exec-and-parse to it will be a task with dubious benefit.

How extreme opposite looks like

Just came across build configuration file of Firefox: settings.gradle (sorry, could not find a link to this file on a web in a sane amount of time). Here is excerpt with lines wrapped for convenience.

def commandLine = ["${topsrcdir}/mach", "environment", "--format",
    "json", "--verbose"]
def proc = commandLine.execute(null, new File(topsrcdir))
def standardOutput = new ByteArrayOutputStream()
proc.consumeProcessOutput(standardOutput, standardOutput)
proc.waitFor()

...

import groovy.json.JsonSlurper
def slurper = new JsonSlurper()
def json = slurper.parseText(standardOutput.toString())

...

if (json.substs.MOZ_BUILD_APP != 'mobile/android') {
...
}

Here is how roughly equivalent code looks in NGS (except for the new File(topsrcdir) which I don’t understand):

json = ``"${topsrcdir}/mach" environment --format json --verbose``
...
if json.substs.MOZ_BUILD_APP != 'mobile/android' {
...
}

Yes, there are many languages where exec-and-parse functionality looks like something in between Gradle and NGS. I don’t think there is one that can do what NGS does in this regard out of the box. I’m not saying NGS is better than other languages for all tasks. NGS is aiming to be better at some tasks. Dealing with I/O and data structures is definitely a target area.


Have a nice day!

Why NGS has no “undefined”

Since I know JavaScript, some Ruby and a bit of Perl which all have the concept of undefined it was a decision I had to make whether I implement undefined in NGS. This article shows why I decided not to have the undefined value/data type.

Update (thanks /u/EldritchSundae): what you observe in Ruby example below is nil, not undefined. In bash the undefined value is empty string. Ruby does not have undefined but it has the ability to read non-existing hash keys without causing an exception like JavaScript and Perl. In Ruby’s case the result is nil, not undef (Perl) or undefined (JavaScript).

Undefined in other languages

Showing here few common cases, not all possible usages.

JavaScript

> nodejs -e 'const a; console.log(a)'
undefined

> nodejs -e 'const h={}; console.log(h["xyz"])'
undefined

> nodejs -e '(function f(a,b) { console.log(a,b) })(1)'
1 undefined

Ruby

> ruby -e 'h={}; puts h["xyz"]' # outputs empty line

Perl

> perl -e '%h=(); print $h{"xyz"}' # outputs nothing

bash

> bash -c 'echo $a' # outputs empty line

Absence of undefined in NGS

Adding yet another data type to NGS needs justification. I can’t find any justification for undefined. I do consider the usages above bugs. Accessing a variable or a place that were not assigned any value is an error.

Conveying absence of a value in NGS is done similar to other languages with the special null value. There are also somewhat experimental Box, FullBox and EmptyBox types, similar to Option, Some and None in Scala.

Undefined as a hash value for non-existing keys

Having undefined returned when looking up non-existing hash key is a trade-off. It’s more convenient and more error-prone. I have chosen Python-like approach: it’s an error.

> ngs -e 'h={}; h["xyz"]'
... Exception of type KeyNotFound ...

# and added convenience method "get"
> ngs -p 'h={}; h.get("xyz", "NONE")'
NONE

Undefined when accessing unset variable

While bash gives you an empty string by default and Perl gives you undef, I do think accessing unset variable is definitely a bug. I guess it was understood at some point by the creators of bash and Perl so bash has -u flag that makes accessing undefined variable an error and Perl has use strict mode which does the same among other things.

> bash -c 'echo $a' # no error
> bash -c 'set -u; echo $a'
bash: a: unbound variable

> bash -c 'a=(); echo ${a[0]}' # no error, just horrible syntax :)
> bash -c 'set -u; a=(); echo ${a[0]}'
bash: a[0]: unbound variable

> perl -e 'print $a' # no error
> perl -e 'use strict; print $a;'
# no error - I have no idea why, probably some "special" variable

# Perl - take number two:
> perl -e 'print $abc' # no error
> perl -e 'use strict; print $abc;'
Global symbol "$abc" requires explicit package name
(did you forget to declare "my $abc"?) at -e line 1.
Execution of -e aborted due to compilation errors.

Undefined as value for parameters without arguments

Calling an NGS function with less arguments than it expects is an error as in most languages:

> ngs -e '(F (a,b) 10)(1)'
... Exception of type ArgsMismatch ..

By the way, I do cringe every time I see JavaScript code that explicitly uses undefined:

function f(optional_a, optional_b) { }
f(undefined, 10)

The programmer took a decision not to pass a value. How in the world is this undefined? Use null for f*ck sake!


Have a nice day!

Unicode characters as operators in a programming language¿

Wouldn’t it be cool to use Unicode characters as operators and maybe even function names?

if a ≠ b { ... }

range = 0…10

all_above_ten = myarr ∀ F(x) x > 10

Looks good, concise, expressive. Perl 6, Julia, Scala appear to support Unicode operators.

Why don’t I add Unicode operators and function names to NGS then?

If NGS would allow Unicode, it would be optional as I don’t want additional entry barrier or possible problems typing Unicode using remote connection. If I do add optional Unicode to NGS, here is what I think will happen next:

Some people start using Unicode in NGS while others don’t. Mixed code style emerges. It’s easy to imagine such mixed code style even in the same file as someone without the Unicode setup on his/her keyboard  is doing a quick fix. ViralBShah sums it up pretty well.

What do you think? Your comments are welcome.

 

Ruby’s do block in NGS experiment

Ruby has a very nice feature – a special syntax to pass a block of code as an argument to a method. I did want something like this NGS. Background, reasons and what this feature became follow.

file-534178_640

Background

NGS – Next Generation Shell is a language and a shell I’m working on for a few years now.

This article is about new NGS syntax inspired by and compared to the counterpart  Ruby’s do ... end syntax. The new syntax provides more convenient way of passing anonymous function to another function. Passing functions to other functions is (sometimes) considered to be a “functional programming” feature of a language.

In NGS, up until recently you had to pass all of your arguments (except for the first argument which can be passed before .method() ) inside the parenthesis of the method call syntax:

mymethod(arg1, arg2, ... argN)

or syntactically equivalent

arg1.mymethod(arg2, arg3, ... argN)

Passing an anonymous function could look like the following in NGS:

required_props.each(F(prop_name) {
    prop_name not in rd.anchor throws InvalidArgument("...")
})

The problem

In the code snippet above I don’t like the closing }) part. A single } would look much better. You could use the each operator % to get rid of the closing ).  So the code above becomes

required_props % F(prop_name) {
    prop_name not in rd.anchor throws InvalidArgument("...")
}

Which is shorter but not very clear (at least for newcomers) because of the % operator. I prefer to use the each operator in small expressions such as myarr % echo and not to use it in bigger expressions such as the snippet above. The closing ) syntax issue is not solved for the general case then so I wanted something like Ruby’s do block syntax.

Importing the “do” block

I’ve made quite a few modifications to fit NGS and to fit some OCD-ness:

The keyword

I prefer “with” because in my opinion, it’s much clearer that it links left-hand-side and right-hand-side. mymethod() with { ... some code ... } looks more clear to me than mymethod() do { ... some code ... }. It might be that I was affected by constructing a mini test framework where test("my feature") with { test code } looks very good.

Not only for code blocks

I don’t like “special” things so whatever comes after with can be any expression.

required_props.each() with F(prop_name) {
    prop_name not in rd.anchor throws InvalidArgument("...")
}

or

results = mylist.map() with my_super_mapper_func_name

Note than when using each it could probably be clearer to have the do keyword. I’m still unsure that with is the right choice

Code blocks syntax

You might ask how test("my feature") with { test code } works if what’s after with is any expression.

The test("my feature") with { test code } syntax works because { ... } is a syntax for anonymous function (with 3 optional parameters called A, B and C) and not because it’s part of the with syntax. You could get the same result using test("my feature", { test code }) .

Not just one

I don’t see any reason why with would be limited to one occurrence.

finally()
    with {
        while entry = c_readdir(d) {
            ...
        }
    }
    with {
        r = c_closedir(d)
        r != 0 throws DirFail('...')
    }

Not a special parameter

As compared to Ruby, the finally() method does not use the special yield keyword to run the given arguments nor parameter should be declared with &. Here is the definition of finally() method (including the bug that cleanup can run twice, which I intend to fix).

F finally(body:Fun, cleanup:Fun) {
    try {
        ret = body()
        cleanup()
        ret
    } catch(e) {
        cleanup()
        throw e
    }
}

Not special at all, just an additional syntax for arguments

The with syntax allows to write method call arguments outside the parenthesis. That’s all. What follows from this definition is the possibility of using keyword arguments:

finally()
        with body = {
                while entry = c_readdir(d) {
                        ...
                }
        }
        with cleanup = {
                r = c_closedir(d)
                r != 0 throws DirFail('...')
        }

Let’s shorten the syntax

The with name = val argument syntax looked a bit too verbose for me and with was stealing attention focus from the name. I’ve added the fat arrow syntax to combat that.

finally()
    body => {
        while entry = c_readdir(d) {
            ...
        }
    }
    cleanup => {
        r = c_closedir(d)
        r != 0 throws DirFail('...')
    }

How the feature looks like now?

The snippet above refers to real code in stdlib. Additional examples of the with syntax follow:

Stdlib’s retry()

Definition:

F body_missing_in_retry() throw InvalidArgument("...")

F retry(times=60, sleep=1, ..., progress_cb={null}, success_cb={A},
    fail_cb=null, body:Fun=body_missing_in_retry) {
...
}

Usage:

F assert_resolvable(h:Str, title="Resolve host", times=45, sleep=2) {
	retry(times=times, sleep=sleep, title=title)
		body       => { `dig "+short" $h`.lines() }
		success_cb => { log_test_ok("...") }
		fail_cb    => { throw TestFail("...") }
}

test() mini framework for … tests!

Definition:

F test(name:Str, f:Fun) {
...
}

Usage:

test("Resolving gateway FQDN $gw_fqdn") with {
    assert_resolvable(gw_fqdn, "Resolve gateway FQDN")
}

Have your own ideas?

Please comment with your ideas regarding this, other and proposed NGS features, they can make it into the language. Or fork/improve/pull request.


Have a nice weekend!

Creating a language is easier now

I do look at other languages when designing and implementing a new shell and a language called NGS. As time goes by, there are more languages to look at and learn from. Hence, I think creating a language in recent years is easier than ever before. I would like to thank authors and contributors of all the languages!cubes-677092_640

NGS is heavily based on ideas that already existed before NGS. Current languages provide many of them and work on NGS includes sorting out which of these ideas resonate with my way of thinking and which don’t.

Of course NGS does have it’s own original ideas (read: I’m not aware of any other language that implemented these ideas) and they are critical to NGS, creating NGS is not just filtering others’ ideas 🙂

Some stolen ideas inspired by other languages

  1. Multi-methods & kind-of-object-orientation in NGS – mostly from CLOS
  2. External commands execution and redirection syntax – bash (while changing behaviour of the $ expansion, now known to be source of many bugs).
  3. Anonymous functions and closures – not sure where I got it from, I’d guess Lisp.
  4. Declarative primitives concept – heavily based on basic concept behind configuration management tools
  5. inspect() – Ruby‘s inspect()
  6. code() – the .perl property from Perl. Not fully implemented in NGS. Should return the code that would evaluate to the given value. Probably comes from Lisp originally which usually prints the values in a way it can read them back.
  7. Ordered hash (When you iterate a Hash, the items are in the same order as they were inserted) – PHP, also implemented in Ruby and if I’m correct in V8.
  8. The in and not in operators – Python.
  9. Many higher order functions such as map and filter probably originated in Scheme or Lisp.
  10. Translate anything that possible into function calls. + operator for example is actually a function call – Python, originally probably from Scheme or Lisp.

Ideas trashed before implementation

While working on NGS I think about features I can add. I have to admit that many of these candidate features seem fine for a few seconds to a few minutes… till I realize that something similar is already implemented in language X and it does not resonate with my way of thinking (also known as “This does not look good”).

Square brackets syntax for creating a list

Suppose we have this common piece of code:

list = []
some_kind_of_loop {
    ...
    list.push(my_new_item)
    ...
}

In this case it would advantageous to express the concept of building a list with a special syntax so it would be obvious from a first glance what’s happening. Since array literal already had [1,2,3,...] syntax, the new building-a-list syntax would be [ something here ]. And that’s how we get list comprehensions in Python (also in other languages). The idea is solid but when I see it implemented, it’s very easy for me to realize that I don’t like the syntax. List comprehension syntax is the preferred way to construct lists in Python and seems like it’s being widely used. I’d like to make clear that I do like Python and list comprehensions is one of the very few things I’m not fond of.

So what NGS has for building lists? Let’s see:

# Python
[x*2 for x in range(10) if x > 5]

# NGS straightforward translation
collector
  for(x;10)
    if x > 5
      collect(x * 2)

Doesn’t look like much of an improvement. That’s because straightforward translations are not representative. The collector‘s use above is not a good example of it’s usage. More about collector later in this article.

Here are the NGS-y ways:

# Python
[x*2 for x in range(10) if x > 5]

# NGS alternatives, pick your compactness vs clarity
(0..10).filter(F(elt) elt>5).map(F(elt) elt*2)
10.filter(F(elt) elt>5).map(F(elt) elt*2)
10.filter(X>5).map(X*2)
10?(X>5)/(X*2)

NGS does not have special syntax for building lists. NGS does have special syntax for building anything and it’s called collector.

Syntax for everything

Just say “no”. It is tempting at first to make syntax for all the concepts in the language or at least for most of them. When taken to it’s extreme, we have Perl and APL.

I don’t like Perl’s syntax, especially sigils, especially that there is more than one sigil, (unlike $ in bash). Note that Perl kind of admitted sigils were used inconsistently (or confusingly? not sure) so sigils syntax was partially changed in Perl 6. On the other hand, Perl 6 added sigil modifiers, called “twigil“s. I counted 9 of them.

APL has many mathematical symbols in it’s syntax. Many of them are not on a keyboard. While the language is very terse and expressive, I don’t like the syntax and the fact that typing in a program in APL is not a straightforward task. [Update: Clarification: I’m not familiar with APL as opposed to other languages mentioned in this article. I was pointed out that I was quick to judge APL. It does have mathematical symbols but it actually doesn’t have much syntax. Most of the symbols are just built-in functions and operators. I’ve also found out one symbol for “go to line” and one for “define function”/”end function”. Actually APL syntax rules are interesting “think different” approach as they are implementation of mathematical notation. Thanks for the link, geocar.]

I do agree with Python’s developers reluctant attitude towards adding new syntax, especially new characters. Perl appears to be on the other end of the spectrum.

I’d like to come back to the collector syntax and show that even when adding a syntax, it’s advantageous to add less of it and reuse existing language concepts if possible.

The initial thought was to base the collector syntax on Lisp’s (loop …) macro , specifically the macro’s collect keyword: (loop ... (collect ...) ...). I was about do add two new keywords: collector and collect. The code would look like this:

mylist = collector
  for(x;10)
    if x > 5
      collect x * 2

I’ve started considering implementation strategies and came up with the idea that it would be simpler to have one keyword only, collector. collector wraps single expression to the right (which can be a { code block }) in a function with one argument named collect. The code becomes this:

mylist = collector
  for(x;10)
    if x > 5
      collect(x * 2)

When the collector expression is evaluated, the wrapped code runs with provided collect function. When collect is a function and not a special syntax it allows to use all the facilities that work with functions. Example: somelist.filter(...).each(collect) .

There is more to collector! It has optional initialization argument. Let’s see it in the following example:

collector/0
    arr.each(F(elt) {
        if predicate(elt)
            collect(1)
    })

The code above counts number of elements in arr which satisfy the predicate. That’s roughly how standard library’s count() is implemented.

When collector is not followed by / (slash) and initialization value, the default initialization argument is newly created empty list. The default initial value was chosen to be empty list as it appears that most uses of collector

Advantages of the implementation where collect is a function and not a keyword:

  1. Simpler implementation
  2. Collector behaviour customization – there is a way to provide your own collect function for your types which I will not describe here. Standard library defines such functions for array, hash and integer types.
  3. Functional facilities available for the collect function
  4. collect function can have more than one argument making it possible to collect key-value pairs if constructing a Hash: collector ... collect(k, v) ... (this is from filter() implementation for Hash in standard library).

I consider collector to be a good example of “less syntax is better”.

Easier, not easy

I do enjoy the situation that I have many other languages to look at for ideas, good or bad.

NGS needs original features for it’s domain (systems administration tasks niche). These domain-specific features make NGS stand out. Other languages don’t have them because they are either not domain-specific or have wider domain or are too old to have features for the today’s systems administration domain. Every original feature has a risk attached to it. If you haven’t seen it work before there is practically no way to predict how good or bad a feature will turn out. More about original features in NGS in another post.

While creating a language is now easier than before it’s still not an easy task 🙂 … but you can help. Fork, code, make a pull request.


Have a nice weekend!

Update: discussion on reddit

Bashing bash – unexceptional

This is the third post in the “bashing bash” series. The aim of the series is to highlight problems in bash and convince systems and software engineers to help building a better alternative.

The problem

What is the output of the following script?

#!/bin/bash

set -eu

echo Here
false
echo Not here

Right, the output is

Here

Imagine that instead of false you have a lot of code. Since there are no exceptions, you have no idea where the error occurred.

inspector-160143_640

Solutions using bash:

  1. Use set -x to trace the code.
  2. Add echo something every here and there to know between which two echo‘s the error occurred.
  3. Catch the error using trap and print the line number as suggested on StackOverflow . Writing this additional catching snippet at the top of every script is not really convenient.

This problem of unclear error location is unimaginable in any normal programming language.

“There is no problem, just don’t do it”

Bash was not intended to be a “normal” programming language. Some people say it’s an abuse of bash to use it as such. Looking at the code written in Bash I can tell it really is an abuse in many cases.

The reality though is that bash is still (ab)used for programming. In some cases Bash has positive aspects which outweigh the need to use other languages. In other cases a program starts as a small Bash script and is just not rewritten in another language after the script grows.

I suggest making a better shell rather than convincing people not to abuse Bash. People will keep on doing what they are doing. Let’s make their lives easier by providing them with a better shell.

The suggested solution

Use NGS. In NGS, any failed process throws an exception. Let’s take a look at the script below

#!/usr/bin/env ngs
echo Here
false
echo Not here

What’s the output?

Here
Not here

WAT?

Well, actually false returning an exit code of 1 is not an exception, it’s normal. If any command returning non-zero code would cause an exception, you wouldn’t be able to write for example if $(test -e myfile) do_something .

Failed process is a process that returns an unexpected exit code. Here is the part of stdlib that defines what’s a fail and what’s not:

F finished_ok(p:Process) p.exit_code == 0

F finished_ok(p:Process) {
    guard p.executable.path == '/bin/false'
    p.exit_code == 1
}

F finished_ok(p:Process) {
    guard p.executable.path == '/usr/bin/test'
    p.exit_code in [0, 1]
}

Such definitions also mean that you can easily extend NGS to work properly with any other command, simply by adding another finished_ok function. (Or add it to stdlib if it’s a common command so everyone would benefit).

So where are the exceptions?

We’ll have to modify the code to get an unexpected exit code. Example:

#!/usr/bin/env ngs
echo Here
ls nosuchfile
echo Not here

Output:

Here
ls: cannot access 'nosuchfile': No such file or directory
========= Uncaught exception of type 'ProcessFailed' =========
====== Exception of type 'ProcessFailed' ======
=== [ backtrace ] ===
[Frame #0] /etc/ngs/bootstrap.ngs:158:1 - 158:10 [in <anonymous>]
[Frame #1] /etc/ngs/bootstrap.ngs:154:17 - 154:29 [in bootstrap]
[Frame #2] ./2.ngs:3:4 - 3:14 [in <anonymous>]
[Frame #3] /usr/share/ngs/stdlib.ngs:1116:11 - 1116:15 [in $()]
[Frame #4] /usr/share/ngs/stdlib.ngs:1050:29 - 1050:42 [in wait]
[Frame #5] /usr/share/ngs/stdlib.ngs:1006:7 - 1006:20 [in ProcessFailed]
=== [ dump process ] ===
(a lot of not very well formatted output with info about the process)

Please help building a better alternative

Go to https://github.com/ilyash/ngs/ and contribute some code.

Fundamental flaws of bash and its alternatives

pebbles-1335479_640

Quoting Steve Bourne from  An in-depth interview with Steve Bourne, creator of the Bourne shell, or sh:

I think you are going to see, as new environments are developed with new capabilities, scripting capabilities developed around them to make it easy to make them work.

Cloud happened since then. Unfortunately, I don’t see any shells that could be an adequate response to that. Such shell should at the very least have data structures. No, not like bash, I mean real data structures, nested, able to represent a JSON response from an API and having a sane syntax.

In many cases bash is the best tool for the job. Still, I do think that current shells are not a good fit for today’s problems I’m solving. It’s like the time has frozen for shells while everything else advanced for decades.

As a systems engineer, I feel that there is no adequate shell nor programming language exist for me to use.

Bash

Bash was designed decades ago. Nothing of what we expect from any modern programming language is not there and somehow I get the impression that it’s not expected from a shell. Looks like years changed many things around us and bash is not one of them. It changed very little.

Looks like it was designed to be an interactive shell while the language was a bit of afterthought. In practice it’s used not just as an interactive shell but as a programming language too.

What’s wrong with bash?

Sometimes I’m told that there is nothing wrong with Bash and another shell is not needed.

Even if we assume that nothing is wrong with Bash, there is nothing wrong with assembler and C languages either. Yet we have Ruby, Python, Java, Tcl, Perl, etc… . Productivity and other concerns might something to do with that I guess.

… except that there are so many things wrong with bash: syntax, error handling, completion, prompt, inadequate language, pitfalls, lack of data structures, and so on.

While jq is trying to compensate for lack of data structures can you imagine any “normal” programming language that would outsource handling of data? It’s insane.

Silently executing the rest of your code after an error by default. I don’t think this requires any further comments.

Do you really think that bash is the global maximum and we can’t do better decades later?

Over the years there were several attempts to make a better alternative to bash.

Project Focus on shell and shell UX Powerful programming language
Bash No No
Csh No No
Fish shell Yes No
Plumbum No Yes, Python
RC shell No No
sh No Yes, Python
Tclsh No Yes, Tcl
Zsh Yes No

You can take a look at more comprehensive comparison at Wikipedia.

Flaw in the alternatives

A shell or a programming language? All the alternatives I’ve seen till this day focus either on being a good interactive shell with a good UX or on using a powerful language. As you can see there is no “yes, yes” row in the table above and I’m not aware of any such project. Even if you find one, I bet it will have one of the problems I mention below.

Focusing on the language

The projects above that focus on the language choose existing languages to build on. This is understandable but wrong. Shell language was and should be a domain-specific language. If it’s not, the common tasks will be either too verbose or unnecessarily complex to express.

Some projects (not from the list above) choose bash compatible domain-specific language. I can not categorize these projects as “focused on the language” because I don’t think one can build a good language on top of bash-compatible syntax. In addition these projects did not do anything significant to make their language powerful.

Focusing on the interactive experience

Any projects that I have seen that focus on the shell and UX do neglect the language, using something inadequate instead of real, full language.

What’s not done

I haven’t seen any domain-specific language developed instead of what we have now. I mean a language designed from ground up to be used as a shell language, not just a domain-specific language that happened to be an easy-to-implement layer on top of an existing language.

Real solution

Do both good interactive experience and a good domain-specific language (not bash-compatible).

List of features I think should be included in a good shell: https://github.com/ilyash/ngs/blob/master/readme.md

Currently I’m using bash for what it’s good for and Python for the rest. A good shell would eliminate the need to use two separate tools.

The benefits of using a good shell over using one of the current shells plus a scripting language are:

Development process

With a good shell, you could start from a few commands and gradually grow your script. Today, when you start with a few commands you either rewrite everything later using some scripting language or get a big bash/zsh/… script which uses underpowered language and usually looks pretty bad.

Libraries

Same libraries available for both interactive and scripting tasks.

Error handling and interoperability

Having one language for your tasks simplifies greatly both the integration between pieces of your code and error handling.

Help needed

Please help to develop a better shell. I mean not an easy-to-implement, a good one, a shell that would make people productive and a joy to use. Contribute some code or tell your friend developers about this project.

https://github.com/ilyash/ngs/


I’m using Linux. I’m not using Windows and hope I will never have to do it. I don’t really know what’s going on there with shells and anyhow it is not very relevant to my world. I did take a brief look at Power Shell it it appears to have some good ideas.

Bashing bash – undefined variables

This is the second post in the “bashing bash” series. The aim of the series is to highlight problems in bash and convince systems and software engineers to help building a better alternative.

The problem

What does the following command do?

rm -rf $mydir/

Looks like the author would like to delete $mydir directory and everything in it. Actually it may do unexpected things because of missing quotes. The rant about quotes is in the previous post. This post is about yet another issue.

The correct commands should be:

set -u
...
rm -rf "$mydir/"

The important thing here is set -u . Without it, when $mydir is undefined for some reason, such as a bug in code preceding the rm command, there is a chance to brick the machine because an undefined variable becomes an empty string so the command is silently expanded to

rm -rf /

800px-Brick

While more experienced engineers will usually use set -eu at the beginning of the script, omitting this declaration is a big trap for others.

Side note. You could ask why the original command has a trailing slash. The trailing slash is common and is used to signify a directory. While to the best of my knowledge the rm should work the same without the slash, some commands are actually more correct with trailing slash. For example cp myfile mydir/ would copy the file into the directory if it exists and would cause error if it doesn’t. On the other hand,  cp myfile mydir would behave the same if directory exists but would create a mydir file if there is no such directory nor file, which was not intended. Other commands such as rsync also behave differently with and without the slash. So it is common to use the slash.

See also: http://www.tldp.org/LDP/abs/html/options.html – bash options

The suggested solution

In NGS, any use of an undefined variable is an exception.

ngs -e 'echo(a)'

It’s going to look prettier but even in current implementation you have all the information about what happened:

========= Uncaught exception of type 'GlobalNotFound' =========
====== Exception of type 'GlobalNotFound' ======
=== [ dump name ] ===
* string(len=1) a
=== [ dump index ] ===
* int 321
=== [ backtrace ] ===
[Frame #0] /etc/ngs/bootstrap.ngs:156:1 - 156:10 [in <anonymous>]
[Frame #1] /etc/ngs/bootstrap.ngs:152:17 - 152:29 [in bootstrap]
[Frame #2] <command line -e switch>:1:8 - 1:9 [in <anonymous>]
...

While bash options probably have historical justification, a new language should not have such a mechanism. It complicates things a lot. In addition, the programmer should always be aware what are the current options.

Please help building a better alternative

Go to https://github.com/ilyash/ngs/ and contribute some code.