Event Loop for Beginners

The aim of the post is to give a simple, concise, and high level description of the event loop. I’m intentionally leaving out many details and being somewhat imprecise.

If you need detailed picture, follow the links in this post. In this case, I recommend reading this short post first.

Why Event Loop?

To deal with concurrency (multiple “things” need to be happening at what looks like the same time), there are few models. To discuss the models, we need to know what a thread is.

Roughly, a thread is a sequence of computing instructions that runs on a single CPU core.

Roughly, Event Loop manages on which tasks the thread works and in which order.

Queue

The queue contains what’s called in different sources messages, events, and tasks.

Message/event/task is a reference to a piece of code that needs to be scheduled to run. Example: “the code responsible for handling the click of button B + full details of the click event to pass to the code”.

Event Loop

  1. The queue is checked for new tasks. If there is none, we wait for one to appear.
  2. The first task in the queue is scheduled and starts running. The code runs till completion. In many languages, await keyword in the code counts as completion and everything after await is scheduled to run later – new task in the queue.
  3. Repeat from number 1. That’s the loop. It’s called Event Loop because it processes events from the queue in a loop.

Adding Events to the Queue

Tasks are added to the queue for two reasons:

  1. Something happened (user clicked on a button, network packet arrived, etc).
  2. The code that was running in step 2 of the event loop scheduled something to run later.

See Also

  1. Event Loop documentation at MDN.
  2. What is the difference between concurrency and parallelism? at StackOverflow

Hope this helps with high level understanding of Event Loop. Have a nice day!

Arguments and Parameters

These two words are used interchangeably. Please don’t. They mean different things. Here is my concise explanation.

Argument

A value passed into a function/method during invocation.

my_func(arg1, arg2)

Additional names for “argument” are “actual argument” and “actual parameter”.

Parameter

A name of a variable in the function/method definition. During invocation, the variable is used in the function/method body to refer to the value of the passed argument.

F my_func(param1, param2) {
  ...
  # Using param1 and param2 for a computation
  ...
}

Additional name for “parameter” is “formal argument”.

Tip – Parametrize

If you struggle to remember which one is which, this might help: when you “parameterize” a piece of code, you add parameters to the code. Then you have the code with the parameter used in it, with the first occurrence in the function/method definition.

# Initial version

echo("Hello, Joe")

# Parametrized version. "name" is a parameter.

F hello(name) {
  echo("Hello, ${name}")
}

See Also


Hope this helps! Have a nice day!


Updates after Reddit discussion:

  • I never asked the difference as an interview question. If I would:
    • Getting this wrong – tiny negative point
    • Not understanding why using correct terminology matters – big negative point
    • Understanding the difference and using these words interchangeably (knowingly incorrectly) – huge negative point
    • Providing fake facts to support your opinion that these words are interchangeable – huge negative point
  • Explaining why using correct terminology matters is out of scope of this post

The new Life of tap()

Background

I’m designing and implementing Next Generation Shell, a programming language (and a shell) for “DevOps” tasks (read: running external commands and data manipulation are frequent).

I came across a programming pattern (let’s call it P) as follows:

  1. An object is created
  2. Some operations are performed on the object
  3. The object is returned from a function (less frequently – stored in a variable)

P Using Plain Approach

The typical code for P looks in NGS like the following:

F my_func() {
  my_obj = MyType()
  my_obj.name = "blah"
  my_obj.my_method(...)
  my_obj  # last expression is evaluated and returned from my_func()
}

The above looks repetitive and not very elegant. Given the frequency of the pattern, I think it deserves some attention.

Attempt 1 – set()

In simpler but pretty common case when only assignment to fields is required after creating the object, one could use set() in NGS:

F my_func() {
  MyType().set(name = "blah")
}

or, for multiple fields:

F my_func() {
  MyType().set(
    name = "blah"
    field2 = 100
    field3 = "you get the idea"
  )
}

Side note: parameters to methods can be separated by commas or new lines, like in the example above.

I feel quite OK with the above but the cons are:

  1. Calling a method is not supported (unless that method returns the original object, in which case one could MyType().set(...).my_method())
  2. Setting of fields can not be interleaved in a straightforward manner with arbitrary code (for example to calculate the fields’ values)

Attempt 2 – tap()

I’m familiar with tap() from Ruby. It looked quite useful so NGS also had tap() for quite a while. Here is how P would look like in NGS when implemented with tap():


F my_func() {
  MyType().tap({
    A.name = "blah"
    A.my_method()
  })
}

Tap takes an arbitrary value, runs the given callback (passing that value as the only argument) and returns the original value. It is pretty flexible.

Can’t put my finger on what’s exactly is bothering me here but the fact is that I was not using tap() to implement P.

Attempt 3 – expr::{ … }

New Life of tap()

This one is very similar to tap() but it is syntactically distinct from tap.

F my_func() {
  MyType()::{
    A.name = "blah"
    # arbitrary code here
    A.my_method()
  }
}

I think the main advantage is that P is easily visually distinguishable. For example, if you only want to know the type of the expression returned, you can relatively easy skip everything between ::{ and } . Secondary advantage is that it’s a slightly less cluttered than tap().

Let’s get into the details of how the above works.

Syntax

  1. MyType() in our case is an expression. Happens to be a method call which returns a new object.
  2. :: – namespace field access operator. Typical use case is my_namespace::my_field.
  3. { ... } – anonymous function syntax. Equivalent to a function with three optional parameters (A, B, and C, all default to null).

Note that all three syntax elements above are not unique to this combination. Each one of them is being used in other circumstances too.

Up until recently, the :: syntax was not allowing anonymous function as the second argument. That went against NGS design: all methods should be able to handle as many types of arguments as possible. Certainly limiting arguments’ types syntactically was wrong for NGS.

Semantics

In NGS, any operator is transformed to a method call. :: is no exception. When e1::e2 is encountered, it is translated into a call to method :: with two arguments: e1 and e2.

NGS relies heavily on multiple dispatch. Let’s look at the appropriate definition of the :: method from the standard library:

F '::'(x, f:Fun) {
  f(x)
  x
}

Not surprisingly, the definition above is exactly like the definition of F tap() ... (sans method and parameters naming).

Examples of expr::{ … } from the Standard Library

# 1. Data is an array. Each element is augmented with _Region field.
data = cb(r)::{
  A._Region = ConstIter(r)
}


# 2. push() returns the original object, which is modified in { ... }
F push(s:Set, v) s::{ A.val[v] = true }


# 3. each() returns the original object.
# Since each() in { ... } would return the keys() and not the Set,
# we are working around that with s::{...}
F each(s:Set, cb:Fun) s::{ A.val.keys().each(cb) }


# 4. Return what c_kill() returns unless it's an error
F kill(pid:Int, sig:Int=SIGNALS.TERM) {
  c_kill(pid, sig)::{
    A == -1 throws KillFail("Failed to kill pid $pid with signal $sig")
    A != 0 throws Error("c_kill() did not return 0 or -1")
  }
}

Side note: the comments are for this post, standard library has more meaningful, higher level comments.

A Brother Looking for Use Cases

While changing syntax to allow anonymous function after ::, another change was also made: allow anonymous function after . so that one could write expr.{ my arbitrary code } . The whole expression returns what the arbitrary code returns. Unfortunately, I did not come across (or maybe haven’t noticed) real use cases. The appropriate . method in the standard library is defined as follows:

F .(x, f:Fun) f(x)

# Allows
echo(5.{ A * 2 })  # 10

Have any use cases which look less stupid than the above? Let me know.

The Pseudo Narrow Waist in Unix

Background

This is a pain-driven response to post about Narrow Waist of Unix Architecture. If you have the time, please read that post.

The (very simplified and rough) TL;DR of the above link:

  1. The Internet has “Narrow Waist”, the IP protocol. Anything that is above that layer (TCP, HTTP, etc), does not need to be concerned with lower level protocols. Each piece of software therefore does not need to concern itself with any specifics of how the data is transferred.
  2. Unix has “Narrow Waist” which is text-based formats. You have a plethora of tools that work with text. On one side of of Narrow Waist we have different formats, on another side text manipulating tools.

I agree with both points. I disagree with implied greatness of the Unix “design” in this regard. I got the impression that my thoughts in this post are likely to be addressed by next oilshell blog posts but nevertheless…

Formats

Like hierarchy of types, we have hierarchy formats. Bytes is the lowest level.

Bytes

Everything in Unix is Bytes. Like in programming languages, if you know the base type, you have a certain set of operations available to you. In case of Bytes in Unix, that would be cp, zip, rsync, dd, xxd and quite a few others.

Text

A sub-type (a more specific type) of Bytes would be Text. Again, like in a programming language, if you know that your are dealing with data of a more specific type, you have more operations available to you. In case of Text in Unix it would be: wc, tr, sed, grep, diff, patch, text editors, etc.

X

For the purposes of this discussion X is a sub-type of Text. CSV or JSON or a program text, etc.

Is JSON a sub-type of Text? Yes, in the same sense that a cell phone is a communication device, a cow is an animal, and a car is a transportation device. Exercise to the reader: are this useful abstractions?

Cow is an animal

The Text Hell

The typical Unix shell approach for working with X are the following steps:

  1. Use Text tools (because they are there and you are proficient wielder)
  2. One of:
    1. Add a bunch of fragile code to bring Text tools to level where they understand enough of X (in some cases despite existing command line tools that deal specifically with X)
    2. Write your own set of tools to deal with the relevant subset of X that you have.
  3. Optional but likely: suffer fixing and extending number 2 for each new “corner case”.

The exception here are tools like jq and jc which continue gaining in popularity (for a good reason in my opinion). Yes, I am happy to see declining number of “use sed” recommendations when dealing with JSON or XML.

Interestingly enough, if a programmer would perform the above mentioned atrocities in almost any programming language today, that person would be pointed out that it’s not the way and libraries should be used and “stop using square peg for round hole”. After few times of unjustified repetition of the same offense, that person should be fired.

Square peg / round hole

Somehow this archaic “Unix is great, we love POSIX, we love Text” approach is still acceptable…

Pipes Text Hell

  1. Create a pipe between different programs (text output becomes text input of the next program)
  2. Use a bunch of fragile code to transform between what first program produces and the second one consumes.

Where Text Abstraction is not Useful

Everywhere almost. In order to do some of the most meaningful/high-level operations on the data, you can’t ignore it’s X and just work like it is Text.

Editing

The original post says that since the format is Text, you can use vim to edit it. Yes you can… but did you notice that any self respecting text editor comes with plugins for various X’s? Why is that? Because even the amount of useful “text editing” is limited when all you know you are dealing with Text. You need plugins for semantic understanding of X in order to be more productive.

Wanna edit CSV in a text editor without CSV plugin? OK. I prefer spreadsheet software though.

Have you noticed that most developers use IDEs that “understand” the code and not Notepad?

Lines Count

Simple, right? wc -l my.csv. Do you know the embedded text in quotes does not have newlines? Oops. Does it have header line? Oops.

Text Replacement

Want to try to rename a method in a Java program? sed -i 's/my_method/our_method/g' *.java, right? Well, depends on your luck. I would highly recommend to do such kind of refactoring using an IDE that actually understands Java so that you rename: only specific method in a specific class as opposed to unfortunately named methods and variables, not to mention arbitrary strings.

Search / Indexing

Yep… except that understanding of the semantics helps here quite a bit. That’s why you have utilities which understand specific programming languages that do the indexing.

Conclusion

I do not understand the fascination with text. Still waiting for any convincing arguments why is it so “great” and why the interoperability that it provides is not largely a myth. Having a set of tools enabling one to do subpar job each time is better than not having them but is it the best we can?

My previous dream of eradicating text where it does not make sense (my blog post from 2009) came true with HTTP/2. Apparently I’m not alone in this regard.

Sorry if anything here was harsh. It’s years of pain.

Clarification – Layering

Added: 2022-02-07 (answering, I hope, https://www.reddit.com/r/ProgrammingLanguages/comments/t2bmf2/comment/hzm7n44/)

Layering in case of IP protocol works just fine. Implementer of HTTP server really does not care about the low level transport details such as Ethernet. Also the low level drivers don’t care which exactly data they deliver. Both sides of the Waist don’t care about each other. This works great!

My claim is that in case of the Text Narrow Waist, where X is on one hand of and the Text tools are on the other, there are two options:

  1. Tools ignore X and you have very limited functionality you get out of the tools.
  2. Tools know about X but then it’s “leaky abstraction” and not exactly a Narrow Waist.

That’s why I think that in case of Text, the Narrow Waist is more of an illusion.


Have a nice week!

On Accidental Serialization Formats

Let’s talk about the “just separate with comma and stick it into one field” type of serialization.

You had two strings (abc and def) and you joined them with a separator. What do you have now? One string with two elements, right? Right, abc,def. Well… two or more actually, depending on how many times the chosen separator occurred in the original strings: if they were a,bc and def, you’ve got a,bc,def, which is 3 elements according to our format. Oops. Leaving out the question whether leading and trailing spaces are significant.

Wanna add escaping for the separator then? a,bc and def are now serialized as a\,bc,def. Now the parsing became more complex. You can’t just split the string by the separator (you would get 3 elements: a\ and bc and def. You need to scan the serialized data, considering escaping when splitting. You also need to remove the escaping character from the result. How about escaping the escape character? If original data is a\bc, it is serialized as a\\bc). Yet another something not to forget.

Don’t like escaping then? How about encoding like in URL? a,bc becomes a%2Cbc. You can now once again split the string by the separator character… assuming it was encoded. Which characters you encode anyway? If you encode all ASCII characters, the result is 3 times the original and is completely unreadable. It least you are “safe” with regards to separator now, it is encoded for sure so no split problems. You have to add a decoding routine now though.

If your serialized thing goes into a database, consider how indexing would work. It probably won’t. Maybe you should model your domain properly in the database and not serialize at all. Hint: if the values ever need to be treated differently/separately by the database, they go into different cells/rows/columns/fields, not one. There are very rare exceptions. Notable exception is the ability of databases to handle JSON fields (examples: MySQL, PostgreSQL). Note that this capability can fit or not fit your use case.

Want to satisfy your artistic needs and do something clever about the serialization? Do it at home then please. Don’t waste time that your colleagues could use on something more productive than dealing with your custom format.

Strong advice: don’t do custom serialization format, use existing serialization formats and libraries.


Seen something to add to the above? Leave a comment!

Failed Stealing from Python

I made a mistake. Hope you will learn something from it.

Mental Shortcuts

Heuristic, tl;dr for the purposes of this article – mental shortcut. The brain chooses to do less thinking to save energy. It relies on simple rules to get the result. The result is correct… some times.

I took a mental shortcut when working on my own programming language, Next Generation Shell. It was a mistake.

Additionally, I have ignored the uneasy but very vague feeling that what I’m doing is not 100% correct. From my experience I knew I shouldn’t ignore it but I still did it. Another mistake.

I “thought”

Below are heuristics that led to the wrong decision.

Copying features from popular languages is pretty “safe”. After all, “everybody” is using the language and it’s OK. Social proof. Wrong. Everybody does mistakes. Popular languages have design issues too.

It’s OK to copy a feature because it’s very basic aspect of a language. Nope. Python messed up arguments passing. And I copied that mess.

The Fix

Python 3.8 has the fix. I have mixed feelings about it. Still not sure how I should fix it in NGS.

Takeaway

Beware of mental shortcuts. There are situations where these are not acceptable. The tricky part is to detect that you are using a mental shortcut in a situation where it’s not appropriate. I hope that with awareness and practice we can do it.

Also note that your $job is most likely paying you to not take mental shortcuts.

“But it works”

TL;DR – this is not nearly good enough in most cases and it’s only small fraction of what you are paid for.

I want this post to be the canonical place to refer people to who say “but it works” because people who explain why this is not OK are tired of repeating the same arguments, me included.

You are paid for …

Following is not an exhaustive list but it should give you some perspective which is opposite from the narrow-minded “but it works”.

Typically, your Software Engineering $Job pays you for:

  1. Of course the thing must work. But also..
  2. It should continue working
    1. Gives deprecation warnings? Probably not good.
    2. Only runs on Node.js v10 LTS which is end of life in less than a year (as of writing)? Think again.
    3. Got away with invalid XML? Can you be sure that the next version of parser won’t be stricter?
  3. It should be maintainable (aka you and other people should find it easy to operate and modify, now and years later)
    1. Code quality
    2. Tests (if you don’t have tests, even your basic claim that something “works” is under suspicion)
    3. Documentation
      1. How to use your sh*t?
      2. How to set up the development environment?
      3. Decisions
      4. Non-obvious code parts
  4. It should be production ready, not abstract “works” or even worse “works on my machine”
    1. Logs
    2. Metrics
    3. Tested in dev/qa/whatever-you-call it environment
    4. Reproducible – tomorrow they make a new environment, “qa42”, in a different AWS account in a different region. Could somebody else deploy your sh*t there without talking to you?
    5. Update 2020-09-13 (from Guy Egozy) – Scalable enough to be used in production.

If you claim that you are “done” because “it works”, congratulations, you have a (probably) working prototype. That’s typically small part of a project.


Related term – “Tactical Tornado”, look this up.


Update 2020-09-13: Reddit discussion

Python 3.8 Makes me Sad Again

Looking at some “exciting” features landing in Python 3.8, I’m still disappointed and frustrated by the language… like by quite a few other languages.

As an author of another programming language, I can’t stop thinking about how things “should have been done” from my perspective. I want to be explicit here. My perspective is biased towards correctness and “WTF are you doing?”. Therefore, take everything here with a appropriate amount of salt.

Yes, not talking about any “positive” changes here.

Assignment Expressions

There is new syntax := that assigns values to variables as part of a larger expression.

A fix which couldn’t be the best because of previous design decision.

“Somebody” ignored the wisdom of Lisp, which was “everything is an expression and evaluates to a value” (no statements vs expressions), and made assignment a statement in Python years ago. Now this can not be fixed in a straightforward manner. It must be another syntax. Two different syntaxes for almost the same thing which is = for assignment as a statement and := for expression assignment.

Positional-only Parameters

There is a new function parameter syntax / to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments:

def f(a, b, /, c, d, *, e, f):
    print(a, b, c, d, e, f)

Trying to clean up a mess created by mixing positional and named parameters. Unfortunately I did not give it enough thought at the time and copied parameters handling behaviour from Python. Now NGS also has the same problem as Python had before 3.8. Hopefully, I will be able to fix it in some more elegant way than Python did.

LRU cache

functools.lru_cache() can now be used as a straight decorator rather than as a function returning a decorator. So both of these are now supported

OK. Bug fix. But … (functools.py)

    if isinstance(maxsize, int):
        # Negative maxsize is treated as 0
        if maxsize < 0:
            maxsize = 0

If you are setting LRU cache size to a negative number, it’s 99% by mistake. In NGS that would be an exception. That’s the approach that causes rm -rf $myfolder/ to remove / when myfolder is unset. Note that the maxsize code is not new but it’s still there in Python 3.8. I guess that is another mistake which can not be easily fixed now because that would break “working” code.

Collections

The _asdict() method for collections.namedtuple() now returns a dict instead of a collections.OrderedDict. This works because regular dicts have guaranteed ordering since Python 3.7

OK. Everybody had the mistake of making maps unordered: Perl, Ruby, Python.

  1. Ruby fixed that with the release of version 1.9 in 2008 (according to the post).
  2. Python fixed that with the release of version 3.7 in 2018 (which I take as 10 years of “f*ck you, the developer”).
  3. Perl keeps using unordered maps according to documentation.
  4. Same for Raku, again according to the documentation.

NGS had ordered maps from the start but that’s not a fair comparison because NGS project started in 2013, when the mistake was already understood.


How all that helps you, the reader? I encourage deeper thinking about the choice of programming languages that you use. From my perspective, all languages suck, while NGS aims to suck less than the rest for the intended use cases (tl;dr – for DevOps scripting).


Update 2020-08-16

Discussions:

  1. https://news.ycombinator.com/item?id=24176823
  2. https://lobste.rs/s/rgcgjz/python_3_8_makes_me_sad_again

Update 2020-08-17

It looks like the article above needs some clarification about my perspective: background, what I am doing and why.

TL;DR

The main points of the article are:

  1. Everything still sucks, including Python. By sucks I mean does not fit well with the tasks I need to do neither aligned with how I think about these tasks.
  2. I am trying to help the situation and the industry by developing my own programming language

Background about my Thinking

In general, I’m amazed with how bad the overall state of programming is. That includes:

  1. All programming languages that I know including my own NGS. This is aggravated by inability to fix anything properly for any language with substantial amount of code written in it because you will be breaking existing code. And if you do break, you get the shitstorm like with Python 3 or Perl 6 (Raku).
  2. Code quality of the programs written in all languages. Most of the code that I have seen is bad. Sometimes even in official examples.
  3. Quality of available materials, which are sometimes plainly wrong.
  4. Many of existing “Infrastructure as code” solutions, which in most cases follow the same path:
    1. Invent a DSL or use YAML.
    2. “figure out” later that it’s not powerful enough (by the way there is an elegant solution – a programming language, forgot the name)
    3. Create pretty ugly programming language on top of a DSL that was intended for data.

I am creating new programming language and a shell out of frustration with current situation, especially with bash and Python. Why these two? Because that’s what I was and still using to get my tasks done.

Are these languages bad? I don’t think it’s a question with any good answers. These languages don’t fit the tasks that I’m trying to do nor are aligned with how I think while being apparently one of the best choices available.

This Article Background

  1. Seen some post on RSS about new features in Python 3.8.
  2. Took a look.
  3. Yep, everything is still f*cked up.
  4. Wrote a post about it which was not meant to be “deep discussion about Python flaws”.

I was not planning to invest more time in this but here I am trying to clarify.

And your Language is Better? Really?

Let’s clarify “better”. For me, it’s to suck less than the rest for the intended use cases.

author really does consider himself a superior language designer than the Python core-dev team

( From https://www.reddit.com/r/Python/comments/iartgp/python_38_makes_me_sad_again/ )

I consider myself in much easier circumstances:

  1. No substantial amount of code is written in NGS yet.
  2. I’m starting later and therefore have the advantage of looking at more languages, avoiding bad parts, copying (with adaptation) the good parts.
  3. NGS targets a niche, it’s not intended to be general purpose language. Choices are clearer and easier when you target a niche.
  4. The language that I’m creating is almost by definition is more aligned with how I think. Hoping that people out there will benefit from using NGS if it is more aligned with how they think too.
  5. See also my Creating a language is easier now (2016) post.

Will I be able to make a “better” language?

From technical perspective, that’s probable: I am a skilled programmer in several languages and I have languages to look at more than everybody else had before. My disadvantage is not much experience in language design. I’m trying to offset that with thinking hard (about the language, the essence of what is being expressed, common patterns, etc), looking at other languages and experimenting.

From marketing perspective, I need to learn a lot. I am aware that “technically better” doesn’t matter as much as I would like to. Without community and users that would be a failed project.

Also don’t forget luck which I might or might not have.

What if NGS fails?

I think that the situation today is unbearable. I’m trying to fix it. I feel like I have to, despite the odds. I hope that even if NGS fails to move the industry forward it would be useful to somebody who will attempt that later.

NGS Unique Features – the_one()

I’ve spotted the following common data access patterns.

Get the Only Element in an Array

You need to get the only element of an array as in

my_val = my_arr[0]

Additionally you want to express the assumption that there should be exactly one element in the array. In NGS it’s simple:

my_val = my_arr.the_one()

the_one() will return the only element or will throw an exception if there is not exactly one element in the given array.

Get the Only Matching Element in an Array

You have an array. Only one element should satisfy some condition. You want to access that element. Again, there is a straightforward way to express this in NGS:

my_instance = instances.the_one({"InstanceId": my_id})

The use of the_one(...) here is again about the assumption that there should be exactly one instance with the given instance id in the instances array. Exception will be thrown by the_one(...) if that is not the case.

Update 2020-08-08: As pointed out, the method is not unique to NGS.

Update 2021-02-14: FHIRPath also has somewhat similar function: single().


Happy coding and have a nice week!

“Use Dumb Shell, don’t Reinvent the Wheel”

Opening Rant

You don’t hear one developer saying “Just use Notepad” to a colleague with argumentation that goes roughly like this:

Why are you using this horrible Visual Studio Code? It has built-in debugger! No!

JetBrains IDEs? No! They do too much! They are so into the code!

Vim? Emacs? Not pure enough! Who needs that stupid syntax highlighting?

Keep text editing pure! Any semantic understanding by the text editor is undesirable, other programs should handle that. You don’t want to complicate the text editor.

Developers are not saying that because user experience and productivity matter. Yet, “Use Dumb Shell” is considered to be an acceptable opinion. Is that so common that people fall on their heads so hard (alternatively, did not give it any thought)? WTF?

The solution (shell) should be as simple as possible but not simpler than possible. Current shells are simpler than required by good user experience. Wrong trade-off. Keeping something simple is important but not more important than the outcomes.

Source: https://www.flickr.com/photos/toddle_email_newsletters/15413603567/
Image is a link to http://www.workcompass.com/

Additional food for thought:

  1. Why use a car when bicycle is so much simpler?
  2. Why use electricity when fire is so much simpler?
  3. Why have water in your house when a wells are so much simpler?

Background

I was doing consulting. The usual suspects: AWS, bash, Python, Puppet, Chef. Got to Terraform later. I had and I am still having subpar experiences with these tools. Anything I wanted to do, was overly burdensome, complicated and full of pitfalls.

Since I can’t attempt to fix everything, I picked the worst offender and started working on the alternative programming language and shell combo. The motivating opinion is that Ops have no good programming language nor adequate shell.

The absence of good programming language for ops was covered in another post. In this post I will cover some of the things that are wrong with the interactive shell.

The Shell

The dominant player is bash. It didn’t change much for decades: you type commands and get a dump of text on your screen. Most of the alternatives are essentially the same in this regard, for decades.

Is this because of the brilliant design? I would ask: which design? This? Quoting:

I wrote quite complex shell scripts and my first suggestion is “don’t”. The reason is that is fairly easy to make a small mistake that hinders your script, or even make it dangerous.

The “Dumb Shell” Approach

In this post I would like to address common thought that I hear from people regarding Next Generation Shell, a new programming language and a shell that I’m working on. Note that other shells which are more advanced than POSIX shells also get this. Quoting @cup from lobste.rs:

Wouldnt it be better to just have a dumb shell, that can call programs to do heavy lifting (read: programming languages). This way you have a “division of labor”. Shell works best for launching executables, and programming languages work best for handling data structures and algorithms.

No, it would not. I refuse to accept under-powered tools.

Dumbness is Fundamental Flaw

The “dumb shell” has no semantic understanding and doesn’t care about programs’ inputs nor outputs. Let’s see how it plays out.

Today, “Understanding” of programs’ inputs is covered by completion. Completion was added because “dumb shell” had horrible user experience. It’s slightly better now when the shell “understands” programs’ input to some degree. To some people completion is a scope creep. I think of it as better user experience and productivity gain.

“Understanding” of programs’ outputs? We are not there yet. It also seems that interacting with objects on the screen is too novel of an idea for the shell. Considering how much time this idea is out there: WTF?

Let’s see how this “dumbness” manifests as bad user experience even at the very basic, “intended” functionality:

Programs’ Output – Size

Do you know of any real world scenario when a human supposed to go over 10K lines on the screen? I mean just sit there and read it. Let me know. I’ve never seen such use case.

The shell is dumb, the shell “does not intervene” in programs’ outputs. Sounds good until you get unlimited number of lines dumped on your screen.

“Should have used less” you think later. Right. What if you forgot? The buffer is now filled with useless output and you can’t see outputs of previous programs. Are you being punished? No, just nobody cared about the UX. Alternatively, “it would be to complicated to implement”.

Programs’ Output – All Mixed

  1. Want to know what’s on your screen is stdout and what is stderr? Well… you can’t. Your shell is dumb, it doesn’t deal with things like that.
  2. Want to know from which program the output came from? Nope. Some programs cope with that to some degree by prepending their name to error messages: ls xxx gives you ls: xxx: No such file or directory. What a wonderful strategy! Keep the shell dumb and push the burden to all the programs.
  3. You can’t type because some background job is continuing to dump text on the screen where you are trying to work? Too bad, should have used redirection because guess what … you shell doesn’t handle that either… and you can’t add redirection after the program is running; again not shell’s business.

Programs’ Output – Semantic Understanding

You just typed aws ec2 describe instances --filters ... and now you have some output.

You now see on your screen instance you would like to stop. The ID of the instance is right in front of your face. Now you type aws ec2 stop-instances --instance-ids. You would like to append the instance ID that you see on the screen. Nope. Your shell doesn’t do that. Too dumb. Select with the mouse and paste, because f*ck you!

Side note: amazing AWS engineers did not include any human readable output format so you get JSON dumped on your screen (or any other format which is still non-human-compatible).

Let’s imagine for a moment that the command output had some semantic meaning to the shell.

  1. The shell would display the output as a table.
  2. The table would be interactive (interactive output, what a heresy!) and one could navigate with arrow keys and have a shortcut for copy/paste the current cell value to the command line (for completion).
  3. You could interact with the objects in the table with the mouse (very new concept, another heresy for the shell).
  4. How about instead of typing aws ec2 stop-instances --instance-ids you navigate to the correct line, press enter, choose “stop” from the menu and the command is constructed for you? aws ec2 stop-instances --instance-ids i-123... amazing, ha? Well, your shell can’t do that.

Meaning, do you speak it mo***er?

How about after performing operations using the UI you would get as per your choice one of the below snippets which would re-create the operation:

  1. CLI commands
  2. CloudFormation tempalte
  3. Terraform “code”

Solution: UI for the Shell

Suppose I agree for a second, what do you suggest?

https://github.com/ngs-lang/ngs/wiki/UI-Design

I personally don’t see how the described features could be implemented as external programs, keeping the shell “dumb”.

We Can Do Better Today

The reality has changed. What was once amazing is subpar by today’s standards. The world outside of the shell moved forward while the shell stayed almost the same. Brilliant design? Brilliant what?

Let’s move this industry together from the stone age of bash shell to the bronze age of something a bit less subpar – Next Generation Shell.

Closing Rant

Imaginary UNIX people:

We wanted to separate things because they are semantically different so we split the things into stdout and stderr. Well… stderr was is actually for everything that is not stdout.

One bit of metadata (stdout vs stderr) for semantic meaning of the output should be enough for everyone forever. Well… at least it’s simple for us to implement.


Update: discussion on lobste.rs