December 2022 – Ilya's blog

Following is my opinion of the interactive mode of the Unix Shell. The interactive mode of the Unix shell is not much different from using Telegraph. I’ll substantiate the claim and point out what went wrong.

History

Short history of relevant inventions and events follows.

Telegraph – 1840s

Telegraph is a “point-to-point text messaging system”

Teleprinter – 1887

Teleprinter improved over telegraph’s user interface by adding a keyboard. Essentially it’s still a text messaging system.

“Computers used teleprinters for input and output from the early days of computing.”

Computer Terminal with VDU – 1950s

Computer Terminal with a “video display unit (VDU)” improved over teleprinter by replacing the printer with a screen.

( Summary Till this Point )

We see incremental progress in transmitting text. First between humans and later between a human and computer. The core concept did not change over time. Send text, receive text in response.

VT 52 – 1974/1975

VT 52 is released. It’s a pretty big deal because it supports cursor movement. Text on the screen can therefore be updated. In more general terms, it allowed interaction with what’s on the screen.

Bill Joy – vi – 1976

Bill Joy released vi in 1976. He figured that since cursor movement is supported he can make an editor that actually interactively uses the whole screen. That’s in contrast to the ex editor which didn’t.

What Went Wrong with the Shell?

The shell never caught up with the idea of interactivity on the screen. It’s still a “point-to-point text messaging system”. The paradigm shift never happened. Treating the received text as if it was printed on paper puts the idea of interacting with the output completely out of the realm of possibilities.

The “interactive” shell is not that much interactive. It is only “interactive” compared to “batch”. If you take a 24 lines terminal, considering that all interaction happens on one line (the “command line” I guess), that’s about 4% interactivity by line count.

Practical Consequences

Copy + Paste

Does the following sequence of operations happen to you when working in a shell?

Type and run a command
Start typing a second command
Copy something from the output of the first command
Paste that as an argument to the second command

I know that it does happen to me quite often. Why can’t the shell (tab) complete from the output of the first command? Because the shell has no idea (1) what’s in that output (2) what’s the semantic meaning of what’s in that output and how to use it for completion.

Imagine for a moment a website where you see a list of items but if you want more information about one of them, you need to copy its name and paste somewhere else, adding a word or two. Sounds weird? How is this OK in a shell then?

For the “shell is not supposed to do that” camp: you are welcome to continue to use Nano, Pico and Notepad for programming because “text editor is not supposed to do that”; I’ll use an IDE because it’s more productive.

Each Command on its own

Typically a shell user tries to achieve a goal. For that, typically a series of related commands. The Unix shell doesn’t know or care about that.

Let’s say you issued ls *.txt and you see the file that you were interested in in the output. You can’t interact with it despite being on the screen right in front of you so you proceed. Now you typed cp and pressed tab for completion. The completion will happily try to complete the names of all the files in the directory, not only *.txt which are way more likely to be meant by the user. There isn’t be any relation between sequentially issued commands in the Unix shell. Want composition? OK, use pipe or start scripting.

I might expand this article in the future.

Background

I’m designing and implementing Next Generation Shell, a programming language (and a shell) for “DevOps” tasks (read: running external commands and data manipulation are frequent).

I came across a programming pattern (let’s call it P) as follows:

An object is created
Some operations are performed on the object
The object is returned from a function (less frequently – stored in a variable)

P Using Plain Approach

The typical code for P looks in NGS like the following:

F my_func() {
  my_obj = MyType()
  my_obj.name = "blah"
  my_obj.my_method(...)
  my_obj  # last expression is evaluated and returned from my_func()
}

The above looks repetitive and not very elegant. Given the frequency of the pattern, I think it deserves some attention.

Attempt 1 – set()

In simpler but pretty common case when only assignment to fields is required after creating the object, one could use set() in NGS:

F my_func() {
  MyType().set(name = "blah")
}

or, for multiple fields:

F my_func() {
  MyType().set(
    name = "blah"
    field2 = 100
    field3 = "you get the idea"
  )
}

Side note: parameters to methods can be separated by commas or new lines, like in the example above.

I feel quite OK with the above but the cons are:

Calling a method is not supported (unless that method returns the original object, in which case one could MyType().set(...).my_method())
Setting of fields can not be interleaved in a straightforward manner with arbitrary code (for example to calculate the fields’ values)

Attempt 2 – tap()

I’m familiar with tap() from Ruby. It looked quite useful so NGS also had tap() for quite a while. Here is how P would look like in NGS when implemented with tap():


F my_func() {
  MyType().tap({
    A.name = "blah"
    A.my_method()
  })
}

Tap takes an arbitrary value, runs the given callback (passing that value as the only argument) and returns the original value. It is pretty flexible.

Can’t put my finger on what’s exactly is bothering me here but the fact is that I was not using tap() to implement P.

Attempt 3 – expr::{ … }

This one is very similar to tap() but it is syntactically distinct from tap.

F my_func() {
  MyType()::{
    A.name = "blah"
    # arbitrary code here
    A.my_method()
  }
}

I think the main advantage is that P is easily visually distinguishable. For example, if you only want to know the type of the expression returned, you can relatively easy skip everything between ::{ and } . Secondary advantage is that it’s a slightly less cluttered than tap().

Let’s get into the details of how the above works.

Syntax

MyType() in our case is an expression. Happens to be a method call which returns a new object.
:: – namespace field access operator. Typical use case is my_namespace::my_field.
{ ... } – anonymous function syntax. Equivalent to a function with three optional parameters (A, B, and C, all default to null).

Note that all three syntax elements above are not unique to this combination. Each one of them is being used in other circumstances too.

Up until recently, the :: syntax was not allowing anonymous function as the second argument. That went against NGS design: all methods should be able to handle as many types of arguments as possible. Certainly limiting arguments’ types syntactically was wrong for NGS.

Semantics

In NGS, any operator is transformed to a method call. :: is no exception. When e1::e2 is encountered, it is translated into a call to method :: with two arguments: e1 and e2.

NGS relies heavily on multiple dispatch. Let’s look at the appropriate definition of the :: method from the standard library:

F '::'(x, f:Fun) {
  f(x)
  x
}

Not surprisingly, the definition above is exactly like the definition of F tap() ... (sans method and parameters naming).

Examples of expr::{ … } from the Standard Library

# 1. Data is an array. Each element is augmented with _Region field.
data = cb(r)::{
  A._Region = ConstIter(r)
}


# 2. push() returns the original object, which is modified in { ... }
F push(s:Set, v) s::{ A.val[v] = true }


# 3. each() returns the original object.
# Since each() in { ... } would return the keys() and not the Set,
# we are working around that with s::{...}
F each(s:Set, cb:Fun) s::{ A.val.keys().each(cb) }


# 4. Return what c_kill() returns unless it's an error
F kill(pid:Int, sig:Int=SIGNALS.TERM) {
  c_kill(pid, sig)::{
    A == -1 throws KillFail("Failed to kill pid $pid with signal $sig")
    A != 0 throws Error("c_kill() did not return 0 or -1")
  }
}

Side note: the comments are for this post, standard library has more meaningful, higher level comments.

A Brother Looking for Use Cases

While changing syntax to allow anonymous function after ::, another change was also made: allow anonymous function after . so that one could write expr.{ my arbitrary code } . The whole expression returns what the arbitrary code returns. Unfortunately, I did not come across (or maybe haven’t noticed) real use cases. The appropriate . method in the standard library is defined as follows:

F .(x, f:Fun) f(x)

# Allows
echo(5.{ A * 2 })  # 10

Have any use cases which look less stupid than the above? Let me know.

Ilya's blog

Systems and software engineering

Month: December 2022

Telegraph and the Unix Shell