Naming in Software – Practical Guide

The title of the post is the title of the book that I wanted to publish for quite some time now. While I was thinking about phrasing and gathering content, somebody else beat me to it with Naming Things: The Hardest Problem in Software Engineering. The main issue that I wanted to solve is now solved. Programmers don’t have an excuse for poor naming anymore.

In light of this event, I’ve decided to make a small complementary post out of the materials that I have gathered and move on, focusing on Next Generation Shell.

Me and Naming

I have over 20 years of professional experience in programming. During that time, like many others, I’ve noticed the struggle when it comes to naming.

Here is a list of my accepted naming contributions to various projects.

  1. iterators – function shoes_in_my_size naming 2020-02-16, “The Rust Programming Language” book
  2. Constructors – Get_Contents() method is misnamed 2020-02-23, MS C++ Documentation
  3. Rename howMany() to countSelected() 2023-01, MDN
  4. nilJson naming issue in readme 2023-04, Otterize

Naming Things, the Book

I skimmed Tom’s book to understand how similar it was to what I was about to write. Quite close. If you are struggling with naming, go and read it.

There is some amount of fluff, which I think my book would have had less of. Example: convincing people that naming is important while they are already reading the book.

Overall, I do recommend the book though.

I especially recommend this book to AWS as an organization, I guess along with other books about code quality in general. AWS, your de-prioritization of code quality is staggering. I mean observable output here, not the stated “Insist on the Highest Standards”.

6.2.6 Use accurate parts of speech

Adding a negative example from AWS:

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/crpg-ref-responses.html

SUCCESS and FAILED are not the same part of speech.

7.2.3 Omit Metadata

An additional reason to exclude the data type from the name: when the type changes, you avoid changes anywhere in the code except for the declaration.

elt in items

Is items a list here or a set? Probably not important, the code should work with either. On the other hand, changing the data type from list to set in the following example will make the code incorrect (it still runs, but the name now lies about the type):

elt in items_list
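To make the point concrete, here is a minimal Python sketch. The function name contains and the sample data are made up for illustration; the only claim is that code written against a type-free name keeps working when the declaration changes.

```python
def contains(items, elt):
    # Works for any container that supports "in": list, set, tuple, ...
    return elt in items


items = ["red", "green"]            # declared as a list today
found_in_list = contains(items, "red")

items = {"red", "green"}            # later switched to a set; only the declaration changed
found_in_set = contains(items, "red")
```

Had the variable been called items_list, the second declaration would have left a misleading name behind in every place that uses it.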

8.2.4 Use similar names for similar concepts

Adding a negative example from AWS, which time after time fails to give consistent names across its APIs.

How do you limit the number of results from an API? MaxResults, maxResults, MaxRecords, MaxItems, Limit, limit, … Details at AWS API pagination naming.

It looks like consistent naming is valued less than independence of teams and ability of teams to perform uncoordinated work.

8.2.5 Use consistent antonyms

Adding an example.

When I had to name the Option type (it represents a container that can hold a value or can be empty) in Next Generation Shell, I went with straightforward antonyms.

  • Box (super type)
  • EmptyBox
  • FullBox

Authors of other languages preferred other naming conventions:

Scala: Option, None, Some

Haskell: Maybe, Nothing, Just
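For readers who think in code, here is a hypothetical Python rendering of the Box / EmptyBox / FullBox naming. This is not the NGS implementation; the val attribute and the first_even() helper are assumptions made only to show how the antonyms read at the call site.

```python
class Box:
    """Super type: a container that either holds a value or is empty."""


class EmptyBox(Box):
    def __repr__(self):
        return "EmptyBox()"


class FullBox(Box):
    def __init__(self, val):
        self.val = val

    def __repr__(self):
        return f"FullBox({self.val!r})"


def first_even(numbers):
    # Returns FullBox with the first even number, or EmptyBox if there is none
    for n in numbers:
        if n % 2 == 0:
            return FullBox(n)
    return EmptyBox()
```

The reader does not need to learn that "Some" is the opposite of "None" or that "Just" is the opposite of "Nothing"; Empty and Full already say it.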

Perspective

Information Loss

From my perspective, giving inadequate names is part of a larger issue – Information Loss. Each time you give a name, think about which information currently in your head will be helpful to the reader of the code. If you don’t phrase it concisely and precisely, information loss occurs between your head and the code you are working on. There are several common types of errors one can make:

  1. Don’t provide enough information. Causes the reader to investigate in order to recover the information.
  2. Provide wrong information. That’s the worst: it misleads the reader.
  3. Provide too much information. The reader then must sift through the information to get to the relevant parts.

API

Sometimes it’s useful to think of methods as an API. That’s why method names shouldn’t include implementation details (with rare exceptions when they are important to the caller). Think of methods’ names and parameters’ names as a short version of API specification.

Identifiers

Tom’s book deals with naming identifiers, such as functions, classes, variables, etc. One step before naming an identifier is the question whether there should be an identifier.

Avoid Naming

Sometimes, the additional cognitive load is not worth it.

# Bad
chairs = fetch_chairs()
sorted_chairs = chairs.sort()
# also, now have to use the longer identifier in the code below

# Good
chairs = fetch_chairs().sort()

Apply your judgement of course. If it’s a 20 step process, additional identifiers in the middle do contribute to understanding. You still probably don’t want an identifier for each and every step of the calculation.
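The same idea as a runnable Python sketch. The fetch_chairs() body is a stand-in I made up; note that in Python list.sort() sorts in place and returns None, so the sketch uses sorted() instead.

```python
def fetch_chairs():
    # Hypothetical data source standing in for fetch_chairs() above
    return ["stool", "armchair", "bench"]


# Extra identifier that adds little
chairs = fetch_chairs()
sorted_chairs = sorted(chairs)

# One name is enough
chairs = sorted(fetch_chairs())
```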

Do Name – Magic Numbers

Avoid magic numbers through naming. Please ignore whether result is a good name 🙂

# Bad
if result == 126 { ... }

# Good
NOT_EXECUTABLE = 126; # Or better, part of an Enum

if result == NOT_EXECUTABLE { ... }
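The "part of an Enum" variant could look like this in Python. The ExitCode class and the describe() helper are names I picked for the sketch; 126 and 127 follow the usual shell convention for "found but not executable" and "not found".

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    NOT_EXECUTABLE = 126   # shell convention: command found but not executable
    NOT_FOUND = 127        # shell convention: command not found


def describe(exit_code):
    # IntEnum members compare equal to plain integers
    if exit_code == ExitCode.NOT_EXECUTABLE:
        return "command is not executable"
    if exit_code == ExitCode.NOT_FOUND:
        return "command not found"
    return "other"
```

An Enum also groups the related values, so the reader learns which other outcomes exist without searching the code for more bare numbers.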

Do Name – Repetitive Code

If you notice code that repeats, with rare exceptions, you should refactor by extracting that code into a function or a method with a name.

Several Identifiers

Sometimes a function, a method, or a class does several things. In this case, you might struggle to name it. In a perfect world, the solution to this is refactoring into appropriate pieces.

Test Your Naming

You just named something: a function, a method or a class. Is there a change around the code that would make the name wrong? What if you copy+paste the named piece of code to another project? Would you need to change the name?

# Bad
function start_yellow_cars(cars) { ... } # The function doesn't know or care about the color
yellow_cars = ...
start_yellow_cars(yellow_cars)

# The change that would highlight the wrong naming
# while keeping the code completely functional
function start_yellow_cars(cars) { ... }
my_cars = ...
start_yellow_cars(my_cars)


# Good
function start_cars(cars) { ... }
yellow_cars = ...
start_cars(yellow_cars)

Common Naming Mistakes Observed

  1. Naming a data structure with “JSON” in name.
  2. Argument vs Parameter

Tooling

I highly recommend using IDEs that “understand” the code enough to be able to refactor/rename (classes, methods, functions, parameters) as opposed to text editors, which cannot assist with renaming to the same extent.


Hope this helps. Happy naming!

Event Loop for Beginners

The aim of the post is to give a simple, concise, and high level description of the event loop. I’m intentionally leaving out many details and being somewhat imprecise.

If you need a detailed picture, follow the links in this post. Even in that case, I recommend reading this short post first.

Why Event Loop?

To deal with concurrency (multiple “things” need to be happening at what looks like the same time), there are a few models. To discuss the models, we need to know what a thread is.

Roughly, a thread is a sequence of computing instructions that runs on a single CPU core.

Roughly, the Event Loop manages which tasks the thread works on and in which order.

Queue

The queue contains what different sources call messages, events, or tasks.

A message/event/task is a reference to a piece of code that needs to be scheduled to run. Example: “the code responsible for handling the click of button B + full details of the click event to pass to the code”.

Event Loop

  1. The queue is checked for new tasks. If there are none, we wait for one to appear.
  2. The first task in the queue is scheduled and starts running. The code runs till completion. In many languages, the await keyword in the code counts as completion and everything after await is scheduled to run later as a new task in the queue.
  3. Repeat from step 1. That’s the loop. It’s called Event Loop because it processes events from the queue in a loop.

Adding Events to the Queue

Tasks are added to the queue for two reasons:

  1. Something happened (user clicked on a button, network packet arrived, etc).
  2. The code that was running in step 2 of the event loop scheduled something to run later.
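The loop and the two ways tasks get into the queue can be sketched in a few lines of Python. This is a toy, not how any real runtime is implemented: the handler names are made up, and the loop simply stops when the queue is empty instead of waiting for the next event.

```python
from collections import deque

queue = deque()   # tasks waiting to run, oldest first
log = []          # records the order in which tasks actually ran


def on_click(button):
    log.append(f"click {button}")
    # Reason 2: the running code schedules something to run later
    queue.append((after_click, (button,)))


def after_click(button):
    log.append(f"after click {button}")


def run_event_loop():
    # A real loop would wait for new tasks instead of stopping on an empty queue
    while queue:
        task, args = queue.popleft()   # take the first task in the queue
        task(*args)                    # run it till completion


# Reason 1: something happened (two clicks arrived)
queue.append((on_click, ("A",)))
queue.append((on_click, ("B",)))
run_event_loop()
```

Both click handlers run before either follow-up task, because the follow-ups were added to the end of the queue while the clicks were already waiting in it.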

See Also

  1. Event Loop documentation at MDN.
  2. What is the difference between concurrency and parallelism? at StackOverflow

Hope this helps with high level understanding of Event Loop. Have a nice day!

Arguments and Parameters

These two words are often used interchangeably. Please don’t. They mean different things. Here is my concise explanation.

Argument

A value passed into a function/method during invocation.

my_func(arg1, arg2)

Additional names for “argument” are “actual argument” and “actual parameter”.

Parameter

A name of a variable in the function/method definition. During invocation, the variable is used in the function/method body to refer to the value of the passed argument.

F my_func(param1, param2) {
  ...
  # Using param1 and param2 for a computation
  ...
}

Additional names for “parameter” are “formal parameter” and “formal argument”.

Tip – Parametrize

If you struggle to remember which one is which, this might help: when you “parametrize” a piece of code, you add parameters to the code. Then you have the code with the parameter used in it, with the first occurrence in the function/method definition.

# Initial version

echo("Hello, Joe")

# Parametrized version. "name" is a parameter.

F hello(name) {
  echo("Hello, ${name}")
}
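The same distinction in Python, as a small sketch. The greeting parameter is an addition of mine; inspect.signature() is the standard library way to list a function's parameters, which underlines that parameters belong to the definition while arguments belong to the call.

```python
import inspect


def hello(name, greeting="Hello"):   # "name" and "greeting" are parameters
    return f"{greeting}, {name}"


# Parameters can be inspected without ever calling the function
parameter_names = list(inspect.signature(hello).parameters)

message = hello("Joe")               # "Joe" is an argument
```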


Hope this helps! Have a nice day!


Updates after Reddit discussion:

  • I have never asked about the difference as an interview question. If I did:
    • Getting this wrong – tiny negative point
    • Not understanding why using correct terminology matters – big negative point
    • Understanding the difference and using these words interchangeably (knowingly incorrectly) – huge negative point
    • Providing fake facts to support your opinion that these words are interchangeable – huge negative point
  • Explaining why using correct terminology matters is out of scope of this post

The new Life of tap()

Background

I’m designing and implementing Next Generation Shell, a programming language (and a shell) for “DevOps” tasks (read: running external commands and data manipulation are frequent).

I came across a programming pattern (let’s call it P) as follows:

  1. An object is created
  2. Some operations are performed on the object
  3. The object is returned from a function (less frequently – stored in a variable)

P Using Plain Approach

Typical code for P in NGS looks like the following:

F my_func() {
  my_obj = MyType()
  my_obj.name = "blah"
  my_obj.my_method(...)
  my_obj  # last expression is evaluated and returned from my_func()
}

The above looks repetitive and not very elegant. Given the frequency of the pattern, I think it deserves some attention.

Attempt 1 – set()

In the simpler but pretty common case where only assignment to fields is required after creating the object, one could use set() in NGS:

F my_func() {
  MyType().set(name = "blah")
}

or, for multiple fields:

F my_func() {
  MyType().set(
    name = "blah"
    field2 = 100
    field3 = "you get the idea"
  )
}

Side note: arguments to methods can be separated by commas or new lines, like in the example above.

I feel quite OK with the above but the cons are:

  1. Calling a method is not supported (unless that method returns the original object, in which case one could write MyType().set(...).my_method())
  2. Setting of fields can not be interleaved in a straightforward manner with arbitrary code (for example to calculate the fields’ values)

Attempt 2 – tap()

I’m familiar with tap() from Ruby. It looked quite useful so NGS has also had tap() for quite a while. Here is what P would look like in NGS when implemented with tap():


F my_func() {
  MyType().tap({
    A.name = "blah"
    A.my_method()
  })
}

Tap takes an arbitrary value, runs the given callback (passing that value as the only argument) and returns the original value. It is pretty flexible.

I can’t put my finger on what exactly is bothering me here, but the fact is that I was not using tap() to implement P.

Attempt 3 – expr::{ … }

New Life of tap()

This one is very similar to tap() but it is syntactically distinct from tap.

F my_func() {
  MyType()::{
    A.name = "blah"
    # arbitrary code here
    A.my_method()
  }
}

I think the main advantage is that P is easily visually distinguishable. For example, if you only want to know the type of the expression returned, you can relatively easily skip everything between ::{ and } . A secondary advantage is that it’s slightly less cluttered than tap().

Let’s get into the details of how the above works.

Syntax

  1. MyType() in our case is an expression. Happens to be a method call which returns a new object.
  2. :: – namespace field access operator. Typical use case is my_namespace::my_field.
  3. { ... } – anonymous function syntax. Equivalent to a function with three optional parameters (A, B, and C, all default to null).

Note that none of the three syntax elements above is unique to this combination. Each one of them is used in other circumstances too.

Up until recently, the :: syntax did not allow an anonymous function as the second argument. That went against NGS design: all methods should be able to handle as many types of arguments as possible. Certainly limiting arguments’ types syntactically was wrong for NGS.

Semantics

In NGS, any operator is transformed to a method call. :: is no exception. When e1::e2 is encountered, it is translated into a call to method :: with two arguments: e1 and e2.

NGS relies heavily on multiple dispatch. Let’s look at the appropriate definition of the :: method from the standard library:

F '::'(x, f:Fun) {
  f(x)
  x
}

Not surprisingly, the definition above is exactly like the definition of F tap() ... (sans method and parameters naming).
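For readers who don’t know NGS, the behaviour shared by tap() and the :: method can be sketched in Python. This is only an illustration of the semantics described above (run the callback with the value, ignore its result, return the original value); MyType, its calls counter and configure() are placeholders I invented.

```python
def tap(value, callback):
    callback(value)   # the result of the callback is ignored
    return value      # the original value is always returned


class MyType:
    def __init__(self):
        self.name = None
        self.calls = 0

    def my_method(self):
        self.calls += 1


def configure(obj):
    obj.name = "blah"
    obj.my_method()


def my_func():
    # Pattern P: create, operate on the object, return the object
    return tap(MyType(), configure)
```

Python has no counterpart to the ::{ ... } syntax, so the visual distinction that motivated the NGS change is lost here; only the semantics carry over.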

Examples of expr::{ … } from the Standard Library

# 1. Data is an array. Each element is augmented with _Region field.
data = cb(r)::{
  A._Region = ConstIter(r)
}


# 2. push() returns the original object, which is modified in { ... }
F push(s:Set, v) s::{ A.val[v] = true }


# 3. each() returns the original object.
# Since each() in { ... } would return the keys() and not the Set,
# we are working around that with s::{...}
F each(s:Set, cb:Fun) s::{ A.val.keys().each(cb) }


# 4. Return what c_kill() returns unless it's an error
F kill(pid:Int, sig:Int=SIGNALS.TERM) {
  c_kill(pid, sig)::{
    A == -1 throws KillFail("Failed to kill pid $pid with signal $sig")
    A != 0 throws Error("c_kill() did not return 0 or -1")
  }
}

Side note: the comments are for this post, standard library has more meaningful, higher level comments.

A Brother Looking for Use Cases

While changing syntax to allow anonymous function after ::, another change was also made: allow anonymous function after . so that one could write expr.{ my arbitrary code } . The whole expression returns what the arbitrary code returns. Unfortunately, I did not come across (or maybe haven’t noticed) real use cases. The appropriate . method in the standard library is defined as follows:

F .(x, f:Fun) f(x)

# Allows
echo(5.{ A * 2 })  # 10

Have any use cases which look less stupid than the above? Let me know.

The Pseudo Narrow Waist in Unix

Background

This is a pain-driven response to the post about the Narrow Waist of Unix Architecture. If you have the time, please read that post.

The (very simplified and rough) TL;DR of the above link:

  1. The Internet has “Narrow Waist”, the IP protocol. Anything that is above that layer (TCP, HTTP, etc), does not need to be concerned with lower level protocols. Each piece of software therefore does not need to concern itself with any specifics of how the data is transferred.
  2. Unix has a “Narrow Waist”, which is text-based formats. You have a plethora of tools that work with text. On one side of the Narrow Waist we have different formats, on the other side text-manipulating tools.

I agree with both points. I disagree with the implied greatness of the Unix “design” in this regard. I got the impression that my thoughts in this post are likely to be addressed by upcoming oilshell blog posts, but nevertheless…

Formats

Like a hierarchy of types, we have a hierarchy of formats. Bytes is the lowest level.

Bytes

Everything in Unix is Bytes. Like in programming languages, if you know the base type, you have a certain set of operations available to you. In case of Bytes in Unix, that would be cp, zip, rsync, dd, xxd and quite a few others.

Text

A sub-type (a more specific type) of Bytes would be Text. Again, like in a programming language, if you know that you are dealing with data of a more specific type, you have more operations available to you. In case of Text in Unix it would be: wc, tr, sed, grep, diff, patch, text editors, etc.

X

For the purposes of this discussion X is a sub-type of Text. CSV or JSON or a program text, etc.

Is JSON a sub-type of Text? Yes, in the same sense that a cell phone is a communication device, a cow is an animal, and a car is a transportation device. Exercise to the reader: are these useful abstractions?

Cow is an animal

The Text Hell

The typical Unix shell approach for working with X consists of the following steps:

  1. Use Text tools (because they are there and you are a proficient wielder)
  2. One of:
    1. Add a bunch of fragile code to bring Text tools to a level where they understand enough of X (in some cases despite existing command line tools that deal specifically with X)
    2. Write your own set of tools to deal with the relevant subset of X that you have.
  3. Optional but likely: suffer fixing and extending the results of step 2 for each new “corner case”.

The exceptions here are tools like jq and jc, which continue gaining in popularity (for a good reason in my opinion). Yes, I am happy to see a declining number of “use sed” recommendations when dealing with JSON or XML.

Interestingly enough, if a programmer performed the above-mentioned atrocities in almost any programming language today, that person would be told that this is not the way, that libraries should be used, and to “stop using a square peg for a round hole”. After a few unjustified repetitions of the same offense, that person should be fired.

Square peg / round hole

Somehow this archaic “Unix is great, we love POSIX, we love Text” approach is still acceptable…

Pipes Text Hell

  1. Create a pipe between different programs (text output becomes text input of the next program)
  2. Use a bunch of fragile code to transform between what the first program produces and what the second one consumes.

Where Text Abstraction is not Useful

Almost everywhere. In order to do some of the most meaningful/high-level operations on the data, you can’t ignore that it’s X and just work as if it were Text.

Editing

The original post says that since the format is Text, you can use vim to edit it. Yes you can… but did you notice that any self respecting text editor comes with plugins for various X’s? Why is that? Because even the amount of useful “text editing” is limited when all you know is that you are dealing with Text. You need plugins for semantic understanding of X in order to be more productive.

Wanna edit CSV in a text editor without CSV plugin? OK. I prefer spreadsheet software though.

Have you noticed that most developers use IDEs that “understand” the code and not Notepad?

Lines Count

Simple, right? wc -l my.csv. Do you know that the embedded text in quotes does not have newlines? Oops. Does it have a header line? Oops.
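Here is a small Python sketch of both "oops" cases, using made-up CSV content with one quoted field that contains a newline. Counting newline characters is what wc -l does; csv.reader is the standard library parser that actually understands the format.

```python
import csv
import io

# Hypothetical file content: a header and two records, one with an embedded newline
csv_text = 'name,comment\nJoe,"line one\nline two"\nAnn,ok\n'

line_count = csv_text.count("\n")                 # what wc -l would report

rows = list(csv.reader(io.StringIO(csv_text)))    # format-aware parsing
record_count = len(rows) - 1                      # minus the header row
```

The Text-level answer is 4 while the file holds 2 records, and nothing in the text-level tool hints that the number is wrong.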

Text Replacement

Want to try to rename a method in a Java program? sed -i 's/my_method/our_method/g' *.java, right? Well, depends on your luck. I would highly recommend doing this kind of refactoring using an IDE that actually understands Java, so that you rename only the specific method in a specific class, as opposed to unfortunately named methods and variables, not to mention arbitrary strings.
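A quick Python sketch of why the luck matters. The Java snippet is invented for the demonstration; the first replacement mirrors the sed command above, the second is the "smarter" text-level attempt with word boundaries.

```python
import re

# Made-up Java source with an unfortunately named helper and a string literal
source = '''
void my_method() {}
void my_method_helper() {}
String label = "call my_method here";
'''

# Same effect as sed 's/my_method/our_method/g'
naive = source.replace("my_method", "our_method")

# Word boundaries protect my_method_helper but not the string literal
word_bounded = re.sub(r"\bmy_method\b", "our_method", source)
```

The naive version renames my_method_helper as well. The word-bounded version leaves the helper alone but still rewrites the string literal, and neither can tell two classes' my_method apart. That is the part that needs a tool that understands Java.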

Search / Indexing

Yep… except that understanding of the semantics helps here quite a bit. That’s why you have utilities which understand specific programming languages that do the indexing.

Conclusion

I do not understand the fascination with text. Still waiting for any convincing arguments why it is so “great” and why the interoperability that it provides is not largely a myth. Having a set of tools enabling one to do a subpar job each time is better than not having them, but is it the best we can do?

My previous dream of eradicating text where it does not make sense (my blog post from 2009) came true with HTTP/2. Apparently I’m not alone in this regard.

Sorry if anything here was harsh. It’s years of pain.

Clarification – Layering

Added: 2022-02-07 (answering, I hope, https://www.reddit.com/r/ProgrammingLanguages/comments/t2bmf2/comment/hzm7n44/)

Layering in case of the IP protocol works just fine. The implementer of an HTTP server really does not care about the low level transport details such as Ethernet. Also the low level drivers don’t care exactly which data they deliver. Both sides of the Waist don’t care about each other. This works great!

My claim is that in case of the Text Narrow Waist, where X is on one side and the Text tools are on the other, there are two options:

  1. Tools ignore X and you get very limited functionality out of them.
  2. Tools know about X but then it’s a “leaky abstraction” and not exactly a Narrow Waist.

That’s why I think that in case of Text, the Narrow Waist is more of an illusion.


Have a nice week!

On Accidental Serialization Formats

Let’s talk about the “just separate with comma and stick it into one field” type of serialization.

You had two strings (abc and def) and you joined them with a separator. What do you have now? One string with two elements, right? Right, abc,def. Well… two or more actually, depending on how many times the chosen separator occurred in the original strings: if they were a,bc and def, you’ve got a,bc,def, which is 3 elements according to our format. Oops. Leaving out the question whether leading and trailing spaces are significant.
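The accidental format in a few lines of Python, as a sketch. The function names serialize and deserialize are mine; the data is the example from the paragraph above.

```python
def serialize(elements):
    return ",".join(elements)


def deserialize(text):
    return text.split(",")


ok = deserialize(serialize(["abc", "def"]))
broken = deserialize(serialize(["a,bc", "def"]))
```

Two elements went in and three came out, with no error anywhere along the way.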

Wanna add escaping for the separator then? a,bc and def are now serialized as a\,bc,def. Now the parsing became more complex. You can’t just split the string by the separator (you would get 3 elements: a\ and bc and def). You need to scan the serialized data, considering escaping when splitting. You also need to remove the escaping character from the result. How about escaping the escape character? If original data is a\bc, it is serialized as a\\bc. Yet another something not to forget.
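To show how much code the "small" fix drags in, here is a sketch of the escaping variant in Python. It is one possible implementation under my own assumptions (backslash as the escape character, a lone trailing backslash kept as-is), not a recommendation.

```python
ESC = "\\"
SEP = ","


def serialize(elements):
    # Escape the escape character first, then the separator
    return SEP.join(
        e.replace(ESC, ESC + ESC).replace(SEP, ESC + SEP) for e in elements
    )


def deserialize(text):
    elements, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == ESC and i + 1 < len(text):
            current.append(text[i + 1])   # keep the escaped character, drop the escape
            i += 2
        elif ch == SEP:
            elements.append("".join(current))   # unescaped separator ends the element
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    elements.append("".join(current))
    return elements
```

Even this version has a corner case left: an empty list and a list holding one empty string both serialize to the empty string, so they cannot be told apart when reading the data back.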

Don’t like escaping then? How about encoding like in URL? a,bc becomes a%2Cbc. You can now once again split the string by the separator character… assuming it was encoded. Which characters do you encode anyway? If you encode all ASCII characters, the result is 3 times the size of the original and is completely unreadable. At least you are “safe” with regards to the separator now: it is encoded for sure, so no split problems. You have to add a decoding routine now though.
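The URL-style variant, sketched with Python’s urllib.parse. Using quote() with safe="" is my choice for the sketch: it encodes everything except letters, digits and a few punctuation characters, so the separator and the percent sign itself are always encoded.

```python
from urllib.parse import quote, unquote


def serialize(elements):
    # safe="" makes quote() encode reserved characters such as "," and "/"
    return ",".join(quote(e, safe="") for e in elements)


def deserialize(text):
    # Splitting is simple again, but every part needs decoding
    return [unquote(part) for part in text.split(",")]


encoded = serialize(["a,bc", "def"])
decoded = deserialize(encoded)
```

Splitting is trivial again, at the price of a decoding step and data that is harder to read with the naked eye.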

If your serialized thing goes into a database, consider how indexing would work. It probably won’t. Maybe you should model your domain properly in the database and not serialize at all. Hint: if the values ever need to be treated differently/separately by the database, they go into different cells/rows/columns/fields, not one. There are very rare exceptions. Notable exception is the ability of databases to handle JSON fields (examples: MySQL, PostgreSQL). Note that this capability can fit or not fit your use case.

Want to satisfy your artistic needs and do something clever about the serialization? Do it at home then please. Don’t waste time that your colleagues could use on something more productive than dealing with your custom format.

Strong advice: don’t invent a custom serialization format, use existing serialization formats and libraries.
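For comparison, the same job with an existing format, sketched in Python with the standard json module. The sample data is made up to include a separator, a backslash and a leading space.

```python
import json

elements = ["a,bc", "def", "back\\slash", " leading space"]

serialized = json.dumps(elements)   # separators, escaping and quoting handled for you
restored = json.loads(serialized)

# Empty list and list with one empty string stay distinguishable
empty_list = json.dumps([])
one_empty_string = json.dumps([""])
```

No scanning loop, no escaping rules to remember, and the next person already knows the format.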


Seen something to add to the above? Leave a comment!

Failed Stealing from Python

I made a mistake. Hope you will learn something from it.

Mental Shortcuts

Heuristic, tl;dr for the purposes of this article – mental shortcut. The brain chooses to do less thinking to save energy. It relies on simple rules to get the result. The result is correct… sometimes.

I took a mental shortcut when working on my own programming language, Next Generation Shell. It was a mistake.

Additionally, I have ignored the uneasy but very vague feeling that what I’m doing is not 100% correct. From my experience I knew I shouldn’t ignore it but I still did it. Another mistake.

I “thought”

Below are heuristics that led to the wrong decision.

Copying features from popular languages is pretty “safe”. After all, “everybody” is using the language and it’s OK. Social proof. Wrong. Everybody makes mistakes. Popular languages have design issues too.

It’s OK to copy a feature because it’s a very basic aspect of a language. Nope. Python messed up argument passing. And I copied that mess.

The Fix

Python 3.8 has the fix. I have mixed feelings about it. Still not sure how I should fix it in NGS.

Takeaway

Beware of mental shortcuts. There are situations where these are not acceptable. The tricky part is to detect that you are using a mental shortcut in a situation where it’s not appropriate. I hope that with awareness and practice we can do it.

Also note that your $job is most likely paying you to not take mental shortcuts.

“But it works”

TL;DR – this is not nearly good enough in most cases and it’s only a small fraction of what you are paid for.

I want this post to be the canonical place to refer people to who say “but it works” because people who explain why this is not OK are tired of repeating the same arguments, me included.

You are paid for …

The following is not an exhaustive list but it should give you some perspective, which is the opposite of the narrow-minded “but it works”.

Typically, your Software Engineering $Job pays you for:

  1. Of course the thing must work. But also..
  2. It should continue working
    1. Gives deprecation warnings? Probably not good.
    2. Only runs on Node.js v10 LTS which is end of life in less than a year (as of writing)? Think again.
    3. Got away with invalid XML? Can you be sure that the next version of parser won’t be stricter?
  3. It should be maintainable (aka you and other people should find it easy to operate and modify, now and years later)
    1. Code quality
    2. Tests (if you don’t have tests, even your basic claim that something “works” is under suspicion)
    3. Documentation
      1. How to use your sh*t?
      2. How to set up the development environment?
      3. Decisions
      4. Non-obvious code parts
  4. It should be production ready, not abstract “works” or even worse “works on my machine”
    1. Logs
    2. Metrics
    3. Tested in dev/qa/whatever-you-call-it environment
    4. Reproducible – tomorrow they make a new environment, “qa42”, in a different AWS account in a different region. Could somebody else deploy your sh*t there without talking to you?
    5. Update 2020-09-13 (from Guy Egozy) – Scalable enough to be used in production.

If you claim that you are “done” because “it works”, congratulations, you have a (probably) working prototype. That’s typically a small part of a project.


Related term – “Tactical Tornado”, look this up.


Update 2020-09-13: Reddit discussion

Python 3.8 Makes me Sad Again

Looking at some “exciting” features landing in Python 3.8, I’m still disappointed and frustrated by the language… like by quite a few other languages.

As an author of another programming language, I can’t stop thinking about how things “should have been done” from my perspective. I want to be explicit here. My perspective is biased towards correctness and “WTF are you doing?”. Therefore, take everything here with an appropriate amount of salt.

Yes, not talking about any “positive” changes here.

Assignment Expressions

There is new syntax := that assigns values to variables as part of a larger expression.

A fix which couldn’t be the best because of a previous design decision.

“Somebody” ignored the wisdom of Lisp, which was “everything is an expression and evaluates to a value” (no statements vs expressions), and made assignment a statement in Python years ago. Now this cannot be fixed in a straightforward manner. It must be another syntax. The result is two different syntaxes for almost the same thing: = for assignment as a statement and := for assignment as an expression.
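The two syntaxes side by side, as a minimal Python 3.8+ sketch. The sample line and the regular expression are made up; the point is only where the assignment is allowed to appear.

```python
import re

line = "maxsize=128"

# Assignment as a statement: needs its own line before the test
match = re.match(r"(\w+)=(\d+)", line)
if match:
    key_from_statement = match.group(1)

# Assignment as an expression (Python 3.8+): assigns and tests in one place
if (m := re.match(r"(\w+)=(\d+)", line)):
    key_from_expression = m.group(1)
```

Writing if match = re.match(...): is a SyntaxError in Python, which is exactly why the second operator had to be introduced.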

Positional-only Parameters

There is a new function parameter syntax / to indicate that some function parameters must be specified positionally and cannot be used as keyword arguments:

def f(a, b, /, c, d, *, e, f):
    print(a, b, c, d, e, f)

Trying to clean up a mess created by mixing positional and named parameters. Unfortunately I did not give it enough thought at the time and copied the parameter handling behaviour from Python. Now NGS also has the same problem as Python had before 3.8. Hopefully, I will be able to fix it in some more elegant way than Python did.
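What the / marker changes in practice, sketched with the function from the quote above (Python 3.8+). I made it return a tuple instead of printing so the result is easy to check; the calls are my own examples.

```python
def f(a, b, /, c, d, *, e, f):
    return (a, b, c, d, e, f)


# a and b positional, c either way, e and f by keyword only
ok = f(1, 2, 3, d=4, e=5, f=6)

# Passing the positional-only parameters by name is rejected
try:
    f(a=1, b=2, c=3, d=4, e=5, f=6)
    error = None
except TypeError as exc:
    error = exc
```

Before 3.8 a pure Python function had no way to say "a and b are positional only", so their names silently became part of the public interface.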

LRU cache

functools.lru_cache() can now be used as a straight decorator rather than as a function returning a decorator. So both of these are now supported

OK. Bug fix. But … (functools.py)

    if isinstance(maxsize, int):
        # Negative maxsize is treated as 0
        if maxsize < 0:
            maxsize = 0

If you are setting LRU cache size to a negative number, it’s 99% by mistake. In NGS that would be an exception. That’s the approach that causes rm -rf $myfolder/ to remove / when myfolder is unset. Note that the maxsize code is not new but it’s still there in Python 3.8. I guess that is another mistake which can not be easily fixed now because that would break “working” code.
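Both decorator forms, plus the negative size, in one Python 3.8+ sketch. The functions square, cube and double are placeholders; cache_info() is the documented way to look at the cache statistics.

```python
from functools import lru_cache


@lru_cache                 # straight decorator, new in Python 3.8
def square(x):
    return x * x


@lru_cache(maxsize=32)     # function returning a decorator, the older form
def cube(x):
    return x * x * x


@lru_cache(maxsize=-1)     # almost certainly a mistake, yet accepted silently
def double(x):
    return x * 2


double(1)
double(1)
info = double.cache_info()   # statistics of the "cache" that was created
```

The negative size is quietly turned into 0, so info reports maxsize=0 with two misses and no hits: the decorator is there, but nothing is ever cached and nobody is told.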

Collections

The _asdict() method for collections.namedtuple() now returns a dict instead of a collections.OrderedDict. This works because regular dicts have guaranteed ordering since Python 3.7

OK. Everybody had the mistake of making maps unordered: Perl, Ruby, Python.

  1. Ruby fixed that with the release of version 1.9 in 2008 (according to the post).
  2. Python fixed that with the release of version 3.7 in 2018 (which I take as 10 years of “f*ck you, the developer”).
  3. Perl keeps using unordered maps according to documentation.
  4. Same for Raku, again according to the documentation.

NGS had ordered maps from the start but that’s not a fair comparison because NGS project started in 2013, when the mistake was already understood.
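The ordering guarantee and the _asdict() change mentioned above can be checked directly, assuming Python 3.8 or later. Point and the sample keys are made up for the sketch.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
as_dict = Point(1, 2)._asdict()        # a plain dict since Python 3.8

d = {}
d["zebra"] = 1
d["apple"] = 2
d["mango"] = 3
keys_in_insertion_order = list(d)      # insertion order, guaranteed since Python 3.7
```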


How does all that help you, the reader? I encourage deeper thinking about the choice of programming languages that you use. From my perspective, all languages suck, while NGS aims to suck less than the rest for the intended use cases (tl;dr – for DevOps scripting).


Update 2020-08-16

Discussions:

  1. https://news.ycombinator.com/item?id=24176823
  2. https://lobste.rs/s/rgcgjz/python_3_8_makes_me_sad_again

Update 2020-08-17

It looks like the article above needs some clarification about my perspective: background, what I am doing and why.

TL;DR

The main points of the article are:

  1. Everything still sucks, including Python. By sucks I mean it does not fit well with the tasks I need to do and is not aligned with how I think about these tasks.
  2. I am trying to help the situation and the industry by developing my own programming language

Background about my Thinking

In general, I’m amazed with how bad the overall state of programming is. That includes:

  1. All programming languages that I know including my own NGS. This is aggravated by inability to fix anything properly for any language with substantial amount of code written in it because you will be breaking existing code. And if you do break, you get the shitstorm like with Python 3 or Perl 6 (Raku).
  2. Code quality of the programs written in all languages. Most of the code that I have seen is bad. Sometimes even in official examples.
  3. Quality of available materials, which are sometimes plainly wrong.
  4. Many of the existing “Infrastructure as code” solutions, which in most cases follow the same path:
    1. Invent a DSL or use YAML.
    2. “Figure out” later that it’s not powerful enough (by the way there is an elegant solution – a programming language, forgot the name)
    3. Create a pretty ugly programming language on top of a DSL that was intended for data.

I am creating a new programming language and a shell out of frustration with the current situation, especially with bash and Python. Why these two? Because that’s what I was using, and still am, to get my tasks done.

Are these languages bad? I don’t think that question has any good answers. These languages neither fit the tasks that I’m trying to do nor align with how I think, while apparently being among the best choices available.

This Article Background

  1. Saw a post in my RSS feed about new features in Python 3.8.
  2. Took a look.
  3. Yep, everything is still f*cked up.
  4. Wrote a post about it, which was not meant to be a “deep discussion about Python flaws”.

I was not planning to invest more time in this, but here I am, trying to clarify.

And your Language is Better? Really?

Let’s clarify “better”. For me, it’s to suck less than the rest for the intended use cases.

author really does consider himself a superior language designer than the Python core-dev team

( From https://www.reddit.com/r/Python/comments/iartgp/python_38_makes_me_sad_again/ )

I consider myself to be in much easier circumstances:

  1. No substantial amount of code is written in NGS yet.
  2. I’m starting later and therefore have the advantage of looking at more languages, avoiding bad parts, copying (with adaptation) the good parts.
  3. NGS targets a niche; it’s not intended to be a general-purpose language. Choices are clearer and easier when you target a niche.
  4. The language that I’m creating is, almost by definition, more aligned with how I think. I hope that people out there will benefit from using NGS if it is more aligned with how they think too.
  5. See also my Creating a language is easier now (2016) post.

Will I be able to make a “better” language?

From a technical perspective, that’s probable: I am a skilled programmer in several languages and I have more languages to look at than anybody had before. My disadvantage is a lack of experience in language design. I’m trying to offset that by thinking hard (about the language, the essence of what is being expressed, common patterns, etc.), looking at other languages, and experimenting.

From a marketing perspective, I need to learn a lot. I am aware that “technically better” doesn’t matter as much as I would like it to. Without a community and users, NGS would be a failed project.

Also, don’t forget luck, which I might or might not have.

What if NGS fails?

I think that the situation today is unbearable. I’m trying to fix it. I feel like I have to, despite the odds. I hope that even if NGS fails to move the industry forward it would be useful to somebody who will attempt that later.

NGS Unique Features – the_one()

I’ve spotted the following common data access patterns.

Get the Only Element in an Array

You need to get the only element of an array as in

my_val = my_arr[0]

Additionally, you want to express the assumption that there should be exactly one element in the array. In NGS it’s simple:

my_val = my_arr.the_one()

the_one() will return the only element or will throw an exception if there is not exactly one element in the given array.
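For readers who don’t know NGS, here is a rough Python analogue of that behavior. It is only a sketch, not how NGS implements the_one(): the Python function below is hypothetical, and ValueError is an assumption standing in for whatever exception NGS actually throws.

```python
def the_one(items):
    """Return the only element of items.

    Hypothetical Python analogue of NGS the_one(); ValueError is an
    assumed stand-in for the exception NGS throws.
    """
    if len(items) != 1:
        raise ValueError(f"expected exactly one element, got {len(items)}")
    return items[0]


my_arr = ["only-value"]
my_val = the_one(my_arr)  # succeeds because my_arr has exactly one element
```

Compared with my_arr[0], the call fails loudly on an empty array and also on an array with more than one element, so the assumption is checked rather than silently ignored.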

Get the Only Matching Element in an Array

You have an array. Only one element should satisfy some condition. You want to access that element. Again, there is a straightforward way to express this in NGS:

my_instance = instances.the_one({"InstanceId": my_id})

The use of the_one(...) here is again about the assumption that there should be exactly one instance with the given instance id in the instances array. An exception will be thrown by the_one(...) if that is not the case.
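The same idea can be sketched in Python. This is an approximation under stated assumptions, not the NGS implementation: the function name the_one_matching, the sample instances data, and the use of ValueError are all made up for illustration, and the pattern is treated as a plain "all listed keys must be equal" check, which is narrower than NGS pattern matching.

```python
def the_one_matching(items, pattern):
    """Return the only dict in items whose fields equal those in pattern.

    Hypothetical helper; ValueError is an assumed stand-in for the
    exception that NGS the_one(...) throws.
    """
    matches = [
        item for item in items
        if all(key in item and item[key] == value
               for key, value in pattern.items())
    ]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one match for {pattern!r}, got {len(matches)}"
        )
    return matches[0]


# Made-up sample data for illustration.
instances = [
    {"InstanceId": "i-aaa", "State": "running"},
    {"InstanceId": "i-bbb", "State": "stopped"},
]
my_id = "i-bbb"
my_instance = the_one_matching(instances, {"InstanceId": my_id})
```

Both zero matches and multiple matches raise, which is the point: the code states the "exactly one" assumption instead of quietly taking the first hit.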

Update 2020-08-08: As pointed out, the method is not unique to NGS.

Update 2021-02-14: FHIRPath also has a somewhat similar function: single().


Happy coding and have a nice week!