blainehansen

My Favorite Syntax Ideas

Easy but clear binary operators, chainable everything, and better whitespace sensitive concepts.

published: May 11, 2023 - last updated: May 22, 2023

This post is just a brief pipe dream about some syntax ideas I have, ideas I've desperately wanted to get into a real language. I'm sure no one really cares, but I wanted to have somewhere to point people when I discuss them.

These ideas are powerfully informed by my strong personal opinions about language design, which I won't fully repeat here. Just know that in general:

I like function chaining.
I think whitespace sensitivity is a good idea, because syntactic structure should mirror semantic structure.

Here are the ideas!

§ Chainable freestanding functions

People like function chaining, since:

We think left-to-right (in most countries), and find it easier to understand program behavior as an ordered sequence of steps rather than "wrapped" operations. It's easier to understand a.b().c() than c(b(a)).
It allows us to avoid coming up with almost certainly redundant names for intermediate results (coming up with names is hard).

But in most languages only "methods" can be chained, functions that are somehow "intrinsic" to a piece of data. When we want to call a "freestanding" function, we usually have to break the chain.

let thing = some_object
	.method()
	.attr
	.other_method()

freestanding_function(thing)

It would be nice if we could call freestanding functions in the chain!

There are some languages that make it possible to just call a freestanding function using ., with the first argument as the thing with the dot (a.fn(b) same as fn(a, b)). I actually think reusing . is a bad idea, since it makes it unclear where to look for the function if you need to read the definition (is it defined in the left-hand type? or is it freestanding?).

We can get both by just choosing a different symbol for chaining freestanding functions. I'm partial to : since it looks similar to the . operator.

a:func(b, c)

// same as
func(a, b, c)
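
No existing language I can point to has exactly this : syntax, but the behavior is easy to emulate. Here's a rough Python sketch of the idea, where Chain, pipe, add, and scale are all hypothetical names I made up for illustration:

```python
class Chain:
    """Wraps a value so freestanding functions can be chained onto it."""

    def __init__(self, value):
        self.value = value

    def pipe(self, func, *args, **kwargs):
        # Emulates `a:func(b, c)`: the wrapped value becomes the first argument.
        return Chain(func(self.value, *args, **kwargs))


def add(a, b):
    return a + b


def scale(a, factor):
    return a * factor


# Roughly `2:add(3):scale(10)` in the proposed syntax.
chained = Chain(2).pipe(add, 3).pipe(scale, 10).value

# The same computation written as nested calls.
nested = scale(add(2, 3), 10)
```

The wrapper is only needed because Python has no dedicated operator for this. With a real : operator the Chain and .value noise would disappear.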

§ Clean but clear binary operators

The concept of binary operators is for some reason very intuitive (maybe because we have two hands?). The addition operation is very natural to write as left + right and less so as add(left, right).

We get most of the benefits of binary operators with chaining (left.binary(right), left:binary(right)), but the asymmetry between the left and right sides caused by the parentheses is less than desirable.

If we fully embrace whitespace sensitivity we can make this cleaner, by not requiring parentheses if the operator function is spaced away from both left and right arguments:

left .binary right
left :binary right

// wrong!
left:binary right
// wrong!
left :binary(right)

This is better than allowing custom symbolic operators as is done in languages like Haskell, since custom symbolic operators are extremely confusing and difficult to read locally, especially for language beginners. It's more legible and approachable to limit symbolic syntax to a small amount everyone can learn in the language docs, and make everything else use alphanumeric names.

Importantly, this syntax shouldn't be allowed for unary calls, since if left .unary were the same as left.unary() it would be ambiguous how to merely access unary, such as might be done when passing it as a higher order function.
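
For comparison, the symmetric "named operator between two operands" look can be approximated in Python today by abusing the | operator. This is only a sketch of the general feel, not the proposed syntax itself, and Infix, _Bound, and join_with_dash are hypothetical names:

```python
class Infix:
    """Turns a two-argument function into a named infix operator."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # `left | op` captures the left operand and waits for the right one.
        return _Bound(self.func, left)


class _Bound:
    """An operator that already holds its left operand."""

    def __init__(self, func, left):
        self.func = func
        self.left = left

    def __or__(self, right):
        # `... | right` supplies the right operand and makes the call.
        return self.func(self.left, right)


@Infix
def join_with_dash(left, right):
    return f"{left}-{right}"


# Reads a bit like `"a" :join_with_dash "b"` in the proposed syntax.
result = "a" |join_with_dash| "b"
```

The extra | on each side is exactly the kind of clutter the whitespace rule avoids, but it shows that a named function sitting between its two arguments reads fine.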

Embracing whitespace sensitivity also allows unambiguous use of : for type annotations as well. It's important to remember that all these patterns are easy for simple tools like syntax highlighters to detect and flag appropriately.

// when `:` touches only on the left,
// it's a type annotation
let a: u8 = 0

// when `:` touches on left and right,
// it's a chained freestanding call
a:some_func(b)

// when `:` touches only on the right,
// it's a binary operator
a :some_func b
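
To back up the claim that simple tools can tell these cases apart, here's a minimal Python sketch that classifies a : purely by which neighbors it touches. classify_colon is a hypothetical helper, and it assumes compound operators such as :: have already been handled elsewhere:

```python
def classify_colon(source, index):
    """Classify the `:` at `index` by the whitespace around it."""
    if source[index] != ":":
        raise ValueError("index does not point at a ':'")

    # A side "touches" when the neighboring character exists and isn't whitespace.
    touches_left = index > 0 and not source[index - 1].isspace()
    touches_right = index + 1 < len(source) and not source[index + 1].isspace()

    if touches_left and touches_right:
        return "chained call"
    if touches_left:
        return "type annotation"
    if touches_right:
        return "binary operator"
    return "unknown"


examples = ["let a: u8 = 0", "a:some_func(b)", "a :some_func b"]
kinds = [classify_colon(line, line.index(":")) for line in examples]
```

A real highlighter would work on tokens rather than raw characters, but the decision itself stays this small.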

Distinguishing these cases by whitespace would make it a good idea to require that "normal" invocations have no surrounding space (except for multiline chaining, more on that in a second).

§ Whitespace delimited function calls

It's common to split long function calls up onto multiple lines:

func(
	long.complex().expression:fn_do(a, b, c).9,
	long.other().expression:yo_yo(a, b, c),
	hello.here():fn_do(a, b, c),
)

It's frustrating to do this in whitespace sensitive languages, since it replicates the clutter of "closing braces". It could be nice to have a different function call operator for these multiline situations. I'm partial to :: since it creates a sort of analogy with the : freestanding operator syntax.

func::
	long.complex().expression:fn_do(a, b, c).9
	long.other().expression:yo_yo(a, b, c)
	hello.here():fn_do(a, b, c)

Commas can be used to place several arguments on the same line, but aren't allowed as separators between lines:

func::
	a, b, c
	long.complex().expression:func(a, b, c).9
	long.other().expression:yo_yo(a, b, c)
	hello.here():func(a, b, c)
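
Here's a small Python sketch of how a parser might split the indented body of a :: call into arguments under that rule. split_arguments is a hypothetical helper, and it assumes the lines have already been identified as belonging to one call. Each line contributes one argument, or several if it contains commas outside of any parentheses:

```python
def split_arguments(lines):
    """Split the body lines of a `func::` call into argument strings."""
    arguments = []
    for line in lines:
        depth = 0      # how many parentheses/brackets we're currently inside
        current = ""
        for char in line.strip():
            if char in "([":
                depth += 1
            elif char in ")]":
                depth -= 1
            if char == "," and depth == 0:
                # A top-level comma ends one argument on this line.
                arguments.append(current.strip())
                current = ""
            else:
                current += char
        if current.strip():
            # The line break itself ends the last argument on the line.
            arguments.append(current.strip())
    return arguments


body = [
    "\ta, b, c",
    "\tlong.complex().expression:func(a, b, c).9",
    "\thello.here():func(a, b, c)",
]
arguments = split_arguments(body)
```

This sketch quietly accepts a trailing comma at the end of a line. A real implementation would report that as an error instead.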

§ New chain operator

It's common to "start" a chain operation on one line that isn't indented, and then continue it on further lines:

let thing = something
	:next_chain()
	.other thing
	.yes(a, b, c)

This is gross, since it makes the alignment of the start of the chain (something above) dependent on the length of whatever comes before it (let thing = above). We can create some "start chain" operator to push things into alignment. Using :: also seems to analogize with its other uses, since again this use isn't directly touching any other tokens:

let thing = ::
	something
	:next_chain()
	.other thing
	.yes(a, b, c)

This is also how you would provide a multiline argument when using multiline function calls:

func::
	a_thing
	::
		complex
		.thing
		.happening
	c_thing

This does increase the line count, but it keeps related things aligned together.

§ Chain "catching" operator

Sometimes you want to act in a more complex way on the current value of the chain. Not all functions return a clean single value that can be chained normally. A "catching" operator that binds the current value of the chain to a pattern solves this problem. I like :>, and you can think of it like an unnamed function that's invoked immediately:

let thing = ::
	first
	:second()
	:> return_of_second; func(a, b, return_of_second, d)
	.continue_chain()

Using a pattern means you can also destructure the value:

let thing = ::
	first
	:second()
	:> (one, two, three); one:two(three)
	.continue_chain()

The choice of :> is especially clean if the syntax for unnamed functions is similar, such as |> a, b; ...

Both :> and |> allow either resolving the function with a single expression on the same line as the operator (such as is done above), or continuing on multiple lines like a normal function:

let thing = ::
	first
	:second()
	:> value;
		let a = value.something
		...
		return final_value
	.continue_chain()
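
The "unnamed function that's invoked immediately" framing can be sketched in Python with a wrapper class. This is only an emulation under my own assumptions, and Chain, pipe, catch, min_max_total, and spread_times_total are hypothetical names:

```python
class Chain:
    """Wraps a value so steps can be chained onto it."""

    def __init__(self, value):
        self.value = value

    def pipe(self, func, *args):
        # Emulates `:func(args)`: the current value becomes the first argument.
        return Chain(func(self.value, *args))

    def catch(self, handler):
        # Emulates `:> pattern; body`: the handler receives the current value
        # and whatever it returns becomes the new value of the chain.
        return Chain(handler(self.value))


def min_max_total(numbers):
    return min(numbers), max(numbers), sum(numbers)


def spread_times_total(stats):
    # Destructure the caught value, like `:> (one, two, three); ...`.
    low, high, total = stats
    return (high - low) * total


result = (
    Chain([3, 1, 2])
    .pipe(min_max_total)
    .catch(spread_times_total)
    .pipe(str)
    .value
)
```

In Python the handler has to be defined out of line to destructure cleanly, which is the kind of detour :> is meant to remove.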

§ Chain "tapping" operator

Although less common, sometimes you want to "tap" a chain rather than "catching" it, such as to debug the value. In this situation you don't want to modify the value, you just want to do something with the value and pass it along immediately.

This is already possible by catching with a multiline function and passing along the value after doing whatever you wanted to do:

let thing = ::
	first
	:second()
	:> value;
		dbg value
		return value
	.as_if_catching_never_happened()

But it might be nice to just add a simple "tapping" operator, syntactically similar to the catching operator, that doesn't disrupt the chain. ::> seems good to me:

let thing = ::
	first
	:second()
	::> value; dbg value
	.as_if_tapping_never_happened()
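
In the same emulated style, tapping is a step that runs a function for its side effect and hands the original value along untouched. Again a rough Python sketch with hypothetical names (Chain, pipe, tap, double):

```python
class Chain:
    """Wraps a value so steps can be chained onto it."""

    def __init__(self, value):
        self.value = value

    def pipe(self, func, *args):
        # Emulates `:func(args)`: the current value becomes the first argument.
        return Chain(func(self.value, *args))

    def tap(self, side_effect):
        # Emulates `::> value; body`: run the body, ignore its result,
        # and continue the chain with the value unchanged.
        side_effect(self.value)
        return self


def double(number):
    return number * 2


seen = []  # stands in for a debug log

result = (
    Chain(2)
    .pipe(double)
    .tap(seen.append)
    .pipe(double)
    .value
)
```

The only difference from catching is that the return value of the body is thrown away, which is why a separate operator can stay so short.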

§ "Chained" lambda

Sometimes you want to define a function that only takes one argument, and you'd prefer to not even name that one argument but instead act like you're in a chain. For that you could use |: instead of |>:

let my_lambda = |: .field() :op arg
...

let value = ::
	first
	:my_lambda()
	.next

In general, the symbols have these meanings:

> is for giving a name
: is for chaining
| is for anonymous functions

and the different complex operators are the result of combining them:

:> is for chaining and then giving a name (similar for ::>)
|> is for a function and giving a name
|: is for a function and then chaining

§ "Do" block

The new chain :: operator expects an expression across multiple lines, but we also need a "braceless" way to create blocks of statements that resolve to a value. I'm choosing ; since it's already the general "block" operator in the rest of these examples. If we wanted some way to explicitly "return" values from these blocks we could use <; (< in general stands for resolving to a value, the opposite of > which gives a name for a value):

let value = ;
	let a = ...
	let b = ...
	...
	<; final_resolved_value

The mere block concept created with ; is simple and doesn't create any semantic difficulties in the language. But including <; would probably also require optional labels, which isn't great. Also it would make it possible to implement control flow effects by allowing the <; to be captured by unnamed functions and such, and that's a much much more complicated matter that I won't explore here.

There you have it! Hope you enjoyed!