Building a language?

Started by oconnor0

oconnor0

I'm a professional compiler developer and so find myself annoyed with existing languages.

What projects or languages exist that build a language up from "first principles"? Lisp and Scheme are the obvious two since they both allow constructing new abstractions on top of the existing ones. I'm interested in something more statically typed.

I have this idea about a language where each layer of abstraction is built on top of the previous one and is done in such a way that the language is built from the "ground up". I love functional languages but find them so disconnected from the actual hardware we run on.

I don't know exactly what I'm asking for… :D

Kartik Agaram

I've advocated before [1, 2] that we all use distinct words for "I know how this works but it's useful not to think about its internals most of the time" and "I don't care how this works; somebody else provides it for me". I call the first "abstraction" and the second "service". Conflating the two muddies many a conversation about abstractions and good software practice.

(I didn't follow your Mu question. Perhaps there's a typo?)

oconnor0

Yeah, I've looked at concatenative languages because of that, but I haven't been convinced they scale up nicely in abstractions.

oconnor0

Your thoughts about having a language that enables the encoding of why along with what and how is fascinating. Do you feel like you've captured that with Mu? Or gotten closer?

Abstractions and naming are really hard. I have spent hours stuck on a problem because I couldn't name some transformation or operation. This is one of the reasons I've been reticent to "believe" in concatenative languages. They can read beautifully - if all the words are small and properly named. As soon as it turns into bi dup dup swap my eyes glaze over.

And abstractions have a tendency to leak and to miss important details - requiring ugly hacks or poking at the innards to do what is needed.

I am also very intrigued by both total languages and dependently-typed languages. But one of my big peeves about languages is when they have dramatically different syntaxes for different things. C++ being my favorite example. Recursion is relatively easy to write, but attempting to do the same recursion at compile-time in the template system is awful. C++ template syntax is so foreign that I find myself incapable of thinking in it beyond the simplest of expression.

Kartik Agaram

Do you feel like you've captured that [encoding of why] with Mu? Or gotten closer?

Yes, it definitely seems like an improvement. The catch is that it works best if baked in from the lowest level up. So I'm stumped on how to apply it to a stack without needing to reinvent all of modern software.

I too used to obsess about abstraction, but I don't anymore. The key for me was to realize that I can always change an abstraction when I realize it's not good enough. Abstractions are only hard when we feel disempowered from changing them in the name of backwards compatibility, componentization and other such false gods. Once you can change an abstraction it no longer has to be perfect, just the best idea you have had so far. You can just switch to better abstractions as new issues come up.

oconnor0

Attempting to reinvent all of modern software is probably a losing proposition. Can you embed reasoning in from Mu from its lowest layer?

Changing abstractions only works when I have control over the abstraction and its users. When someone else provides the abstraction or when I am responsible for providing that abstraction for others, I am not able nor empowered to change it.

Kartik Agaram

I've advocated before [1, 2] that we all use distinct words for "I know how this works but it's useful not to think about its internals most of the time" and "I don't care how this works; somebody else provides it for me". I call the first "abstraction" and the second "service". Conflating the two muddies many a conversation about abstractions and good software practice.

(I didn't follow your Mu question. Perhaps there's a typo?)

oconnor0

Basically, can you write the lowest layer of Mu on top of an existing OS and attempt to encode the appropriate assumptions and reasoning into that layer rather than starting from bare metal?

Kartik Agaram

Yes, that is basically what the current version does. For example, I started out implementing raw-mode terminal manipulation primitives using ncurses, then switched to the more lightweight termbox libraries. I've been gradually stripping out everything between Mu and the kernel, and I kinda understand now how the termbox library calls translate to raw ioctl()s.

However, I'm not sure how to wrap around certain properties of kernels that programmers may care about and want to write tests for. For example, fsync() has many gotchas. A program that used fsync() would reasonably want to check against undesirable behaviors in its automated tests. However, to make fsync() testable would require giving access from userland to kernel-private data structures.

marco

What projects or languages exist that build a language up from "first principles"?

What language doesn't try to be based on "first principles"? It depends on where you start building from. My language Egel can be thought of as being derived from "first principles" since it is based entirely on untyped eager combinatorial rewriting. But Lisp is also derived from first principles as a list processing language, and so is C, a language based around some abstract Von Neumann machine.

Duncan Cragg

Nice to see some activity here!

My own language is aimed at non-coders - at least at the level of those comfortable with spreadsheet programming. Like spreadsheets, it works well via a 2D 'visual' interface.

Its 'style' could be described as graph rewriting or state transformation. There isn't really a known 'paradigm' it falls under. I call it 'Functional Observer': linked data objects set their state as a function of their own current state plus the state of peer objects observed 'through' those links.

Duncan Cragg

@Kartik:

Thanks!

I'm building it in … C ! :-)

Well, the core in C, the outside bits for Android, etc, in C++.

I'm currently playing with Vulkan for the render, which gives me an easy route into 3D when I'm ready for that.

Kartik Agaram

Given that my Mu is also in C[1], I think we have a lot in common here. What I'm curious to see is if there's a graphics stack that I can steal from you. I gave up on graphics programming 15 years ago when I left Allegro on DOS and switched to Linux. X has always been a pain in my backside.

[1] Well, C++, but in a very C style.

Duncan Cragg

Oh well it's a bit early days to share code .. the fullest implementation of the core ideas is in 'NetMash' in Java. The C version (Onex) is really a re-implementation, along with lessons learned and an outside-in, UI-first approach - to get early feedback (from my core demographic: my family - wife and two daughters!!)

Duncan Cragg

I like the idea of "C++ in a C style" - I do like some of the benefits of C++ but I ended up hating it when I was a C++ contractor. C is a simple, honest language.

I'll put my first slice of code up on GitHub as soon as it's vaguely coherent.. :-)

Duncan Cragg

Just realised I didn't get to the point w.r.t. the OP's question and Onex.

My approach is layered in the sense that Onex can easily become an operating system (it's being written in C for that reason).

It's built from four components - Object Network Format, Protocol, Rules and Types:

  • 'ONF': really the core mechanism that handles my 'objects' and their observation, notification and persistence
  • 'ONP': a P2P distribution protocol that allows objects to observe over the net
  • 'ONR': the Onex programming language
  • 'ONT': the i/o, the types, of objects (an object has a broad type but dynamically not statically)

ONF and ONP can replace or be built into the operating system, providing an API on which to build ONR and the ONT application and i/o types.

The programming model is so 'inside out' relative to conventional approaches, that it does actualy make sense to rebuild the whole stack, ground up. I guess IP networking can stay, though!

Realistically, it'll never replace current approaches, so my OS ambitions are limited to IoT devices (which probably won't even need IP :-) .