Michael Alyn Miller

Enforth


I’m not going to tell the story the way it happened. I’m going to tell it the way I remember it.

I created Enforth more than four years ago. The idea had been in my head for a long time, but it was a complex problem and I didn’t have an immediate need for the technology.

And then this popped up in Freenode’s #forth channel one day:

I’ve got a forth running on an avr atmega32u4, but I’m really tight on RAM. the code I want to run is only 50 lines ATM, but I can only load a quarter or so of it before I run out of space. thinking about ditching it for arm, but I’m wondering how feasible of an idea it is to run forth with 2.5kb of ram in the first place.

I felt that an AVR-compatible Forth environment was possible and this message was enough of a challenge that I wanted to give it a try.

My first step was abandoning the idea of creating a standalone AVR Forth. Similar efforts replace the bootloader and are thus incompatible with the larger ecosystem around the AVR and its Arduino libraries.

That last point is something that I had only recently come to appreciate. I have numerous half-completed electronics projects that were designed around bare metal Forth environments. SwiftForth is wonderful, for example, and I had started reflashing all of my Arduinos with a raw SwiftForth environment.

The downside to that approach is that I had to code everything myself. Every Adafruit delivery meant that I had a new device driver to write. I eventually realized that I was missing the entire point of the Arduino ecosystem, not to mention Adafruit’s significant contributions to that ecosystem.

The Arduino is a wonderful device precisely because you do not need to do everything on your own. There are libraries for almost every purpose, and virtually every piece of hardware. Buying Adafruit products and not using their libraries was insane and meant that I was spending all of my time writing firmware and no time actually making things.

That #forth message came at the perfect time for me: I still wanted a Forth on the Arduino, but I wanted one that would embrace the Arduino ecosystem and all of its wonderful contributors.

Goals

My goals for Enforth were very straightforward:

These goals suggested a few technical requirements:

The Crazy Idea

I had written a Forth – MFORTH – for the Tandy M100 laptop a few years earlier, and it was quite the undertaking. You mix assembly language and Forth together and eventually end up with a working Forth system. I wanted to do something different with Enforth. For one thing, I didn’t want to write AVR assembly: my goal was not maximum performance but maximum interoperability. The idea was to make FFI as simple as possible.

My original design focused almost entirely on the FFI. The idea was that the inner interpreter of Enforth would effectively just be a loop that called one FFI after another. FFIs could be Forth words (DUP, DROP, etc.) or external libraries (digitalWrite, for example).
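A minimal sketch of that kind of inner interpreter, in C, might look like the following. Every name here (vm_t, word_fn, run, fake_digital_write, and so on) is hypothetical and chosen for illustration; this is not Enforth's actual source. The external library call is a stand-in for digitalWrite so that the sketch compiles on a desktop, and each word is wrapped in a small function that pops its own arguments, which is simpler than the register-loading scheme the real design used.

```c
#include <stddef.h>
#include <stdint.h>

typedef int16_t cell_t;

/* Hypothetical VM state: a small data stack, no bounds checking. */
typedef struct {
    cell_t data[16];
    int depth;
} vm_t;

/* Every word, internal or external, has the same shape. */
typedef void (*word_fn)(vm_t *vm);

static void push(vm_t *vm, cell_t v) { vm->data[vm->depth++] = v; }
static cell_t pop(vm_t *vm) { return vm->data[--vm->depth]; }

/* Forth words implemented as ordinary C functions. */
static void w_dup(vm_t *vm) { cell_t a = pop(vm); push(vm, a); push(vm, a); }
static void w_drop(vm_t *vm) { (void)pop(vm); }

/* Stand-in for an external library function such as digitalWrite. */
static int last_pin = -1;
static int last_value = -1;
static void fake_digital_write(int pin, int value) {
    last_pin = pin;
    last_value = value;
}

/* Wrapper that moves stack cells into the library call's arguments. */
static void w_digital_write(vm_t *vm) {
    cell_t value = pop(vm);
    cell_t pin = pop(vm);
    fake_digital_write(pin, value);
}

/* The inner interpreter: call one function after another until NULL. */
static void run(vm_t *vm, const word_fn *program) {
    for (const word_fn *ip = program; *ip != NULL; ip++) {
        (*ip)(vm);
    }
}
```

With 13 and 1 already on the stack, running the program { w_dup, w_drop, w_digital_write, NULL } ends up calling fake_digital_write(13, 1). Note that the interpreter loop itself cannot tell a Forth word from a library call.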

I got that working pretty quickly, organizing the FFI signature such that you could directly jump to the target function by blindly loading the Forth stack into registers. This was designed to minimize the overhead of calling an FFI, which matters a great deal in Forth given how much of a Forth program consists of calls to other words.

This worked for a while and I constantly checked the generated C code against the disassembled AVR code. This led to optimizations like inverting the stack direction in order to minimize assembly overhead.

Eventually this introduced too much overhead in Enforth. Not everything had to be a literal FFI call, especially when most Forth calls are internal Forth words. I decided to change the implementation to a token-threaded Forth in order to minimize the size of each compiled word.
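To illustrate the general idea of token threading (not Enforth's actual token set or dispatch code), here is a small C sketch. The token names, the vm_t layout, and the one-byte inline literal are all assumptions made for the example. Each compiled word is a sequence of one-byte tokens, whereas a list of function pointers would cost at least two bytes per entry on an AVR.

```c
#include <stdint.h>

typedef int16_t cell_t;

/* Hypothetical VM state: a small data stack, no bounds checking. */
typedef struct {
    cell_t data[16];
    int depth;
} vm_t;

/* Hypothetical one-byte tokens; TOK_LIT is followed by a one-byte literal. */
enum { TOK_EXIT, TOK_LIT, TOK_DUP, TOK_ADD, TOK_MUL };

/* Token-threaded inner interpreter: fetch a token, dispatch, repeat. */
static void execute(vm_t *vm, const uint8_t *ip) {
    for (;;) {
        switch (*ip++) {
        case TOK_LIT:
            vm->data[vm->depth++] = (cell_t)*ip++;
            break;
        case TOK_DUP:
            vm->data[vm->depth] = vm->data[vm->depth - 1];
            vm->depth++;
            break;
        case TOK_ADD:
            vm->depth--;
            vm->data[vm->depth - 1] =
                (cell_t)(vm->data[vm->depth - 1] + vm->data[vm->depth]);
            break;
        case TOK_MUL:
            vm->depth--;
            vm->data[vm->depth - 1] =
                (cell_t)(vm->data[vm->depth - 1] * vm->data[vm->depth]);
            break;
        case TOK_EXIT:
        default:
            return;
        }
    }
}
```

The Forth phrase `3 DUP * 1 +` compiles to the eight bytes { TOK_LIT, 3, TOK_DUP, TOK_MUL, TOK_LIT, 1, TOK_ADD, TOK_EXIT } in this scheme.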

The Spiritual Successor to MFORTH

Over time, Enforth began to resemble MFORTH more and more. I ended up using the same dictionary layout and ported many of MFORTH’s kernel words over to Enforth, often as-is. This was a helpful lesson in Forth design. It turns out that many Forth implementations share common design principles for a reason: those principles are the best way to make an efficient Forth.

It was around this time that I changed the name of the project from ARF (Arduino Forth) to MFORTH, and then changed it again to Enforth. I didn’t want two things with the same name, so MFORTH was out. I liked the duality implied by the name Enforth: “n” is the next letter after “m”, but more importantly, Enforth was guided by the goal of embeddability, so I wanted the name to suggest that you add Enforth to an existing system rather than replace your runtime with it.

That is quite a contrast with most other embedded systems Forths, which are runtimes unto themselves. Ficl and other C-based Forths tend to be an exception to that, but I looked at those systems and they were all too large for the constraints of an AVR processor.

Interlude: Needs More LISP

Eventually Enforth became too complex to compile by hand, especially as I started to play with more esoteric dictionary layouts. One of my earlier approaches to reducing the size of compiled words was to multiply the raw value of the Enforth opcode by a constant in order to locate the ROM address of the definition.

To implement that approach I needed to pad all of the definitions to a multiple of the same size – 4 bytes, 8 bytes, etc. – sort those definitions by size, then assign opcodes based on the sort order. I created a Clojure program to do that work for me. This was invaluable during the early development of Enforth as I was constantly adding words and the only way to make room for new words was to expand the multiplier so that I could fit more words into the same space.
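The arithmetic behind that scheme can be sketched in C as follows. This is my reading of the description above, not the Clojure generator's real logic: the MULTIPLIER value and the function names are hypothetical, the sort step is omitted, and I am assuming that a definition spanning several slots consumes several opcode numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed slot size; every definition is padded to a multiple of this. */
#define MULTIPLIER 4u

/* Round a definition's length up to the next multiple of MULTIPLIER. */
static size_t padded_size(size_t len) {
    return (len + MULTIPLIER - 1) / MULTIPLIER * MULTIPLIER;
}

/* At run time, locating a definition is a single multiply. */
static size_t opcode_to_offset(uint8_t opcode) {
    return (size_t)opcode * MULTIPLIER;
}

/*
 * Lay the definitions out back to back and give each one the opcode
 * that maps to its padded ROM offset. Returns the number of opcode
 * slots consumed, or -1 if a definition would need an opcode above 255.
 */
static int assign_opcodes(const size_t *lens, size_t count, uint8_t *opcodes) {
    size_t offset = 0;
    for (size_t i = 0; i < count; i++) {
        size_t opcode = offset / MULTIPLIER;
        if (opcode > 255) {
            return -1; /* out of one-byte opcodes */
        }
        opcodes[i] = (uint8_t)opcode;
        offset += padded_size(lens[i]);
    }
    return (int)(offset / MULTIPLIER);
}
```

For definitions of 3, 4, 9, and 1 bytes with a multiplier of 4, this assigns opcodes 0, 1, 2, and 5, uses six opcode slots for four words, and spends 7 of the 24 ROM bytes on padding. Raising the multiplier lets 256 opcodes reach more ROM, at the cost of more padding per definition.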

Long term, though, this multiplier-based approach didn’t pan out. I eventually ran out of opcodes due to the sheer number of definitions required to implement an ANS Forth. The amount of padding required in the definitions also became extensive, and thus wasteful of the ROM.

The Clojure-based definition generator still exists, though not as a way to sort the ROM definitions. Instead, the generator now has two jobs:

  1. It reads the enforth.c file and looks for specially-formatted comments in order to detect opcodes. Those opcodes become the primitives that can be used in other dictionary definitions. A jump table and opcode list are output from this phase of the process.
  2. The EDN-based definitions files are loaded and compiled into bytecode.

This is not quite a traditional Forth metacompiler, but it’s close, and for bonus points it’s in (badly-written) Clojure. Note to self: do not learn Clojure while trying to write a Forth metacompiler.

That “Four Years Ago” Part

Why did it take four years to release Enforth and write this blog post? I could say that Enforth wasn’t “done” – and that is true – but ultimately I let my possible-future-maybe-someday plans for Enforth get in my way. There was so much that I could imagine doing with Enforth that I ended up spending all of my time messing with those add-ons and never finishing the core system itself.

Ultimately that just meant that both Enforth and those side projects remained incomplete. Meanwhile the world is moving towards much more capable Arduino devices and I worry even now that Enforth is not as useful as it would have been four years ago.

In the end, I was probably just afraid to release something so experimental and half-baked. What if it didn’t work? What if it didn’t work well? Ultimately I had to set those concerns aside and embrace the fact that nothing good would ever come of leaving Enforth sitting around in a private GitHub repo.

And in an ironic twist, publishing a half-baked version of Enforth would have given other people a chance to get interested in the project, possibly increasing the rate of development.

Releasing Enforth so early is not something that I would have done four years ago, but it’s the kind of thing that I am trying to do now. I have other partial projects that I want to share, but this is the biggest one, and the one that I most need to move into its next phase of life.

SKELETONS.md

Currently writing a document called SKELETONS.md describing all the questionable things in the repo. I should do this for all my projects. @ztellman

Enforth is aggressively unfinished. There is an epic TODO.md file and a rambling design doc that is sometimes blatantly inaccurate. This is not a toy Forth, though. It passes most of the ANS Forth test suite, can load and save state from EEPROM, and even includes a full multitasker, all of which runs on the Arduino Uno (or your desktop computer).

Incomplete, yes, but I have still used Enforth on all sorts of devices and it’s a great deal of fun.

As for what is left, sorting out the current FFI mess would go a long way towards finishing Enforth. It’s probably not even that hard; I just haven’t geared myself up to do the work. Writing a Forth can be challenging and there are a few things that easily become roadblocks along the way. I still remember when I got DOES> working in Enforth: it was much simpler than I thought it would be, but it still took twenty hours of staring at the code and making little notes before I knew what to do.

And then I realized that I should have implemented DOES> first, because I could have used it to eliminate most/all of the other DO* words! Brad Rodriguez is likely not surprised by this revelation.

I may try to convert some of the TODOs and larger architectural items into GitHub Pull Requests. Perhaps that will be an easier way to encourage contributions or to measure my own progress on this project.

In the meantime, have fun, release often, and do what I say, not what I do.