The Semantic Editor

It seems like everybody in the tech industry is passionate about text editing. As a result, editor wars have been a thing forever: Is it vi or emacs or ed?

New editors also pop up left and right all the time.

You will probably think I’m crazy if I tell you that I want to create an editor, too.

Semantics

What we are actually editing is very rarely plain text. Operations such as “insert glyph a” or “delete line” rarely make sense on their own. When we edit text, we have to translate what we actually want to do into the language of the editor.

For example, what you actually want to do might be to “add a paragraph” or “add a section about a new topic.” If you are editing code, you might want to “create a new class,” “add a method” or “rename variable foo to bar.”

What if your editor understood these intents natively instead? What if these were the only things your editor understood, and it was incapable of “deleting characters”?

I want to create an editor which behaves like that. I call it the semantic-editor, or se for short.

Show me the code!

It’s still horribly broken and incomplete, but sure: https://github.com/dflemstr/semantic-editor.

Architecture

Creating text editors today is not easy. A lot of infrastructure is needed before you can build something with even the most basic features. As a result, I have chosen to leverage existing technology as much as possible to avoid reinventing the wheel.

Backend

The semantic-editor core backend is written in Rust. Because it runs as a native process, it has full access to the underlying operating system, which makes things easier for me.

The idea is that the editor should leverage native tools as much as possible. Instead of creating its own Java parser and linter, it should invoke javac or libjvm and have it parse and check the code. Instead of creating a bazel or git plugin, I want to invoke bazel or git directly.
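As a rough illustration of this idea (not code from the repository), the backend could shell out to the real tools with `std::process::Command` and capture what they print. The helper names `ToolOutput`, `run_tool`, `check_java` and `git_status` are hypothetical, and the choice of `javac` and `git` flags is an assumption about what would be useful.

```rust
use std::io;
use std::process::Command;

/// Captured result of running an external tool (hypothetical helper type).
#[derive(Debug)]
pub struct ToolOutput {
    pub success: bool,
    pub stdout: String,
    pub stderr: String,
}

/// Runs `program` with `args` and captures its output, instead of
/// re-implementing the tool's logic inside the editor.
pub fn run_tool(program: &str, args: &[&str]) -> io::Result<ToolOutput> {
    let output = Command::new(program).args(args).output()?;
    Ok(ToolOutput {
        success: output.status.success(),
        stdout: String::from_utf8_lossy(&output.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&output.stderr).into_owned(),
    })
}

/// Lets `javac` parse and check a Java file; its diagnostics end up in `stderr`.
pub fn check_java(path: &str) -> io::Result<ToolOutput> {
    // Class files go to the system temp directory so the source tree stays clean.
    let scratch = std::env::temp_dir();
    let scratch = scratch.to_string_lossy();
    run_tool("javac", &["-Xlint:all", "-d", scratch.as_ref(), path])
}

/// Asks `git` itself for the working tree status in a machine-readable form.
pub fn git_status(repo_dir: &str) -> io::Result<ToolOutput> {
    run_tool("git", &["-C", repo_dir, "status", "--porcelain"])
}
```

A caller would then turn `stderr` from `check_java` into editor diagnostics. If the tool is not installed, `run_tool` returns an `Err`, which the editor could surface as a missing-dependency message.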

Having access to a native process also makes it possible to bind to any C or kernel API for optimal performance.

The backend uses the tokio framework for parallelization and asynchronous APIs. As a result, editor operations run asynchronously and in parallel by default, which should help performance.

Models

There is no abstraction related to editing blocks of text. Instead, when opening a file, the editor parses the contents into a domain-specific directed acyclic graph representation (similar to an abstract syntax tree) for manipulation. The core editor components then know how to interact with these data structures to perform edits.
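To make this more concrete, here is a minimal sketch of such a model for a tiny Markdown-like subset. It is a flat tree rather than a full graph, and every name in it (`Block`, `Document`, `parse`, `add_section`, `render`) is an assumption for illustration, not the project's actual API.

```rust
/// A block-level node in a deliberately tiny, hypothetical document model.
#[derive(Clone, Debug, PartialEq)]
pub enum Block {
    Heading { level: u8, text: String },
    Paragraph(String),
}

/// The parsed document: the editor manipulates this, never raw text.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Document {
    pub blocks: Vec<Block>,
}

impl Document {
    /// Parses blank-line-separated chunks. A chunk starting with one to six
    /// `#` characters becomes a heading (a simplification of real Markdown).
    pub fn parse(source: &str) -> Document {
        let blocks = source
            .split("\n\n")
            .map(str::trim)
            .filter(|chunk| !chunk.is_empty())
            .map(|chunk| {
                let level = chunk.chars().take_while(|c| *c == '#').count();
                if (1..=6).contains(&level) {
                    Block::Heading {
                        level: level as u8,
                        text: chunk[level..].trim().to_string(),
                    }
                } else {
                    Block::Paragraph(chunk.to_string())
                }
            })
            .collect();
        Document { blocks }
    }

    /// Semantic operation: "add a section about a new topic".
    pub fn add_section(&mut self, title: &str, body: &str) {
        self.blocks.push(Block::Heading { level: 2, text: title.to_string() });
        self.blocks.push(Block::Paragraph(body.to_string()));
    }

    /// Serializes the model back to text. Text is only an output format here.
    pub fn render(&self) -> String {
        self.blocks
            .iter()
            .map(|block| match block {
                Block::Heading { level, text } => {
                    format!("{} {}", "#".repeat(*level as usize), text)
                }
                Block::Paragraph(text) => text.clone(),
            })
            .collect::<Vec<_>>()
            .join("\n\n")
    }
}
```

Note that there is no "insert character" method anywhere: the only way to change the document is through operations such as `add_section`.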

Most of this is not implemented yet, but here are some basic ideas:

  • The AST is built using plain Rust structs. For example, to model this Markdown list, we might use a struct like this:

    #[derive(Clone, Debug, Semantic)]
    pub struct List {
        /// Whether the list is ordered (with numbers) or not.
        pub ordered: bool,
        /// The start number for an ordered list, or `None` if numbers should be auto-assigned.
        pub start: Option<u32>,
        /// Whether any of the children are `loose`.
        pub loose: bool,
        /// Child elements.
        pub children: Vec<ListItem>,
    }
  • There is a custom derivable trait, Semantic, that encapsulates everything the editor needs in order to edit this data structure. It might, for example, add the ability to diff two versions of the data structure so that the difference can be sent across the wire, or support editing operations such as “change list starting number.”
  • Any identifiers — pieces of text or similar that identify some specific entity, such as a variable name — are not to be defined as simple Strings, but instead get assigned an internal ID, and their definitions (like their name/scope) are stored elsewhere in a global table. To rename a variable, we simply change its global definition in one place instead of “search-replacing” it everywhere.
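The last two ideas can be sketched roughly as follows. Since none of this is implemented yet, everything here is an assumption: the shape of the `Semantic` trait (written by hand instead of derived), the `ListOp` operations, the trimmed-down `List`, and the `SymbolId`/`SymbolTable` names.

```rust
use std::collections::HashMap;

/// Hypothetical edit operations on a list.
#[derive(Clone, Debug, PartialEq)]
pub enum ListOp {
    SetOrdered(bool),
    SetStart(Option<u32>),
}

/// Assumed shape of the trait: apply one operation, or diff two versions
/// into the operations that turn `self` into `newer`.
pub trait Semantic {
    type Op;
    fn apply(&mut self, op: Self::Op);
    fn diff(&self, newer: &Self) -> Vec<Self::Op>;
}

/// Trimmed version of the `List` struct (children omitted for brevity).
#[derive(Clone, Debug, PartialEq)]
pub struct List {
    pub ordered: bool,
    pub start: Option<u32>,
}

// Hand-written stand-in for what `#[derive(Semantic)]` might generate.
impl Semantic for List {
    type Op = ListOp;

    fn apply(&mut self, op: ListOp) {
        match op {
            ListOp::SetOrdered(value) => self.ordered = value,
            ListOp::SetStart(value) => self.start = value,
        }
    }

    fn diff(&self, newer: &Self) -> Vec<ListOp> {
        let mut ops = Vec::new();
        if self.ordered != newer.ordered {
            ops.push(ListOp::SetOrdered(newer.ordered));
        }
        if self.start != newer.start {
            ops.push(ListOp::SetStart(newer.start));
        }
        ops
    }
}

/// Opaque identifier; nodes store this instead of a `String` name.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SymbolId(u32);

/// Global table mapping identifiers to their current names.
#[derive(Debug, Default)]
pub struct SymbolTable {
    names: HashMap<SymbolId, String>,
    next_id: u32,
}

impl SymbolTable {
    /// Registers a new name and returns the ID that nodes should refer to.
    pub fn define(&mut self, name: &str) -> SymbolId {
        let id = SymbolId(self.next_id);
        self.next_id += 1;
        self.names.insert(id, name.to_string());
        id
    }

    /// Renames a symbol in one place; returns `false` if the ID is unknown.
    pub fn rename(&mut self, id: SymbolId, new_name: &str) -> bool {
        match self.names.get_mut(&id) {
            Some(name) => {
                *name = new_name.to_string();
                true
            }
            None => false,
        }
    }

    /// Looks up the current name for an ID.
    pub fn name(&self, id: SymbolId) -> Option<&str> {
        self.names.get(&id).map(String::as_str)
    }
}
```

With this layout, every use of a variable holds the same `SymbolId`, so a single `rename` call changes what all of them display, and `diff` yields only the operations that need to travel over the wire.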

Frontend

Creating an application frontend is terribly difficult in this day and age. There are so many things to consider: platform-specific toolkits, localization, translation, accessibility, operating system integration…

The semantic-editor uses your web browser as its user interface. Instead of having a separate app, you open different editor sessions in your preferred browser and use your usual tab management to switch between them.

That means that the editor needs to provide a web frontend. It works like this:

  • When you start the semantic-editor, it starts an embedded HTTP server.
  • Visiting the web page hosted by this server will load a normal React application, which will in turn construct a user interface, for now using ant.design and slate.
  • The UI will load a version of semantic-editor as a WebAssembly module. This module contains the editor core, data structure definitions for what is being edited, and the necessary protocol bits to communicate with the backend.
  • The frontend WebAssembly module and the backend communicate over either WebSockets with Google Protocol Buffers or, if supported, gRPC. The protocol is event-based and modifies CRDTs (conflict-free replicated data types), inspired by ditto. This means that the backend can be more or less stateless, and you can restart the backend without interrupting the frontend experience very much.
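To show why an event-based CRDT protocol tolerates restarts and retries, here is a minimal sketch of one of the simplest CRDTs, a last-writer-wins register. It is not how ditto or the semantic-editor actually represent state; `Stamp`, `SetEvent` and `LwwRegister` are hypothetical names, and ordering writes by a counter plus a replica ID is an assumption.

```rust
/// Identifies a write; ordering is by `counter` first, then `replica`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Stamp {
    pub counter: u64,
    pub replica: u32,
}

/// An event on the wire: "set the field to `value`, written at `stamp`".
#[derive(Clone, Debug, PartialEq)]
pub struct SetEvent<T> {
    pub stamp: Stamp,
    pub value: T,
}

/// Last-writer-wins register holding a single value.
#[derive(Clone, Debug, PartialEq)]
pub struct LwwRegister<T> {
    stamp: Stamp,
    value: T,
}

impl<T: Clone> LwwRegister<T> {
    /// Starts with an initial value that any real write will override.
    pub fn new(initial: T) -> Self {
        LwwRegister {
            stamp: Stamp { counter: 0, replica: 0 },
            value: initial,
        }
    }

    /// Applies an event. Older or duplicate events are ignored, so the
    /// result does not depend on delivery order or on retransmissions.
    pub fn apply(&mut self, event: &SetEvent<T>) {
        if event.stamp > self.stamp {
            self.stamp = event.stamp;
            self.value = event.value.clone();
        }
    }

    /// Returns the current value.
    pub fn value(&self) -> &T {
        &self.value
    }
}
```

Because applying the same events in any order, any number of times, gives the same result, a restarted backend could in principle rebuild its view by having the frontend resend the events it has seen.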

I still haven’t figured out how to render the content that is being manipulated. Should it look like the original text, but only allow semantic operations? Should it render the content with a richer representation, potentially obscuring what is being edited? This requires some additional thought.

If you want to contribute, please feel free to file an issue in the project repository: https://github.com/dflemstr/semantic-editor. Any suggestions on the direction of the project are very welcome!