Code is Data, Data is Code

September 26, 2013
| 8 mins read

I was at the Strange Loop conference last week and was surprised by the level of interest in dynamic and functional languages, in particular Clojure. It was one of the most talked-about languages, along with JavaScript and Scala. The general consensus seemed to be that it is a very powerful language once you get your head around it.

I ended up talking with a Clojure developer there who told me why macros are much easier to create in Lisp and Clojure than in most other languages. Later, Andreas Fuchs from Stripe used a phrase that summed it up and was also a bit of an eye-opener for me. He said that in Lisp, “code is data and data is code”.

Code is Data. Data is Code.

It turns out that Lisp source code is written as ordinary data structures (mostly nested lists), so the text you type is already very close to the abstract syntax tree (AST) that an interpreter or compiler works with. This makes metaprogramming easier, since code can be treated as a basic data structure that the programming language already knows how to read, traverse and build.

For example, this piece of Clojure code adds ten and five together, but it’s also a list of three items: the symbol + and the numbers 10 and 5.

(+ 10 5)
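
To make the “list of three items” point concrete, here is a minimal sketch that can be pasted into a Clojure REPL. The name form is just an illustrative choice, not something from the talk.

```clojure
;; Quoting stops evaluation, so we get the code back as plain data.
(def form '(+ 10 5))

(list? form)   ; => true
(count form)   ; => 3
(first form)   ; => + (a symbol, not yet a function call)
(rest form)    ; => (10 5)

;; The very same list can be handed to eval to run it as code.
(eval form)    ; => 15
```

The quoted form is inspected with ordinary list functions and only becomes a computation when it is passed to eval.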

The “code is data” concept has a name: homoiconicity. It basically means that a program is written in the language’s own data structures, so its internal representation can be inferred by reading the text’s layout. This lets you build, inspect and transform code with the same functions you use on any other data, and evaluate the result on the fly at runtime without having to recompile the whole program to integrate new code. In most other languages, source code is plain text that the compiler has to parse into an AST, and that AST is usually not available to the program as an ordinary data structure.

Although I’ve developed in dynamic languages like Ruby before (which some people say shares similarities with Lisp and Clojure), I hadn’t worked with Lisp-based languages built on the foundation of “code is data”.
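
As a rough illustration of manipulating code at runtime, the sketch below reads program text into a data structure, rewrites it with ordinary sequence functions, and evaluates both versions. The names parsed and rewritten are made up for this example.

```clojure
;; read-string turns program text into ordinary Clojure data without running it.
(def parsed (read-string "(+ 10 5)"))

;; Because the form is a list, normal sequence functions can rewrite it.
;; Here we swap the operator for * and keep the original arguments.
(def rewritten (cons '* (rest parsed)))

(eval parsed)      ; => 15
(eval rewritten)   ; => 50
```

Nothing here needs a separate parser or AST library: the reader produces the data structure and the core sequence functions do the rewriting.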

Macros and Clojure

So what are some of the benefits of a language being homoiconic?

For one, it makes metaprogramming easier: writing programs that write or manipulate other programs as their data, which can be done at compile time or at runtime. It also makes it relatively easy to write domain-specific languages (DSLs). One of the more powerful effects of being homoiconic, though, is the ability to define custom macros.

Macros, in essence, allow you as a developer to define new language features. In most languages you would need to wait for a new release of the language to get new syntax; in Lisp (and Clojure) you can extend the core language with macros and add the features yourself.

Clojure has a programmatic macro system which allows the compiler to be extended by user code. Macros can be used to define syntactic constructs which would require primitives or built-in support in other languages. In fact, many of the core constructs of Clojure are not primitives but ordinary macros.
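
To show what adding a feature yourself can look like, here is a small sketch of a hypothetical unless macro, which runs its body only when the test is logically false. Clojure already ships when-not for this purpose, so unless is purely an illustration.

```clojure
;; unless is a made-up name for this example; clojure.core's when-not
;; already covers the same use case.
(defmacro unless [test & body]
  ;; Build the replacement code as a plain list: (if test nil (do body...))
  (list 'if test nil (cons 'do body)))

(unless false :ran)   ; => :ran
(unless true :ran)    ; => nil

;; macroexpand-1 shows the code the macro call is rewritten into.
(macroexpand-1 '(unless false :ran))
;; => (if false nil (do :ran))
```

The macro body is ordinary list-building code, which is only practical because the code being generated is itself a list.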

Take Clojure’s when macro, which evaluates its body only when a condition is logically true.

(when true (println "hello"))
; prints hello

Under the hood, when is a macro built from the special forms if and do. You can see this in action by using macroexpand, which expands the macro call into its underlying form.

(macroexpand '(when true (println "hello")))
; (if true (do (println "hello")))

Here’s the source code for the macro, defined using defmacro. You can see how it returns the expanded form using if and do.

(defmacro when
  "Evaluates test. If logical true, evaluates body in an implicit do."
  {:added "1.0"}
  [test & body]
  (list 'if test (cons 'do body)))
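
For comparison, the same expansion can be written with syntax-quote, which many Clojure macros use instead of list and cons. This is a sketch under the assumption that you define it under a different name (my-when here is hypothetical) so that it does not shadow the core macro.

```clojure
;; my-when is a hypothetical copy of when, written with syntax-quote (`),
;; unquote (~) and unquote-splicing (~@) instead of list and cons.
(defmacro my-when [test & body]
  `(if ~test (do ~@body)))

(macroexpand-1 '(my-when true (println "hello")))
;; => (if true (do (println "hello")))

(my-when true (println "hello"))
;; prints hello
```

Both styles return the same list; syntax-quote just lets the template look like the code it produces.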

Aren’t macros just functions?

You could do something similar with functions, but it’s not the same. Functions execute at runtime; they take and produce data (values). Conceptually, you can replace every invocation of a pure function with its value. Macros execute at compile time; they take and produce code. Conceptually, you can replace (expand) every occurrence of a macro with the code it returns. Macro arguments are not evaluated before the macro runs, and the return value is expanded in place and treated as code.
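
The difference in argument evaluation can be observed directly. The sketch below uses hypothetical helpers (calls, noisy, if-fn and if-macro) to count how often an unused branch is evaluated.

```clojure
;; A counter so we can observe whether an argument expression was evaluated.
(def calls (atom 0))
(defn noisy [] (swap! calls inc) :value)

;; Function version: all arguments are evaluated before if-fn is called.
(defn if-fn [test then else]
  (if test then else))

;; Macro version: the branches arrive as unevaluated code and are placed
;; inside an if form, so only the chosen branch ever runs.
(defmacro if-macro [test then else]
  (list 'if test then else))

(reset! calls 0)
(if-fn true :ok (noisy))
@calls   ; => 1, the unused branch still ran

(reset! calls 0)
(if-macro true :ok (noisy))
@calls   ; => 0, the unused branch was never evaluated
```

This is why constructs like when and if cannot be written as plain functions: a function has no way to skip evaluating its arguments.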

Rewind The Loop

I started out by saying how intrigued I was by why creating macros is easier in Lisp and Clojure. It comes down to their “code is data” foundation, which allows you to add macros (which appear like new language features) in a fluid and simple way without needing to release a new version of the core language.

By the way, why is it called Lisp? The name derives from “LISt Processing”. At the end of the day, Lisp source code is one big data structure: a list.
