Code is Data, Data is Code

September 26 2013


I was at the StrangeLoop conference last week and was surprised by the level of interest in dynamic and functional languages, in particular Clojure. It was one of the most talked about languages, along with JavaScript and Scala. The general consensus seemed to be that it is a very powerful language once you get your head around it.

I ended up talking with a Clojure developer there who told me why macros are much easier to create in Lisp and Clojure than most other languages. Later Andreas Fuchs from Stripe used a term which summed it up and was also a bit of an eye opener for me. He said in Lisp “code is data and data is code”.

Code is Data. Data is Code.

It turns out that Lisp source code is written directly as a data structure. The reader parses each expression into ordinary lists, symbols and literals, and that nested structure is effectively the program’s abstract syntax tree (AST), which simplifies the interpretation and execution of the code. This makes metaprogramming easier, since code can be treated as a basic data structure that the programming language already knows how to access.

For example, this piece of Clojure code adds ten and five together, but it’s also a list of three items in a valid list data structure.

(+ 10 5)
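To see this at a REPL, here is a minimal sketch that uses only core functions. The name form is just an illustrative choice. Quoting stops the expression from being evaluated, so it can be inspected like any other list:

```clojure
;; Quote the form so it is handed back as data instead of being evaluated.
(def form '(+ 10 5))

(list? form)    ; true: the form is an ordinary list
(count form)    ; 3 items
(first form)    ; the symbol +
(rest form)     ; the numbers (10 5)

;; Hand the same list to eval and it runs as code.
(eval form)     ; 15
```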

The “code is data” concept has a name: homoiconicity. It basically means that the program’s structure mirrors its syntax, so the program’s internal representation can be inferred by reading the layout of the text. Because code is just data, a program can build or transform new forms and evaluate them on the fly at runtime, extending what the language can do without a separate build step. In most other languages, the compiler first has to parse the source text into an AST, and that AST is usually not something the program itself can easily get at.
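As a rough illustration of manipulating code at runtime, the sketch below reads source text into data, rewrites it with ordinary list functions, and evaluates both versions. The names source-text, parsed and rewritten are made up for this example:

```clojure
;; Turn source text into data with the reader (nothing is evaluated yet).
(def source-text "(+ 10 5)")
(def parsed (read-string source-text))

;; Rewrite the code as data: swap + for * by building a new sequence.
(def rewritten (cons '* (rest parsed)))

(eval parsed)      ; 15
(eval rewritten)   ; 50
```

Note that read-string should only be used on trusted input, since the reader can be made to execute code.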

Although I’ve developed in dynamic languages like Ruby before (which some people say shares similarities with Lisp and Clojure), I hadn’t worked with Lisp-based languages built on the foundation of “code is data”.

 

Macros and Clojure

So what are some of the benefits of a language being homoiconic?

For one, it makes metaprogramming easier: writing programs that write or manipulate other programs as their data, which can be done at compile time or at runtime. It also makes it relatively easy to write DSLs. One of the more powerful effects of being homoiconic, though, is the ability to define custom macros.

Macros in essence allow you to define new language features as a developer. In most languages you would need to wait for a new release of the language to get new syntax – in Lisp (and Clojure) you can just extend the core language with macros and add the features yourself.

Clojure has a programmatic macro system which allows the compiler to be extended by user code. Macros can be used to define syntactic constructs which would require primitives or built-in support in other languages. In fact, many of the core constructs of Clojure are not primitives at all, but normal macros.

Take Clojure’s when form, which evaluates its body when a condition is true.

(when true (println "hello"))
; prints hello

This is actually a macro built from if and do. You can see how this works in action by using macro expansion, which expands the macro into its raw form.

(macroexpand '(when true (println "hello")))
; (if true (do (println "hello")))

Here’s the source code for the macro, defined using defmacro. You can see how it returns the expanded form using if and do.

(defmacro when
  "Evaluates test. If logical true, evaluates body in an implicit do."
  {:added "1.0"}
  [test & body]
  (list 'if test (cons 'do body)))
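Following the same pattern, here is a sketch of a user-defined macro. The name unless is hypothetical and not part of clojure.core (the built-in when-not covers the same ground). It only illustrates adding a construct without waiting for a language release:

```clojure
;; `unless` is a hypothetical name used for illustration only.
(defmacro unless
  "Evaluates body in an implicit do when test is logical false."
  [test & body]
  (list 'if test nil (cons 'do body)))

(unless false :ran)                    ; :ran
(unless true :ran)                     ; nil

;; Expanding one step shows the code the macro produces.
(macroexpand-1 '(unless false :ran))   ; (if false nil (do :ran))
```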

Aren’t macros just functions?


You could do something similar with functions, but it’s not the same. Functions execute at run time; they take and produce data (values). Conceptually, you can replace every function invocation with its value. Macros execute at compile time; they take and produce code. Conceptually, you can replace (expand) every macro call with its expansion. Macro arguments are not evaluated before the macro runs, and the macro’s return value is expanded in place and treated as code.
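The difference in argument evaluation can be observed directly. In this sketch, when-fn, note! and evaluations are hypothetical helpers written for the example. The function version evaluates its argument even when the test is false, while the real when macro never does:

```clojure
;; Hypothetical function version of `when`: its arguments are evaluated
;; before the function body ever runs.
(defn when-fn [test body-value]
  (if test body-value nil))

;; Records every evaluation so the difference is observable.
(def evaluations (atom []))

(defn note! [label]
  (swap! evaluations conj label)
  label)

(when-fn false (note! :fn-arg))   ; returns nil, but note! has already run
(when false (note! :macro-arg))   ; returns nil, and note! never runs

@evaluations                      ; [:fn-arg]
```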

Rewind The Loop

I started out by saying how intrigued I was by why creating macros is easier in Lisp and Clojure. It’s down to their “code is data…” foundation, which allows you to add macros (which appear like new language features) in a fluid and simple way without needing to release a new version of the core language.

By the way, why is it called Lisp? The name derives from “LISt Processing”. At the end of the day, Lisp source code is one big data structure: a list, usually with more lists nested inside it.



5 Responses to “Code is Data, Data is Code”

  1. One thing I really appreciate about code as data in a language such as Clojure is the power it gives your editor. The power of an editor is related to how well it can interact with and analyze code. Compare working in Ruby or Clojure in emacs. Paredit would be a nightmare in Ruby because of the syntax. In Clojure it’s a joy (eventually) to use. Look at tools like kibit (https://github.com/jonase/kibit), which detect bad patterns in your code by navigating your code as data.

    It’s possible in other languages to map code to data with forms such as S-expressions, but having it built in is a joy.

    • @josephwilk Thanks for the comment and additional insight. As I was writing the original post I remember coming across the fact that emacs was actually written in a dialect of Lisp. I guess that’s where some of the power of the editor comes from.

  2. LINQ, with Expressions as its foundation (http://msdn.microsoft.com/en-us/library/system.linq.expressions.aspx), brings functional programming concepts to C# 3.0 and enables the developer to interpret code and translate it into SQL/XPath/LDAP/… queries.

    Even though it’s very limited, it’s enough for a C# developer to understand the power of the technique.

  3. […] On Clojure and Code is Data, Data is Code […]

  4. How can I write a transformer or component using Clojure?
