SICP in Clojure - Chapter 4
In one of the previous blog posts I have announced that I would like to start a new series of posts. It is a persistent journal from my journey through aforementioned book. I hope that you will enjoy it and find it useful - the main goal is to make this series a place where we can return in future, recall ideas and thoughts that accompanied reading process.
By finishing the previous chapter we learned more about functional programming, designing and dealing with stateful computation and a little bit about laziness. It was pretty much a general purpose programming book till now. Last two chapters of the book are really … Lispy.
Chapter which will be discussed today is focused on Lisp core values built around universal evaluator, homoiconicity and linguistic abstractions.
What is homoiconicity?
Very concise, I would say a mathematical, definition will explain it as a isomorphic relation between language AST (Abstract Syntax Tree) and its syntax. In more human friendly words - it is a property of a programming language in which the program structure is similar to its syntax. If such language is homoiconic, it allows all code in the language to be accessed and transformed as data, using the same representation - because AST is exactly the same as the language itself.
All languages from Lisp family have this property, also languages like Io, Julia or Prolog also have this ability (to a certain degree of course). Keep in mind that it does not mean that having a macros system in the language makes it homoiconic.
Title of this section sounds like a difficult concept, where the core idea is really simple. Aforementioned abstraction is an ability to create new languages. We have done that previously (e.g. by creating various Domain Specific Languages when solving exercises). By the creation, authors also mean ability to evaluate (or interpret) constructs written in that newly created dialect, by calculating values from the prepared expressions. Program which does such thing is called an evaluator (or interpreter).
If we go one level deeper in the abstraction tree, a metacircular evaluator (or also often called a metacircular interpreter) is an evaluator written in the same language that it will interpret. It means that you can write an interpreter of any Lisp dialect in that chosen language.
Core of metacircular evaluator
Clojure REPL (actually any kind of REPL) is an evaluator. But also, our run-time environments are also built on top of such constructs.
In Lisp, the core of the evaluator is often called an
apply cycle. If we will dive into implementations presented in the book, we will immediately see a symmetry between them. Authors defined both of them as follows.
To evaluate a combination (a compound expression other than a special form), evaluate the subexpressions and then apply the value of the operator subexpression to the values of the operand subexpressions.
1 ;; You can either evaluate quoted expressions 2 ;; or strings, but keep in mind that string 3 ;; does not have an AST-like structure by itself. 4 ;; It needs to be parsed first (with a 5 ;; `read-string`). 6 7 (eval '(let [a 10] (+ 3 4 a))) 8 (eval (read-string "(+ 1 1)"))
Evaluation means that we take fragment of the code (in form of a quoted expression or parsed from a string) and evaluate that, using all rules of the language. In other words, it calculates the result of a certain expression. Keep in mind that a delivered expression is just a data structure - list of keywords, other tokens, and other data structures. And it looks exactly the same as the language itself. That is a practical implication of the homoiconicity.
To apply a compound procedure to a set of arguments, evaluate the body of the procedure in a new environment. To construct this environment, extend the environment part of the procedure object by a frame in which the formal parameters of the procedure are bound to the arguments to which the procedure is applied.
1 ;; Result of executing both expressions is 2 ;; exactly the same, but only the first one 3 ;; is an application. 4 ;; 5 ;; Function application means that you have to 6 ;; deliver all arguments upfront in a form of 7 ;; collection. 8 9 (apply str ["str1" "str2" "str3"]) 10 (str "str1" "str2" "str3")
At the first sight you will find that
apply is only a strange syntax for a function invocation. But then, the obvious reflection strikes in - it is exactly the opposite. Function call is a syntax sugar on top of
apply function. Moreover, having this function in your toolbox opens a different ways of thinking about invoking unknown functions, and build other concepts like partial application and currying based on that.
Combining both powers together
Process of interpreting a program is an interaction between them. How it looks like? Here is an excerpt from an implementation (again full version is inside my repository - afronski/sicp-in-examples):
1 (defn my-eval [exp env] 2 (cond (self-evaluating? exp) exp 3 (variable? exp) (lookup-variable-value exp env) 4 (quoted? exp) (text-of-quotation exp) 5 (assignment? exp) (my-eval-assignment exp env) 6 (definition? exp) (my-eval-definition exp env) 7 (if? exp) (my-eval-if exp env) 8 (lambda? exp) (make-procedure (lambda-parameters exp) 9 (lambda-body exp) 10 env) 11 (do? exp) (my-eval-sequence (do-actions exp) env) 12 (cond? exp) (my-eval (cond->if exp) env) 13 (application? exp) (my-apply (my-eval (operator exp) env) 14 (list-of-values (operands exp) env)) 15 16 :else (assert false "Unknown expression in `my-eval`."))) 17 18 (defn my-apply [proc args] 19 (cond (primitive-procedure? proc) 20 (my-apply-primitive-procedure proc args) 21 (compound-procedure? proc) 22 (my-eval-sequence (procedure-body proc) 23 (extend-environment (procedure-parameters proc)) 24 args 25 (procedure-environment proc)) 26 27 :else 28 (assert false "Unknown procedure type in `my-apply`.")))
Even without exact definitions of the used functions, code is pretty self-explanatory. As we can see evaluation requires in certain cases an application, and application requires evaluation of function body and arguments. They are often expressed as a yin-yang symbol, because they are complementing each other.
Different evaluation models
Instead of reimplementing different evaluation models, I have prepared different examples of such, built on top of Clojure standard library (or sometimes with additional custom facilities). We will start with the concept which we already know from the previous chapter.
We have met this concept earlier already. In the previous chapter we worked with streams and infinite collections which simulate e.g. computation process. But built-in mechanisms in Clojure have much more to offer in that matter. We have already created some infinite collections, but let us remind how it works:
1 (import java.util.UUID) 2 3 (defn uuid-seq  4 (lazy-seq 5 (cons (str (UUID/randomUUID)) 6 (uuid-seq)))) 7 8 (println (take 3 (uuid-seq))) 9 10 ; (b09b2a29-2cad-4cda-8e4c-8a9a5c136f05 11 ; 8ece35e6-202f-4977-9987-7292239833e4 12 ; 0a336e55-5e42-4312-87ea-24e86ba4311e)
First we are defining a
lazy-seq then we use standard mechanism of constructing the collection from the first, evaluated element and the rest, which evaluation will be deferred. What I mean by deferring? If you will try to put the following lines inside a file (but not inside the REPL - it will force the evaluation) you will receive nothing:
1 ; This returns a lazy collection, which 2 ; is not evaluated yet. 3 (map inc [1 2 3 4]) 4 5 ; You can force evaluation either by 6 ; enforcing simple run (and wait for 7 ; side-effects) or return the result 8 ; of the operation. 9 10 (dorun (map inc [1 2 3 4])) ; nil 11 (doall (map inc [1 2 3 4])) ; (2 3 4 5)
But it is not an only way of creating lazy sequences. You can use also
iterate in a following way:
1 ; `repeat` and `repeatedly` creates an infinite sequence 2 ; either of elements or results of a function call. You can 3 ; create infinite sequence or a limited one by passing an 4 ; argument or not. 5 6 (println (str (clojure.string/join (take 5 (repeat "Na "))) "Batman!")) 7 ; "Na Na Na Na Na Batman!" 8 9 (println (repeatedly 5 #(rand-int 100))) 10 ; 34 23 12 1 23 11 12 ; `cycle` returns a lazy collection with repetitions 13 ; of a delivered collection. 14 15 (println (take 5 (cycle [1 2 3]))) 16 ; (1 2 3 1 2) 17 18 ; `iterate` is a more generic constructor. It returns 19 ; a lazy sequence, which has the following values: 20 ; 21 ; x, f(x), f(f(x)), ... 22 ; 23 ; This also means, that used `f` functions should be 24 ; *pure* (no side-effects). 25 26 (println (take 5 (iterate (partial * 3) 1))) 27 ; (1 3 9 27 81)
But laziness can be also used in a different way.
Around 1961, John McCarthy (the inventor of LISP) described an interesting mathematical operator called
amb (from ambiguous). Essentially,
amb have to be called with arguments, but thanks to that - it can look into the future to keep that from happening. It does that by rewinding into the past whenever it sees trouble, and try a different choice.
It is called a backtracking algorithm. This technique is often used for solving problems with huge search space. The most canonical example is called 8 queens puzzle. Whole approach is partially based on top of laziness and searching problem space in a lazy way, basing on the constraints and then doing a backtracking.
In example presented below, we are trying to find all Pythagorean triple solutions in a specific range, passed as an argument:
1 ; Both `amb-let` and `amb-let-helper` implementations 2 ; are shamelessly taken from: 3 ; https://github.com/abeppu/toychest 4 5 (defn amb-let-helper [bindings body] 6 (if (< 0 (count bindings)) 7 (let [[form expression] (take 2 bindings) 8 more-bindings (drop 2 bindings) 9 10 filtered-recurse (if (= :where (first more-bindings)) 11 `(when ~(second more-bindings) 12 ~(amb-let-helper (drop 2 more-bindings) body)) 13 (amb-let-helper more-bindings body)) 14 15 res (if (and (seq? expression) 16 (= 'amb (first expression))) 17 `(apply concat (for [~form ~(second expression)] 18 ~filtered-recurse)) 19 `(let [~form ~expression] 20 ~filtered-recurse))] 21 res) 22 [body])) 23 24 ; Macro definition. 25 26 (defmacro amb-let [bindings body] 27 (amb-let-helper bindings body)) 28 29 ; Defining problem and its constraints. 30 ; We would like to calculate all triples in range 100 that 31 ; fullfilling following conditions: 32 ; 33 ; 2 < a < MAX 34 ; a <= b < MAX 35 ; b <= c < MAX 36 ; 37 ; a^2 + b^2 = c^2 38 39 (defn triple [max] 40 (amb-let [a (amb (range 1 max)) :where (> a 2) 41 b (amb (range a max)) 42 c (amb (range b max)) 43 44 :where (= (+ (* a a) (* b b)) 45 (* c c))] 46 [a b c])) 47 48 (println (triple 20)) 49 ; ([3 4 5] [5 12 13] [6 8 10] [8 15 17] [9 12 15])
Talking about backtracking, we can again building on top of that concept power our next evaluator extension. We can use it for logic programming and it is described in the book as a last enhancement.
Book takes that concept as a last one, by implementing own version of logic engine in the Scheme. In Clojure and ClojureScript there is no point of doing that, because we have it in the set of additional libraries. It is called
core.logic and it is delivered as a separate library.
In prepared example we will take the most common problem when it comes to the logic programming kindergarten - simple genealogy questions. It may sound simple, but the provided relations, facts and queries will show the basic unification mechanism:
1 (ns logic-example.core 2 (:use [clojure.core.logic.pldb])) 3 4 ; In the logic programming we are creating *relations* and *facts*. 5 ; Relation describes how to interpret *facts*, with certain associations. 6 7 (db-rel father Father Child) 8 (db-rel mother Mother Child) 9 10 ; *Facts* are the truths, nothing more than a specific data structure 11 ; which describes our state of knowledge. 12 13 (def genealogy 14 (db 15 [father 'Adam 'Wiliam] 16 [father 'Adam 'Thomas] 17 [father 'Andrew 'Jessica] 18 [father 'Andrew 'Mark] 19 ; We are deliberately omitting Dorothy's father here. 20 21 [mother 'Eve 'Wiliam] 22 [mother 'Eve 'Thomas] 23 [mother 'Eve 'Jessica] 24 [mother 'Angie 'Mark] 25 [mother 'Angie 'Dorothy])) 26 27 ; Having *facts* and *relations* we can query them and thanks to them 28 ; `unification` mechanism, based on defined relations and facts available 29 ; in the database our logic engine will answer to that query with one, 30 ; more or no results. 31 32 (defn jessica-mother 33 (with-db genealogy 34 (run* [q] 35 (mother q 'Jessica)))) 36 37 ; user=> (logic-example.core/jessica-mother) 38 ; (Eve) 39 40 (defn adam-children  41 (with-db genealogy 42 (run* [q] 43 (father 'Adam q)))) 44 45 ; user=> (logic-example.core/dorothy-father) 46 ; (Thomas Wiliam) 47 48 (defn dorothy-father  49 (with-db genealogy 50 (run* [q] 51 (father q 'Dorothy)))) 52 53 ; user=> (logic-example.core/dorothy-father) 54 ; ()
Depending on the knowledge and the environment, answers to the prepared questions are different. Query can return either one, more or no results. Everything is related with previously defined facts and relations. It looks pretty amazing, and that is only an introduction to that topic. For more, I will recommend you to read either about Prolog (you can start from here) or play with this tutorial.
We have managed to finish 4th chapter of the book. In the last part we will attack problems with which we are already familiar, but on the lowest possible level. We will focus on hardware specifics of Lisp evaluator implementations, including design constraints and limitations related with those topics.
I hope that we will meet there again!