Wednesday, November 25, 2009

SICP Exercise 1.12: Pascal's Triangle

From SICP section 1.2.2 Tree Recursion

Exercise 1.12 asks us to write a procedure that computes the elements of Pascal's triangle by means of a recursive process.

The numbers along the left and right edges of the triangle are all 1, and each number inside the triangle is the sum of the two numbers directly above it.

We'll design our procedure to take the row and column numbers as arguments, and return the value of the element in that position.
(pascal row col)

We start counting both rows and columns at 0, as pictured in the diagram. So the procedure call
(pascal 4 2)
should return a value of 6, since that is the value in row 4, column 2.

The first thing we should note is that row n of the triangle contains exactly n + 1 entries, in columns 0 through n. This means that we shouldn't expect to call our procedure with a column argument greater than the row argument. So, the following call would be expected to return nonsense information:
(pascal 3 4)

We should also note that the 0th and nth column of any row should return 1. This is enough to get us started.
(define (pascal row col)
  (cond ((< row col) #f)
        ((or (= 0 col) (= row col)) 1)
        (else #t)))

Here we're just defining the initial logic of the procedure. If the row argument is less than the column, we simply return #f (probably not the best error message I've ever written, but it's not the focus of the exercise). If the column is the 0th or nth column, we return 1, otherwise we return a value of #t. This last value is just a placeholder which we'll need to replace with a calculation of the correct Pascal number.

To finish up the procedure, we need to recursively calculate values from the interior of the triangle. An interior value is defined as the sum of the two values directly above it in the triangle. So, for example, to calculate Pascal(4, 2) we would sum Pascal(3, 1) and Pascal(3, 2). Each of these values (since they are both interior values) would be calculated in the same manner, but the recursive property of our procedure will take care of that for us.

The two values to sum in order to calculate Pascal(row, column) are always going to be the value in the previous row and same column, Pascal(row-1, column), and the value in the previous row and previous column, Pascal(row-1, column-1). Translating this information to Scheme and adding it to our earlier procedure, we get our final answer to the exercise:
(define (pascal row col)
  (cond ((< row col) #f)
        ((or (= 0 col) (= row col)) 1)
        (else (+ (pascal (- row 1) col)
                 (pascal (- row 1) (- col 1))))))

Try several examples from the Pascal's triangle illustration to verify that the results are correct. Note that since this procedure generates a recursive process, calculating larger values can be extremely time consuming. It takes a noticeable amount of time (a few seconds) on my machine to calculate values in the 25th row.
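If you'd like to experiment outside of a Scheme interpreter, here's a rough Python translation of the final procedure (the function name and the use of None in place of #f are my own choices):

```python
def pascal(row, col):
    """Return the element of Pascal's triangle at (row, col), 0-indexed."""
    if row < col:
        return None  # out-of-range query, standing in for the #f above
    if col == 0 or col == row:
        return 1     # the left and right edges are always 1
    # An interior element is the sum of the two elements directly above it.
    return pascal(row - 1, col) + pascal(row - 1, col - 1)

print(pascal(4, 2))  # → 6
print([pascal(4, c) for c in range(5)])  # → [1, 4, 6, 4, 1]
```

Like the Scheme version, this generates a tree-recursive process, so it slows down quickly as the row number grows.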

For more fun with Pascal's triangle see Pascal's Triangle and its Patterns.


For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Saturday, November 21, 2009

Unsolved: Collatz conjecture

First proposed in 1937 by the German mathematician Lothar Collatz, the Collatz conjecture is also known as the 3n + 1 problem, Kakutani's problem, the Syracuse problem, Thwaites' conjecture, and Ulam's conjecture. The conjecture starts with the following simple procedure:
  1. Let n be any natural number (n > 0).
  2. If n is odd, triple it and add 1 (n = 3n + 1).
  3. If n is even, divide it in half (n = n/2).
  4. Stop if n = 1, otherwise go back to step 2.
The Collatz conjecture asks:
Does the above process always terminate (end with n = 1) for any starting value of n?
For example, starting with a value of n = 42, we get the following sequence

{ 42, 21, 64, 32, 16, 8, 4, 2, 1 }

The sequences produced by the Collatz procedure are known as hailstone sequences.
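The procedure is simple enough to sketch in a few lines of Python (the function name is my own):

```python
def hailstone(n):
    """Return the hailstone sequence starting from n (n > 0)."""
    seq = [n]
    while n != 1:
        # Odd: triple it and add 1. Even: divide it in half.
        n = 3 * n + 1 if n % 2 else n // 2
        seq.append(n)
    return seq

print(hailstone(42))  # → [42, 21, 64, 32, 16, 8, 4, 2, 1]
```

Note that the loop simply assumes termination; if the conjecture were false for some n, this function would never return.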

The conjecture remains unanswered, although it has been verified by computer that the process terminates for all values of n up to 5.764 × 10^18. In 1972, John Conway proved that problems of this general type can be undecidable, so it is not known whether a solution is even possible.

Wednesday, November 18, 2009

SICP Exercise 1.11

From SICP section 1.2.2 Tree Recursion

Exercise 1.11 gives us a function defined by the rules that

f(n) = n if n < 3
f(n) = f(n - 1) + 2f(n - 2) + 3f(n - 3) if n ≥ 3

and asks us to write both recursive and iterative procedures for the function.

Since the function is defined recursively, it's an easy matter of a straight translation to Scheme to come up with the recursive definition.
(define (f n)
  (if (< n 3)
      n
      (+ (f (- n 1))
         (* 2 (f (- n 2)))
         (* 3 (f (- n 3))))))

Here are the results of several sample runs:
> (f 0)
0
> (f 1)
1
> (f 2)
2
> (f 3)
4
> (f 4)
11
> (f 5)
25
> (f 6)
59
> (f 7)
142

The procedure above is a recursive process because the calculation of some of the state information is deferred until a later call of the procedure. Defining an iterative procedure means we have to remove this property. That means the iterative procedure has to keep track of all state information at each step of the calculation in explicit variables.

We can draw inspiration from the iterative definition of the Fibonacci procedure from the text.
(define (fib n)
  (fib-iter 1 0 n))

(define (fib-iter a b count)
  (if (= count 0)
      b
      (fib-iter (+ a b) a (- count 1))))

Here the fib procedure makes a call to the fib-iter procedure with initial values passed in for the first two parameters, along with a counter. We can use the same pattern for our iterative definition of f and f-iter.
(define (f n)
  (if (< n 3)
      n
      (f-iter 2 1 0 n)))

(define (f-iter a b c count)
  (if (< count 3)
      a
      (f-iter (+ a (* 2 b) (* 3 c))
              a
              b
              (- count 1))))

In this definition of (f n), we start with the same check as before. If n is less than 3, then f(n) = n, so we need go no further. Otherwise, we iterate for as long as necessary.

f-iter takes three initial values a, b, and c, plus a counter. When counter reaches a value less than 3, then a will be equal to f(n), b will be equal to f(n-1), and c will be equal to f(n-2). Each time we iterate the value of a is recalculated and the old values of a and b are passed into the procedure for the parameters b and c. The counter is also decremented by 1.

You can copy and paste the code above into your Scheme interpreter to verify that we get the same results as the sample runs above.
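If you'd rather check the two versions against each other programmatically, here is a Python rendering of both procedures (the function names are my own):

```python
def f_recursive(n):
    # Direct translation of the recursive definition.
    if n < 3:
        return n
    return (f_recursive(n - 1)
            + 2 * f_recursive(n - 2)
            + 3 * f_recursive(n - 3))

def f_iterative(n):
    if n < 3:
        return n
    # a, b, c track f(k), f(k-1), f(k-2) as k counts up from 2 to n.
    a, b, c = 2, 1, 0
    for _ in range(n - 2):
        a, b, c = a + 2 * b + 3 * c, a, b
    return a

print([f_iterative(n) for n in range(8)])  # → [0, 1, 2, 4, 11, 25, 59, 142]
```

The two functions agree for every n, but the iterative version stays fast for large inputs while the recursive one blows up exponentially.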


For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Wednesday, November 11, 2009

Russian Roulette

Here's a nice little probability problem that I first saw in David Darling's Universal Book of Mathematics:

You're in a game of Russian Roulette (with only one other player), but instead of one bullet the gun (a six-shooter revolver) is loaded with three bullets in three consecutive chambers. The barrel is spun only once. Each player points the gun at his head and pulls the trigger. If he is still alive, the gun is passed to the other player. The game ends when one player dies. Would you stand a better chance of surviving if you shot first or second, or does it make a difference?

This puzzle, or a variation of it, is often used as a brain teaser interview question. To avoid spoiling it for anyone, I'll add the solution as a comment to this post (so don't scroll down until you think you've worked it out).

Tuesday, November 10, 2009

Unsolved: The Perfect Cuboid Problem

In Tomorrow's Math, C. Stanley Ogilvy asks
Does there exist a rectangular parallelepiped (box) all of whose edges and face diagonals are of integral length, and the length of whose main diagonal is also an integer?
Put another way, can you construct a rectangular solid where all three edge lengths, all three face diagonals, and the long space diagonal have integer values (or prove that such a box cannot be constructed)?

This deceptively simple-sounding problem is known as the Perfect Cuboid Problem (among other names). It is closely related to the problem of finding an Euler Brick, which has many solutions, the smallest of which is a box with edges 125, 244, and 267.

No perfect cuboids were found during an exhaustive search of the integers up to 10 billion (10^10). This lends evidence (but is far from a proof) to the suspicion that no perfect cuboids exist. One near miss is the cuboid that has edges with lengths 44, 117, and 240, and face diagonals with lengths 125, 244, and 267.
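The near miss is easy to verify with a few lines of Python: all three face diagonals come out as integers, but the space diagonal does not.

```python
from math import isqrt

a, b, c = 44, 117, 240  # edges of the near-miss cuboid

# Each face diagonal is sqrt(x^2 + y^2); all three are perfect squares here.
faces = [isqrt(x * x + y * y) for x, y in [(a, b), (a, c), (b, c)]]
print(faces)  # → [125, 244, 267]

# The space diagonal is sqrt(a^2 + b^2 + c^2) = sqrt(73225), which is
# not an integer (270^2 = 72900 and 271^2 = 73441).
s2 = a * a + b * b + c * c
print(isqrt(s2) ** 2 == s2)  # → False
```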

Sunday, November 8, 2009

SICP Exercises 1.9 and 1.10

Exercises from section 1.2.1 Linear Recursion and Iteration

Exercise 1.9 asks us to use the substitution model to illustrate the processes generated by the following two procedures:
(define (+ a b)
   (if (= a 0)
       b
       (inc (+ (dec a) b))))

(define (+ a b)
   (if (= a 0)
       b
       (+ (dec a) (inc b))))
You may recognize these procedures as the two Peano Arithmetic examples from Lecture 1B. As we saw in the lecture video, the first procedure produces a linear recursive process, while the second produces an iterative process (the two procedures are given here in the reverse order from the lecture).

Remember also from that lecture that the steps for evaluating an application are:
  • Evaluate the operator to get procedure
  • Evaluate the operands to get arguments
  • Apply the procedure to the arguments
  • Copy the body of the procedure substituting the arguments supplied for the formal parameters of the procedure.
  • Evaluate the resulting new body.

If we are to evaluate (+ 4 5) using the first procedure we'll see that the increment operations are deferred.
(+ 4 5)
Evaluate the operator to get the procedure
(if (= a 0)
    b
    (inc (+ (dec a) b)))

Evaluate the operands to get arguments
(if (= 4 0)
    5
    (inc (+ (dec 4) 5)))

Since 4 is in fact not equal to 0, this expression evaluates to
(inc (+ (dec 4) 5))

This is where things begin to get interesting. We can evaluate (dec 4), but then we need to evaluate the recursive call to + before we're able to evaluate the inc operation. That means the inc operation is deferred until we return from the recursive call.

Greatly simplified, the expansion of this expression proceeds as follows:
(inc (+ 3 5))
(inc (inc (+ (dec 3) 5)))
(inc (inc (+ 2 5)))
(inc (inc (inc (+ (dec 2) 5))))
(inc (inc (inc (+ 1 5))))
(inc (inc (inc (inc (+ (dec 1) 5)))))
(inc (inc (inc (inc (+ 0 5)))))

At this point, the innermost expression, (+ 0 5), can be evaluated because the first operand is 0, which is the base case of the recursive definition. The process will now contract until we have our answer.
(inc (inc (inc (inc 5))))
(inc (inc (inc 6)))
(inc (inc 7))
(inc 8)

Evaluating (+ 4 5) with the iterative procedure is much more straightforward, as we should expect. No operations are deferred, so we don't see the expand-and-contract process shape that is the signature of a recursive process.
(+ 4 5)
(+ (dec 4) (inc 5))
(+ 3 6)
(+ (dec 3) (inc 6))
(+ 2 7)
(+ (dec 2) (inc 7))
(+ 1 8)
(+ (dec 1) (inc 8))
(+ 0 9)
9
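To see the two process shapes side by side outside the substitution model, here is a Python sketch of both procedures, with the inc and dec operations written out as +1 and -1:

```python
def add_recursive(a, b):
    # The increment is deferred until the recursive call returns,
    # so the pending +1 operations pile up on the call stack.
    if a == 0:
        return b
    return add_recursive(a - 1, b) + 1

def add_iterative(a, b):
    # All state lives in the two variables; nothing is deferred.
    while a != 0:
        a, b = a - 1, b + 1
    return b

print(add_recursive(4, 5), add_iterative(4, 5))  # → 9 9
```

Both give the same answer, but the recursive version uses stack space proportional to a, while the iterative version runs in constant space.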

Exercise 1.10 gives the following definition for Ackermann's function (note that this is a different definition than the one given in the Wikipedia entry for the Ackermann function. The analysis that follows is for the procedure given in the text.):
(define (A x y)
   (cond ((= y 0) 0)
         ((= x 0) (* 2 y))
         ((= y 1) 2)
         (else (A (- x 1)
                  (A x (- y 1))))))

The next part of the exercise asks us to evaluate a few expressions that use the function. Our Scheme interpreter is the best tool for this job.
> (A 1 10)
1024
> (A 2 4)
65536
> (A 3 3)
65536
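A direct Python translation of the text's procedure makes it easy to reproduce these results:

```python
def A(x, y):
    # Translation of the cond clauses from the Scheme definition above.
    if y == 0:
        return 0
    if x == 0:
        return 2 * y
    if y == 1:
        return 2
    return A(x - 1, A(x, y - 1))

print(A(1, 10), A(2, 4), A(3, 3))  # → 1024 65536 65536
```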

Next, we're asked to give concise mathematical definitions for the following functions defined in terms of A:
(define (f n) (A 0 n))

(define (g n) (A 1 n))

(define (h n) (A 2 n))

I'll take them one at a time. If we plug a few values in for (f n) we can see how the function grows.
> (f 0)
0
> (f 1)
2
> (f 2)
4
> (f 3)
6
> (f 4)
8

It's pretty clear from these results that

f(n) = 2n

which is what we should expect from examining the given procedures.
(define (f n) (A 0 n))
(f n) just calls A with the x argument set to 0. The procedure for A has a condition defined just for this input.
((= x 0) (* 2 y))

Next, plugging in a few values for (g n) we get:
> (g 0)
0
> (g 1)
2
> (g 2)
4
> (g 3)
8
> (g 4)
16

This sequence is growing exponentially.

g(n) = 2^n, when n > 0
g(n) = 0, when n = 0

Finally, plugging in values for (h n) we get:
> (h 0)
0
> (h 1)
2
> (h 2)
4
> (h 3)
16
> (h 4)
65536

This one had me scratching my head for a few minutes. If you evaluate (h 5) you'll see that it's a number so large that I won't even try to describe it. I didn't see a pattern in these results until I wrote them out in a slightly different way.

h(0) = 0
h(1) = 2 = 2^1
h(2) = 4 = 2^2
h(3) = 16 = 2^4
h(4) = 65536 = 2^16

From these results I noticed that

h(n) = 2^h(n-1), when n > 1
h(1) = 2
h(0) = 0
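The pattern can be checked directly against A (in Python again; h is just A with x fixed at 2):

```python
def A(x, y):
    if y == 0:
        return 0
    if x == 0:
        return 2 * y
    if y == 1:
        return 2
    return A(x - 1, A(x, y - 1))

def h(n):
    return A(2, n)

print([h(n) for n in range(5)])  # → [0, 2, 4, 16, 65536]

# For n > 1, each term is 2 raised to the previous term.
print(all(h(n) == 2 ** h(n - 1) for n in range(2, 5)))  # → True
```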


For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Monday, November 2, 2009

SICP - Notes from Lecture 1B

Procedures and Processes:
Substitution Model
Covers Text Section 1.2

You can download the video lecture or stream it on MIT's OpenCourseWare site. It's also available on YouTube.

This lecture is presented by MIT’s Gerald Jay Sussman.

Part 1

This lecture concentrates on understanding how patterns of procedures and expressions cause particular behaviors and patterns of execution. In particular, we’re going to look at how different patterns of writing procedures can be used to write iterative and recursive processes. This was touched on in the previous lecture, but will be explored in much more depth here.

The substitution model is illustrated using the following definition for computing the sum of squares:
(define (sos x y)
  (+ (sq x) (sq y)))

(define (sq x)
  (* x x))

Substitution Rule

To evaluate an application:
  • Evaluate the operator to get procedure
  • Evaluate the operands to get arguments
  • Apply the procedure to the arguments
  • Copy the body of the procedure substituting the arguments supplied for the formal parameters of the procedure.
  • Evaluate the resulting new body.

When evaluating the operands, it doesn't really matter what order you do it in.

Key Quotes:
The key to understanding complicated things is to know what not to look at, and what not to compute, and what not to think.

One of the things we have to learn how to do is ignore details.

Using the substitution rule (which is stressed to be an inaccurate model that we will refine later) and ignoring as much irrelevant detail as possible, we can show how the sum of squares is calculated.

(sos 3 4)
(+ (sq 3) (sq 4))
(+ (sq 3) (* 4 4))
(+ (sq 3) 16)
(+ (* 3 3) 16)
(+ 9 16)

In the first step we evaluate the sos operator to get its procedure. We substitute the values of the parameters into the definition. We can treat the + operator as primitive, so there’s no need to evaluate it. Next we need to evaluate the two sq operators. It doesn’t matter which order we evaluate them in. Each of these evaluations follows the same rule. We evaluate the operators to get the sq procedures, substituting the values of the arguments. At each step, we evaluate an expression as soon as it is reduced to primitives. Remember from the last lecture that this is called applicative-order evaluation. (This is in contrast to normal-order evaluation, where we would do the substitution of the expressions for the operands inside the body first. "Fully expand and then reduce.")

The IF Expression – A special form

One exception to our evaluation rules is the IF expression.

To evaluate an IF expression
  • Evaluate the predicate expression
  • If it yields TRUE
  • evaluate the consequent expression
  • otherwise
  • evaluate the alternative expression

The key here is that only one of the consequent or alternative expressions will ever be evaluated. This is different from the normal evaluation rule, which is why conditionals are termed a special form.

Peano Arithmetic (part 1)

The following procedure is given as an example of one way that addition could be defined in Scheme.
(define (+ x y)
  (if (= x 0)
      y
      (+ (dec x) (inc y))))
Note: In the lecture, Prof. Sussman uses the 1+ and -1+ operators for increment and decrement. For readability, I've replaced them with inc and dec.

In this example we are counting down x, while counting up y. The real-world analogy given in the lecture is to imagine that you have two piles of stones, one with x number of stones, the other with y stones. With this definition for the + operator, we're just taking stones from the x pile and placing them in the y pile one at a time. When the size of the x pile reaches 0, the y pile has all the stones and we return that as the answer.

Part 2

During the first few minutes of this part of the lecture Professor Sussman takes a few moments to explain the idea of test strips in photography. He explains that a good photographer needs to be able to visualize how a photograph will look before they take a photo. In order to develop this ability, a photographer will take a series of photos under slightly different conditions in order to see what different effects are produced by each small change.

This introduces the concept of simple perturbational analysis in programming. The idea is basically the same as a photographer’s test strips. We make slight changes to a program and observe the change in output from before and after the change. By this method we can learn what kinds of program structure produce what kinds of effects in our processes.

Peano Arithmetic (part 2)

The following alternative procedure for addition is given:
(define (+ x y)
  (if (= x 0)
      y
      (inc (+ (dec x) y))))

This alternative definition for the + operator is the same in all respects but one. In the last line of the procedure, all increment operations are deferred until all of the decrement operations have been completed.

In this example, you still remove the stones from the x pile one at a time. But instead of removing one stone from the x pile and immediately transferring it to the y pile, you hold each x stone in a buffer. Once all the stones are removed from the x pile, you then add them (again, one at a time) into the y pile. Incrementing the y pile is deferred until all decrements of the x pile are complete.

These two processes for addition have very different behaviors.

The complexities for the first procedure are

time = O(x)
space = O(1)

This is called an iterative process. The process will complete in an amount of time proportional to the size of x, and a constant amount of space is used.

The complexities for the second procedure are

time = O(x)
space = O(x)

This kind of process is called a linear recursion (linear because it is proportional to x in both time and space).

This brings up the difference between a recursive procedure and a recursive process. Both of the procedures for Peano Arithmetic that we've looked at have been defined recursively. A recursive procedure is simply one that is defined in terms of itself. A recursive process is one that defers some operation until a later stage of the process.

An iterative process carries all of its state in explicit variables. A recursive process keeps some of the state information as a deferred operation.

Part 3

The next part of the lecture starts with a recursive definition that most everyone will be familiar with, a procedure for generating Fibonacci numbers. You may remember that the Fibonacci sequence is generated by starting with 0 and 1, then the next number in the sequence is the sum of the previous two.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …

Mathematically, you can compute the nth Fibonacci number by summing the (n-1)th and (n-2)th Fibonacci numbers. When n is 0 or 1, the nth Fibonacci number is simply n. This leads directly to the following recursive definition for the nth Fibonacci term:
(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1))
         (fib (- n 2)))))
Can you come up with an iterative process to compute the same value? See section 1.2.2 Tree Recursion of the text for the iterative solution.

Note that the fib procedure above calls itself not once, but twice. This produces a different process structure than the recursive structure we’ve seen before. This pattern is called tree recursion, because for every call to fib, it branches into two more calls, which themselves branch into two more calls each, and so on until you reach the leaf nodes of the tree.

The complexities for this procedure are

time = O(2^n)
space = O(n)

The time complexity for the Fibonacci procedure is exponential because for each increment you add to n, you very nearly double the size of the graph that is produced in the execution of the procedure. You can see in the graph above (taken from the text) that the entire subtree for calculating fib(4) would be repeated again if you were to expand the graph to compute fib(6).

The space complexity is only linear because as n increases by one, the height of the tree (which represents the number of deferred operations that need to stay in memory) also only increases by one.
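One way to see the exponential growth concretely is to count how many calls the tree-recursive fib makes (a quick Python sketch; the call counter is my own addition):

```python
def fib_calls(n):
    """Return (fib(n), number of calls made to compute it)."""
    calls = 0
    def fib(n):
        nonlocal calls
        calls += 1  # count every entry into the tree
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)
    return fib(n), calls

for n in (5, 10, 20):
    print(n, fib_calls(n))
# → 5 (5, 15)
# → 10 (55, 177)
# → 20 (6765, 21891)
```

Each increase of 10 in n multiplies the number of calls by more than a hundred, which is exactly the exponential blowup the graph illustrates.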

Key Quote:
The way you construct a recursive process is by wishful thinking.

The lecture concludes with a demonstration of the Towers of Hanoi problem. Most of the time in this section is spent explaining the problem and manually demonstrating a solution with a real set of wooden pegs and disks. I won’t try to reproduce that here. It’s worth watching the lecture to see how a recursive solution is arrived at by “wishful thinking.” That is, Prof. Sussman builds a solution for moving n disks by assuming you can move (n – 1) disks, then recursively applying the procedure down to the point where only 1 disk needs to be moved.

The recursive solution is built on the assumption that a procedure to solve the problem already exists, but only for smaller instances of the problem. Sussman then decomposes the problem until he reaches the point where a solution does exist, moving one disk. This is the general solution for solving problems recursively.