Saturday, December 4, 2010

SICP 2.7 - 2.11: Extended Exercise: Interval Arithmetic (Part 1)

From SICP section 2.1.4 Extended Exercise: Interval Arithmetic

Section 2.1.4 is a small project that has us design and implement a system for working with intervals (objects that represent the range of possible values of an inexact quantity). We need to implement interval arithmetic as a set of arithmetic operations for combining intervals. The result of adding, subtracting, multiplying, or dividing two intervals is itself an interval, representing the range of the result. We're provided with the procedures for adding, multiplying, and dividing two intervals.

The minimum value the sum could be is the sum of the two lower bounds and the maximum value it could be is the sum of the two upper bounds:
(define (add-interval x y)
(make-interval (+ (lower-bound x) (lower-bound y))
(+ (upper-bound x) (upper-bound y))))

The product of two intervals is computed by finding the minimum and the maximum of the products of the bounds and using them as the bounds of the resulting interval.
(define (mul-interval x y)
(let ((p1 (* (lower-bound x) (lower-bound y)))
(p2 (* (lower-bound x) (upper-bound y)))
(p3 (* (upper-bound x) (lower-bound y)))
(p4 (* (upper-bound x) (upper-bound y))))
(make-interval (min p1 p2 p3 p4)
(max p1 p2 p3 p4))))

We can divide two intervals in terms of the mul-interval procedure by multiplying the first interval by the reciprocal of the second. Note that the bounds of the reciprocal interval are the reciprocal of the upper bound and the reciprocal of the lower bound, in that order.
(define (div-interval x y)
(mul-interval x
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y)))))


Exercise 2.7 gives us the implementation for make-interval and asks us to complete the interval abstraction by implementing the selectors.
; by convention the bounds are added in lower-upper order
(define (make-interval a b) (cons a b))

(define (upper-bound intrvl)
(cdr intrvl))

(define (lower-bound intrvl)
(car intrvl))

Now that we have most of our definitions in place, we can start testing some of the given procedures.
> (define a (make-interval 5 10))
> (define b (make-interval 10 20))
> (add-interval a b)
(15 . 30)
> (mul-interval a b)
(50 . 200)
> (div-interval a b)
(0.25 . 1.0)


Exercise 2.8 asks us to implement a procedure for computing the difference of two intervals, sub-interval, using reasoning similar to that used to implement add-interval. The smallest value possible when subtracting interval y from interval x is to subtract the upper bound of y from the lower bound of x. The largest result of subtracting interval y from interval x is to subtract the lower bound of y from the upper bound of x.
; interval subtraction: [a,b] − [c,d] = [a − d, b − c]
(define (sub-interval x y)
(make-interval (- (lower-bound x) (upper-bound y))
(- (upper-bound x) (lower-bound y))))

We can test this with a few different intervals to show that it works whether the intervals overlap or not.
> (define a (make-interval 1 10))
> (define b (make-interval 50 100))
> (define c (make-interval 5 20))
> (sub-interval b a)
(40 . 99)
> (sub-interval a b)
(-99 . -40)
> (sub-interval a c)
(-19 . 5)
> (sub-interval c a)
(-5 . 19)


Exercise 2.9 defines the width of an interval as half the difference between its upper and lower bounds. For some arithmetic operations the width of the result of combining two intervals is a function only of the widths of the argument intervals, whereas for others the width of the combination is not a function of the widths of the argument intervals. We are to show that the width of the sum (or difference) of two intervals is a function only of the widths of the intervals being added (or subtracted).

I think this is best proven mathematically by showing that the width of the sum of two intervals is the same as the sum of the widths of two intervals. We start with a couple of definitions for interval addition and computing the width of an interval:

[a,b] + [c,d] = [a + c, b + d]
width([x, y]) = (y - x) / 2

Now we can combine these to come up with a formula for the width of the sum of two intervals.

width([a, b] + [c, d]) = width([a + c, b + d])
= ((b + d) - (a + c)) / 2

Finally, we can derive a formula for the sum of the widths of two intervals.

width([a, b]) + width([c, d]) = (b - a) / 2 + (d - c) / 2
= ((b - a) + (d - c)) / 2
= ((b + d) - (a + c)) / 2

This is the same formula we derived above. This proves that if we add any two intervals with widths x and y, the width of the resulting interval will always be z, no matter what the bounds of the intervals are.

To prove that this is not the case for interval multiplication we can use a simple counterexample.
> (define a (make-interval 2 4))
> (define b (make-interval 5 10))
> (define c (make-interval 10 15))
> (mul-interval a b)
(10 . 40)
> (mul-interval a c)
(20 . 60)

The intervals b and c have the same width, but when we multiply each of them by interval a, the resulting intervals have different widths. This means that the width of the product of two intervals cannot be a function of only the widths of the operands.


Exercise 2.10 points out that it is not clear what it means to divide by an interval that spans zero. We need to modify the provided div-interval routine to check for this condition and signal an error if it occurs.

Before we make the required modifications, let's take a closer look at the div-interval procedure to see what the problem is.
(define (div-interval x y)
(mul-interval x
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y)))))

If the interval y spans zero, then there's a small problem with this portion of the code:
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y))

If y is the interval [-10, 10], for example, then the snippet above would produce the interval [0.1, -0.1], which has a lower bound that is higher than its upper bound.

To fix the problem we can just do as the exercise suggests and signal an error if a zero-spanning interval is found in the divisor.
; Exercise 2.10
(define (spans-zero? y)
(and (<= (lower-bound y) 0)
(>= (upper-bound y) 0)))

(define (div-interval x y)
(if (spans-zero? y)
(error "Error: The denominator should not span 0.")
(mul-interval x
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y))))))

Run an example in your interpreter to verify.
> (define a (make-interval 2 5))
> (define b (make-interval -2 2))
> (div-interval a b)
. . Error: The denominator should not span 0.


Exercise 2.11 suggests that by testing the signs of the endpoints of the intervals, we can break mul-interval into nine cases, only one of which requires more than two multiplications. We are to rewrite mul-interval using this suggestion.

The suggestion is based on the result of multiplication of two numbers with the same or opposite signs. For each interval there are three possibilities, both signs are positive, both are negative, or the signs are opposite. (Note that an interval with the signs [+, -] is not allowed, since the lower bound would be higher than the upper bound.) Since there are two intervals to check, that makes nine possibilities. All nine possibilities are listed below.

[+, +] * [+, +]
[+, +] * [-, +]
[+, +] * [-, -]

[-, +] * [+, +]
[-, +] * [-, +]
[-, +] * [-, -]

[-, -] * [+, +]
[-, -] * [-, +]
[-, -] * [-, -]

For most of the combinations above, we can see directly which pairs need to be multiplied to form the resulting interval. For example, if all values are positive, then multiplying the two upper bounds and two lower bounds are the only two products we need to find. The only case where we need to do more than two multiplications is in the case [-, +] * [-, +], since the product of the two lower bounds could possibly be larger than the product of the two upper bounds.

Note that the following code is much less readable and would be harder to maintain than the original procedure. In addition to that, without benchmarking both procedures we aren't even sure which one is faster. The trade-off in maintainability certainly doesn't seem to be worth any potential savings you might get from eliminating two multiplications from most cases. The developer who suggested this enhancement should probably have their commit access revoked (and be shot out of a cannon).
; Exercise 2.11
(define (mul-interval x y)
(let ((xlo (lower-bound x))
(xhi (upper-bound x))
(ylo (lower-bound y))
(yhi (upper-bound y)))
(cond ((and (>= xlo 0)
(>= xhi 0)
(>= ylo 0)
(>= yhi 0))
; [+, +] * [+, +]
(make-interval (* xlo ylo) (* xhi yhi)))
((and (>= xlo 0)
(>= xhi 0)
(<= ylo 0)
(>= yhi 0))
; [+, +] * [-, +]
(make-interval (* xhi ylo) (* xhi yhi)))
((and (>= xlo 0)
(>= xhi 0)
(<= ylo 0)
(<= yhi 0))
; [+, +] * [-, -]
(make-interval (* xhi ylo) (* xlo yhi)))
((and (<= xlo 0)
(>= xhi 0)
(>= ylo 0)
(>= yhi 0))
; [-, +] * [+, +]
(make-interval (* xlo yhi) (* xhi yhi)))
((and (<= xlo 0)
(>= xhi 0)
(<= ylo 0)
(>= yhi 0))
; [-, +] * [-, +]
(make-interval (min (* xhi ylo) (* xlo yhi))
(max (* xlo ylo) (* xhi yhi))))
((and (<= xlo 0)
(>= xhi 0)
(<= ylo 0)
(<= yhi 0))
; [-, +] * [-, -]
(make-interval (* xhi ylo) (* xlo ylo)))
((and (<= xlo 0)
(<= xhi 0)
(>= ylo 0)
(>= yhi 0))
; [-, -] * [+, +]
(make-interval (* xlo yhi) (* xhi ylo)))
((and (<= xlo 0)
(<= xhi 0)
(<= ylo 0)
(>= yhi 0))
; [-, -] * [-, +]
(make-interval (* xlo yhi) (* xlo ylo)))
((and (<= xlo 0)
(<= xhi 0)
(<= ylo 0)
(<= yhi 0))
; [-, -] * [-, -]
(make-interval (* xhi yhi) (* xlo ylo))))))

We'll need several test cases to make sure we get good coverage of all possible conditions.
> (define a (make-interval 2 4))
> (define b (make-interval -2 4))
> (define c (make-interval -4 -2))
> (mul-interval a a)
(4 . 16)
> (mul-interval a b)
(-8 . 16)
> (mul-interval a c)
(-16 . -4)
> (mul-interval b a)
(-8 . 16)
> (mul-interval b b)
(-8 . 16)
> (mul-interval b c)
(-16 . 8)
> (mul-interval c a)
(-16 . -4)
> (mul-interval c b)
(-16 . 8)
> (mul-interval c c)
(4 . 16)


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

7 comments:

Alex Marandon said...

I got stuck on 2.11, but seeing what a solution looks like, it might not be be such a bad thing ;-)

James W Dunne said...

I also got stuck on 2.11, expecting the resulting procedure to be an improvement. Perhaps they are showing the dangers of micro-optimisations?

Alejandro Saucedo said...

Hahah love ur blogs bro! Thanks for posting so many useful things!

Anonymous said...

Defining upper-bound and lower-bound, I think you should find the minimum and maximum respectively.

(define (upper-bound interval)
(let ((x (car interval))
(y (cdr interval)))
(max x y)))

(define (lower-bound interval)
(let ((x (car interval))
(y (cdr interval)))
(min x y)))

Zak Bainazarov said...

Definitely agree with your comment on 2.11.

Just a note: since we assume that low < high you can change a couple of if statements to just be two checks instead of 4.

As in,

(and (>= xlo 0)
(>= xhi 0)
(>= ylo 0)
(>= yhi 0))

could be
(and (>= xlo 0)
(>= ylo 0))

Though, it's not like that change will magically make the code less ugly.

Anonymous said...

I think sub-interval will be the same as add interval. Think of it this way - While one of the intervals errs to be greater than it's mean value, the other errs to be less than it. So the total error while subtraction would be the addition of the two quantities, and that would give you the maximum answer.

MacLane Wilkison said...

I think you could do something much more maintainable like this:

(defn mul-interval [x y]
(let [p1 (* (lower-bound x)
(lower-bound y))
p2 (* (upper-bound x)
(upper-bound y))]
(if (and (span-zero? x)
(span-zero? y))
(let [p3 (* (lower-bound x)
(upper-bound y))
p4 (* (upper-bound x)
(lower-bound y))]
(make-interval
(min p1 p2 p3 p4)
(max p1 p2 p3 p4)))
(make-interval p1 p2))))