Sunday, July 24, 2011

SICP Lecture 3A: Henderson Escher Example

Structure and Interpretation of Computer Programs
Henderson Escher Example
Covers Text Section 2.2.2

You can download the video lecture or stream it on MIT's OpenCourseWare site. It's also available on YouTube.

This lecture is presented by Professor Harold Abelson.

This lecture starts out with a review of the material covered in the previous lecture on compound data. The main points covered are pairs and the closure property (the things we make by forming pairs can also be combined to form pairs).

We're then shown how cons can be used to create lists of primitive elements in Scheme.
(cons 1
(cons 2
(cons 3
(cons 4 nil))))

is the same as
(list 1 2 3 4)

A common pattern in programming is to write procedures that takes a list, do something to every element of that list, and return a list containing the results. We saw an example, scale-list, that takes a list and a factor and returns the result of each element in the list multiplied by the factor.
(define (scale-list items factor)
(if (null? items)
(cons (* (car items) factor)
(scale-list (cdr items) factor))))

In Scheme terms, the general pattern here is to recursively do something to the cdr of the list and cons that on to actually doing something to the first element of the list.
The very fact that there's a general pattern there means I shouldn't be writing this procedure at all.

Instead of embedding a general pattern in each procedure that uses it, we should write a procedure that's the general pattern itself and write our procedure in terms of that higher-order procedure. For this example, that higher-order procedure is map. (See SICP 2.21 - 2.23: Mapping over lists for examples.)
(define (map proc items)
(if (null? items)
(cons (proc (car items))
(map proc (cdr items)))))

The important idea here is that we stop thinking about control structures and start thinking about applying operations to aggregates.

Henderson Escher Example

The remainder of the lecture talks about one particular example, the Henderson Escher Example, that summarizes everything that's been covered in the course so far, list structure, abstraction, representation, and capturing commonality with higher-order procedures. It also goes into the third major theme of the course, meta-linguistic abstraction.

Meta-linguistic abstraction:
  • Primitives
  • Means of Combination
  • Means of Abstraction

The example is a language developed by Peter Henderson for describing images that are composed of repeated elements that are shifted and scaled, like the M. C. Escher woodcut "Square Limit" pictured below.

The other theme illustrated by this example is the issue that there is no real difference between procedures and data.

The language operates on primitives called pictures. A picture is a procedure that draws a figure scaled to fit a rectangle that you specify. The following operations are supported in the language.

  • rotate - Rotate a picture 90 degrees.
  • flip - Flip a picture horizonally or vertically.
  • beside - Draw two pictures beside one another.
  • above - Draw two pictures, one above the other.

The closure property allows you to build complex pictures with just these few operations and one primitive (the picture itself). When you combine two pictures with beside or above, the result is itself a picture, which can be combined with other pictures.

We're going to see a lot of details concerning the implementation of the picture language illustrated in this example as we go through the exercises in the next section of the text, so I won't reproduce those details here. The gist is that we define one set of procedures to draw pictures inside of rectangles, another set of procedures that do the geometric manipulations of those pictures listed above (rotate, flip, beside, and above), then a higher-order set of procedures that combine those manipulated pictures in different ways to make an image like the Escher drawing above..

Software Design Methodologies

The last part of the lecture talks about the common software design methodology of decomposing a problem into discrete components, which are themselves decomposed into discrete components, and so on until you reach a design that looks like the tree structure below.

Each component in this kind of design is created to do one specific task, and the components only communicate with direct parent or child components in the hierarchy. Contrast this with the way the picture language was developed in the Henderson Escher example. First, a language for drawing primitive pictures was presented, then on top of that we build a language for geometric rotations, and finally, on top of that we built a language for schemes of combination.

This scheme gives you a full range of linguistic power at each level.

Lisp is a lousy language for doing any particular problem. What it's good for is figuring out the right language that you want and embedding that in Lisp.

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Saturday, July 9, 2011

Letter Frequencies and Keyboard Layouts

I saw Mike Knuepfel's Keyboard Frequency Sculpture a few weeks ago and thought it was an excellent idea for visualizing letter frequency data.

The height of each key corresponds to how frequently that letter is used in samples of English writing.

I wanted to see how the layout of the most frequently used letters on a QWERTY keyboard compared to those on a Dvorak keyboard. Unfortunately, I'm no sculptor, and I couldn't see an easy way to slice and edit the original image to rearrange the keys, so I decided to use a color mapping instead.

The images below were created using the same letter frequency chart as the original sculpture. Pure blue is for the least frequently used letters, while pure red is most frequent.



Note that the majority of the most frequently used keys are on the home row in the Dvorak layout, but they're scattered all around on a QWERTY keyboard. The QWERTY layout originated in the very early days of mechanical typewriters, so the keys were arranged such that common two-letter combinations were placed on opposite sides of the keyboard so that the mechanical parts would not jam. The Dvorak layout was designed to reduce finger motion in order to increase typing rate and decrease errors. Despite these advantages, the Dvorak layout has still failed to catch on.