VISION AS INFORMATION PROCESSING: David Marr

"Vision is a process that produces from images of the external world a description that is useful to the viewer and not cluttered with irrelevant information." (Marr and Nishihara)

Vision, the combination of looking and seeing, is more complicated than one might think. The way the eye focuses and collects light is well understood, but what the brain does next is more mysterious. For example, how do we recognize a chair by sight?

It is not as if we were comparing two photographs, one of the chair and one from the brain. When we 'see' we only focus on a small area, about one degree of our field of vision! Test this. Below is a word you have probably never seen before, because I invented it. This way you can't cheat and REMEMBER what the letters of the word are. Now, concentrate hard on the 'y' near the middle of the word, and do not let your eyes wander! Next, try to use your peripheral vision to read the fifth letter to the right of 'y'.

uenbcysciueav

This may be difficult or impossible.

If we see only one degree of detail, then we are not dealing with an imaging process, in the photographic sense. How is it that we don't need to scan a chair's every detail to know what we are looking at? (Another issue is: how do we recognize a chair when chairs come in so many forms?) Also, every time you see a chair, it is from a new perspective. Therefore, comparing these two 'pictures' pixel for pixel would not work. Do we simply sense a kind of chair gestalt?
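
To make the point about pixel-for-pixel comparison concrete, here is a minimal sketch in Python. The tiny 'chair' silhouette and the function names pixel_match and shift_right are invented for illustration and come from no vision library; the sketch only shows how little it takes to break a literal comparison.

```python
def pixel_match(a, b):
    """Fraction of pixel positions at which two equal-sized images agree."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def shift_right(img, n):
    """Shift every row n pixels to the right, padding with background (0)."""
    return [[0] * n + row[:len(row) - n] for row in img]


# A toy 6x5 'chair' seen from the side: 1 = chair, 0 = background.
# (Hypothetical data, made up for this example.)
chair = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
]

# The same chair, merely seen two pixels further to the right.
moved = shift_right(chair, 2)

print(pixel_match(chair, chair))  # identical pictures agree everywhere
print(pixel_match(chair, moved))  # the shifted picture agrees far less
```

Nothing about the chair has changed between the two pictures except where it sits, yet the literal comparison already reports a poor match. A real change of perspective would alter the picture far more than a small shift does.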

Whatever the process of perception is, it is fallible. Patients with right-brain lesions will not recognize a bucket, for example, from an unorthodox perspective, such as from above. In fact, they will deny that the object is a bucket at all! (Patients with left-brain lesions can identify the shape of the bucket from unorthodox angles but cannot conjure up the name of the object or its purpose.)

Quantified Vision

When Marr describes the work of Werner Reichardt and Tomaso Poggio on the visual flight control system of the housefly, one gets the impression that he is describing not a living being but a highly automated robot.

"If the visual field 'explodes' fast enough (because a surface looms nearby), the fly automatically 'lands' toward its center. If this center is above the fly, the fly automatically inverts to land upside down. When the feet touch, power to the wings is cut off. Conversely, to take off, the fly jumps; when the feet no longer touch the ground, power is restored to the wings, and the insect flies again." (Marr 32-33)

According to Marr, the landing mechanism and several other automated flight control systems make up 60% of fly vision. He writes that:

"it is extremely unlikely that the fly has any explicit representation of the visual world around him--no conception of surface, for example. . . . It is clear that human vision is much more complex than this, although it may well incorporate subsystems not unlike the fly's to help with specific and rather low-level tasks like the control of pursuit eye movements." (Marr 34)

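The landing and takeoff reflexes Marr describes can be written down as a handful of rules, which is part of why the fly sounds like a robot. The sketch below is my own reading of the quoted passage, not Reichardt and Poggio's actual model. The names FlyState, step and LOOM_THRESHOLD, the threshold value, and the choice to reset the fly's orientation on takeoff are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical cutoff for "the visual field explodes fast enough".
LOOM_THRESHOLD = 1.0


@dataclass
class FlyState:
    wings_powered: bool = True   # in flight by default
    landing: bool = False        # currently steering toward a surface
    inverted: bool = False       # flipped to land upside down


def step(state, expansion_rate, center_above, feet_touching):
    """Apply the reflex rules once and return the new state."""
    s = replace(state)  # copy, so the old state is left untouched

    # Rule 1: a fast enough 'explosion' of the visual field triggers landing
    # toward its center; if that center is above, the fly inverts.
    if s.wings_powered and expansion_rate >= LOOM_THRESHOLD:
        s.landing = True
        s.inverted = center_above

    if feet_touching:
        # Rule 2: when the feet touch, power to the wings is cut off.
        s.wings_powered = False
        s.landing = False
    else:
        # Rule 3: when the feet no longer touch (the fly has jumped),
        # power is restored to the wings.
        if not state.wings_powered:
            s.inverted = False  # assumption: the fly rights itself in flight
        s.wings_powered = True
    return s


fly = FlyState()
fly = step(fly, expansion_rate=2.0, center_above=True, feet_touching=False)
print(fly)  # a ceiling looms: the fly starts landing and inverts
fly = step(fly, expansion_rate=0.0, center_above=True, feet_touching=True)
print(fly)  # feet touch: wings switch off
fly = step(fly, expansion_rate=0.0, center_above=False, feet_touching=False)
print(fly)  # the fly jumps: wings switch back on
```

Notice that nothing in these rules stores a description of the ceiling or of any other surface. The state holds only what the fly is doing, which fits Marr's remark that the fly probably has no explicit representation of the world around it.
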
Works Cited

"Vision: A Computational Investigation into the Human Representation and Processing of Visual Information," by David Marr, W.H. Freeman and Company, NY 29-61 (1982)


Dylan Cooke '99.

May 25, 1996.