February 1967
(First Draft)
Contrasting Assumptions of (A) the Classical Theory of Vision
and (B) a New Theory of Vision
J. J. Gibson
The World Wide Web distribution of James Gibson’s “Purple Perils” is for scholarly use with the understanding that Gibson did not intend them for publication. References to these essays must cite them explicitly as unpublished manuscripts. Copies may be circulated if this statement is included on each copy.
Note. This is a sort of skeleton for the rewriting
of The Perception of the Visual World, now 17 years
old. Criticisms are invited. The postulates here
are not in final form.
I, A. Vision is one of the channels of sense, consisting anatomically of the eye, its retina, the optic nerve, and the brain.
B. Vision is one of the perceptual systems, consisting (in vertebrate animals) of a bilateral pair of mobile eyes in a head on a body, the eyes being oriented to the environment and being adapted to pick up information from the environment. It shares organs with other perceptual systems (e.g., the vestibular apparatus and its cooperating reflex adjustments).
II, A. Light and color sensations are obtained when electromagnetic energy enters the eye (radiant light). How things are seen is a problem.
B. Things are seen by means of the optic array entering an eye at a given eye posture (a sample of the ambient light).
III, A. By physical optics, each point of an illuminated object radiates wave-fronts (rays) into the medium by virtue of the disturbances of the particles of its surface.
B. By ecological optics, each face or patch of an illuminated object (unless it is a mirror) absorbs part and scatter-reflects part of the light falling on it. The fraction absorbed depends on the pigment of the patch (its reflectance) and the intensity reflected in different directions depends on the layout of the face (elaboration elsewhere).
IV, A. The wave-fronts (rays) from each particle or point of an object have a specific wavelength (or a specific spectrum of wavelengths?) and a specific initial amplitude, that correspond to the particle or point (or to the disturbance in the atom?). (The assumptions of physical and of geometrical optics are not the same here. The ancient puzzle of the “colorless atom” comes into it.)
B. The edges of the faces of an object and the borders of its pigment-patches are projected into a light-filled medium (one with multiple scatter-reflection from many surfaces and therefore multiple station points for any eye) as discontinuities of intensity and spectral composition (elaboration elsewhere).
V, A. The image of an object, formed by a lens that focuses each pencil of rays at the entrance pupil from each point of the object, consists of energy-points that correspond to the radiating object-points in wavelength and amplitude.
B. The figure of an object in the ambient array at any station point in the medium consists of invariant relations, among which are the discontinuities of intensity and spectral composition in the array (transitions, contrasts) which, in turn, are the perspectives of the edges and pigment borders of the object. The invariant relations in this array constitute information about the layout and composition of the surfaces of the object.
VI, A. A retina is a mosaic of photosensitive cells (rods and cones).
B. A retina is a part of a mobile system for registering discontinuities in the ambient array.
VII, A. The retinal image produces a physiological image in the mosaic of photoreceptors, converting wavelength and amplitude into corresponding neural signals.
B. The overlapping interspersed neural units of the system respond in patterns that are transposable over the anatomical mosaic of cells.
VIII, A. The optic nerve and optic tract transmit the neural signals to the visual projection center of the brain, thus producing another physiological image on the cortex.
B. The visual system responds so as to preserve the invariant relations in the light.
IX, A. Sensory impressions of brightness, color, and location (and perhaps of extensity) that correspond to the neural signals arise in the sensory area of the brain.
B. Conscious awareness of the object and its distinctive features (perception) arises whenever the higher and lower centers of the visual system are not interrupted.
X, A. The brain (or the mind) performs operations on these data of sense, the kind of operation assumed depending on the theory of perception adopted. A minimal list of such operations follows.
1. The gap in the sensory image caused by the retinal blind spot is “filled in”.
2. The sensory image from the right eye is “fused” with that from the left eye to produce single vision, but when the two images do not exactly coincide the disparity is experienced as double imagery.
3. The half of each sensory image in one hemisphere of the brain is combined with its other half in the other hemisphere.
4. The displacement of the sensory image in the brain caused by an eye movement is canceled so that the phenomenal object does not move (or perhaps the local signs of each sensation are shifted by the amount of the eye movement).
5. The series of temporary sensory images caused when the animal turns around is converted by memory into a simultaneous composite image of the surroundings.
6. The expansion of the sensory image caused when the animal approaches an object (or vice versa) is compensated so that the object does not seem to get larger.
7. The rotation of the sensory image caused when the animal inclines its head to right or left is compensated so that the object does not seem to tilt.
8. The sensations of brightness and color composing the sensory image are transformed so as to correspond not to the amplitude and wavelength of the energy-points in the retinal image but approximately to the reflectance and pigmentation of the surfaces of the object.
9. The third dimension of depth (or distance) is added to the sensory image, perhaps by memories of touch becoming associated with sensations of vision (Berkeley), or by unconscious interpretation of visual cues (Helmholtz) or by spontaneous self-distribution of brain processes (Koffka) or by innate categories of mind (Kant).
10. In particular, the size of the sensory image is compensated so as not to correspond to the angular size of the retinal image but approximately to the physical size of the object.
11. The form of the sensory image is transformed so as to correspond to the appearance of the object viewed from in front instead of the foreshortened appearance when viewed at a slant.
12. The sensory image of the front side of an object is supplemented by memory images of its back side, and the perception of the hidden background of an object is provided either by memory images or by the spontaneous segregation of ground and figure in the brain. The impermanent images are thus converted into a conception of permanent objects in a fixed layout.
B. The brain is the highest controlling center of the nervous system but it is not the organ of the mind nor the theater of consciousness. It does not perform operations on the data of sense, or view sensory images, or store and retrieve memory images, although it controls the visual system.
1. The gap or hole in the sensory image is an artifact of monocular eye-fixation. The optic array is perfectly continuous.
2. Single vision from two eyes is a false problem. If two images in the brain do not exist, they do not have to be combined.
3. If half-images in the two hemispheres do not exist, they do not have to be combined.
4. If the physiological image in the striate cortex is a myth, it is not something whose displacement with eye movement has to be canceled.
5. If the visual system samples the ambient light, the sequence does not have to be converted into a single pictorial scene. Awareness of the surroundings is not pictorial to begin with.
6. The incidental impression of an expanding scene during approach to an object is one symptom of locomotion (visual proprioception); the expansion of the figure of an object with increased covering of the background and without expansion of the array is information for an approaching object.
7. The incidental (usually unnoticed) rotation of the scene when one inclines his head is also visual proprioception; the rotation of a figure relative to the array, however, is information for a tilting object.
8. The information for the reflectance and pigmentation of any part-surface in the world is given by the discontinuities (transitions, contrasts) in the extended optic array from a layout of surfaces. Sensations of brightness and hue corresponding to amplitude and wavelength of radiation are either artifacts of “reduction-viewing” or are perception of the luminosity of radiating sources of light as such.
9. The third dimension of empty space as such is not specified for vision. The layout of surfaces in the environment, however (the dihedral angles, convexities, concavities, and occluding edges of one thing in front of another), is specified by the structure of contrasts in the ambient array that is invariant with change of position or change of illumination.
10. The size of an object is usually specified along with its distance in a natural optic array. The size is given by the amount of ground-texture its figure intercepts, and the distance along the ground by the place (in the here-to-horizon gradients) where the figure intercepts the ground. But there is no reason to suppose that a sensory image of size is enlarged in proportion with sensory impressions of a distance.
11. The form of the edges of a surface is specified in a natural optic array along with its slant relative to adjoining surfaces (dihedral angles and edges) by relative gradients. The rigid form is given by the invariants under perspective transformations. The perspective (pictorial) form is an incidental sensation that is seldom noticed in the detecting of slant-layout. There is no need to suppose that a pictorial impression of foreshortened shape is converted into another pictorial impression of frontal shape by a reciprocal impression of slant.
12. Animals and children are not aware of the pictorial impression of the front side of each object in the world. They do not notice at first its perspective size or form, nor do they notice the invisibility of its back side, nor do they pay attention to the invisibility of an occluded surface behind it. They are interested in the invariant distinguishing features of objects instead. They are also concerned with permanent layout. The optical covering and uncovering of background by edges (and the simultaneous differential amounts of covering and uncovering of background in the two eyes) provides information for the perception of hidden surfaces. There is therefore no need to suppose that sensory images have to be supplemented by memory images in order to explain the phenomenal permanence of the environment.
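Illustrative note on X, B, 10. The claim that size is given by the amount of ground-texture a figure intercepts can be checked with a small numerical sketch. The sketch is not part of the manuscript and rests on assumptions of its own: a simple pinhole projection, a ground texture made of elements of uniform width, and an object standing on the ground so that its base lies at the same distance as the texture it covers. The function names and the sample values are hypothetical.

```python
# Sketch only: pinhole projection and uniform ground texture are assumptions
# made for illustration; they are not stated in the manuscript.

def projected_width(physical_width, distance, focal_length=1.0):
    """Projected width of a frontal extent, by similar triangles."""
    return focal_length * physical_width / distance


def texture_units_intercepted(object_width, texture_width, distance):
    """Number of ground-texture elements covered by the base of the object.

    Object base and covered texture lie at the same distance, so both
    projected widths shrink by the same factor as distance grows.
    """
    obj = projected_width(object_width, distance)
    tex = projected_width(texture_width, distance)
    return obj / tex


# Hypothetical values: an object 1.5 units wide on texture elements 0.25 wide.
for d in (2.0, 5.0, 20.0):
    print(d, projected_width(1.5, d), texture_units_intercepted(1.5, 0.25, d))
```

Under these assumptions the projected (angular) width of the object falls as distance grows, while the count of texture elements it intercepts stays at the ratio of object width to element width. This is one way to read the statement that size is specified in the array itself and need not be computed from a sensory image of size together with an impression of distance.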