Computational mechanisms underlying the perception of shape-from-texture

Ko Sakai, University of Pennsylvania

Abstract

Humans have a remarkable ability to determine three-dimensional shape and depth from orderly changes in surface texture. When a surface is slanted, the texture on the surface is compressed in the direction of slant, and in the frequency domain, the frequency spectra are expanded in the corresponding direction. It is computationally efficient to determine shape from texture in the frequency domain: analysis of the local frequency spectrum does not require the identification of individual texture elements, and the transformations of the frequency spectra can be tracked by a small number of parameters, such as the second-order moments. I sought to investigate how the human visual system determines three-dimensional shape from changes in spatial frequency. I carried out a set of psychophysical experiments to determine how the visual system represents or characterizes the local spatial frequency spectrum. The stimuli were constructed by filtering a white noise pattern in the Fourier domain, a technique that generates families of stimuli whose frequency spectra are specifically controlled. These experiments, together with computational analysis, suggest that the visual system characterizes the frequency spectrum by the average peak frequency, which is computed by averaging local peak frequencies over a spatial neighborhood. In order to investigate how the visual system might determine three-dimensional shape and depth from the average peak frequency, I have developed a network-based model of shape-from-texture. Early stages of this model incorporate several properties of striate cortex, particularly those of complex cells. The model consists of four major stages: (1) spatial frequency extraction by an early vision model, (2) characterization of the spatial frequency by the average peak frequency, (3) determination of slant and tilt by normalization and lateral inhibition, and (4) determination of depth. The model is capable of determining three-dimensional depth from both orthographic and perspective projections. The responses of the network show good agreement with human perception of three-dimensional shape and depth for a variety of real and artificial images.
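
The two frequency-domain ideas in the abstract, stimuli made by filtering white noise in the Fourier domain and spectra that expand along the direction of slant, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the function names, the Gaussian ring filter and its parameters, and the use of column decimation as a crude stand-in for the foreshortening of a surface slanted by 60 degrees under orthographic projection (compression by cos 60° = 0.5). It tracks the spectrum with power-weighted second-order moments, not with the average peak frequency or the network model described above.

```python
import numpy as np


def bandpass_noise(size=128, center=16.0, bandwidth=2.0, seed=0):
    """Filter white noise with an isotropic Gaussian ring in the Fourier domain.

    `center` and `bandwidth` are in cycles per image (hypothetical parameters).
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((size, size))
    f = np.fft.fftfreq(size) * size              # frequencies in cycles per image
    radius = np.hypot(f[:, None], f[None, :])    # radial frequency of each bin
    gain = np.exp(-0.5 * ((radius - center) / bandwidth) ** 2)
    # The gain is real and symmetric, so the filtered image is real
    # up to rounding error.
    return np.real(np.fft.ifft2(np.fft.fft2(noise) * gain))


def spectral_second_moments(image):
    """Power-weighted second-order moments (m_xx, m_yy) in (cycles/pixel)^2."""
    rows, cols = image.shape
    power = np.abs(np.fft.fft2(image)) ** 2
    fy = np.fft.fftfreq(rows)[:, None]           # cycles per pixel, vertical
    fx = np.fft.fftfreq(cols)[None, :]           # cycles per pixel, horizontal
    total = power.sum()
    m_xx = float((power * fx ** 2).sum() / total)
    m_yy = float((power * fy ** 2).sum() / total)
    return m_xx, m_yy


texture = bandpass_noise()
# Keep every second column: the texture is compressed by 0.5 along x.
compressed = texture[:, ::2]

m_xx, m_yy = spectral_second_moments(texture)
c_xx, c_yy = spectral_second_moments(compressed)

# Spectral expansion along x; expected to be close to 1 / 0.5 = 2.
expansion_x = (c_xx / m_xx) ** 0.5
# The spectrum along y should be essentially unchanged.
expansion_y = (c_yy / m_yy) ** 0.5
print(expansion_x, expansion_y)
```

In this sketch the ratio of second-order moments recovers the compression factor without locating any individual texture element, which is the efficiency argument the abstract makes for working in the frequency domain.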

Subject Area

Biomedical research|Neurology|Psychology|Experiments

Recommended Citation

Sakai, Ko, "Computational mechanisms underlying the perception of shape-from-texture" (1995). Dissertations available from ProQuest. AAI9532268.
https://repository.upenn.edu/dissertations/AAI9532268
