Depth Perception

Learning Outcomes

1. How does the visual system obtain depth information?

2. How is retinal disparity related to depth?

3. What is the correspondence problem?

4. What natural constraints for depth are exploited by Marr & Poggio’s program?

5. Describe Marr’s 2½-D sketch.

6. How are visual illusions different from adaptation effects?

7. What can illusions tell us about functional architecture?

8. What are some natural constraints that lead to the perception of some illusions?

Depth Perception

The full primal sketch is a representation of what’s on the ______ (2-D), not of the 3-D world.

How is depth information recovered?

_____________ depth cues (distances less than about 3 m):

• _____________: crystalline lens gets fatter to focus on nearby objects; gets thinner to focus on faraway objects

- feedback from ciliary muscles provides information on lens curvature, and therefore distance

• ___________: rotation of eyes inward to cause image to fall on the fovea

- larger convergence angle: closer object; smaller: farther

[Figure: convergence]
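The link between convergence angle and distance follows from simple trigonometry. The sketch below is an illustration only: it assumes an object straight ahead of the viewer and an interocular distance of 63 mm, and the function names and that default value are assumptions, not part of the notes.

```python
import math

def convergence_angle(distance_m, ipd_m=0.063):
    """Angle (degrees) between the two lines of sight when fixating a point
    straight ahead at distance_m. ipd_m is an assumed interocular distance."""
    return math.degrees(2 * math.atan((ipd_m / 2) / distance_m))

def distance_from_convergence(angle_deg, ipd_m=0.063):
    """Inverse relation: distance (metres) implied by a convergence angle."""
    return (ipd_m / 2) / math.tan(math.radians(angle_deg) / 2)

near = convergence_angle(0.5)   # object at 0.5 m
far = convergence_angle(3.0)    # object at 3 m
print(near > far)               # larger angle for the closer object
```

The angle shrinks quickly with distance, which is consistent with convergence being useful only at close range.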

_________/_______ disparity: retinal images of an object fall on disparate points on each eye’s retina

- fixated object produces no disparity

- ________: imaginary surface passing through fixation point; retinal images of objects not on horopter fall on disparate points on each eye’s retina

object on horopter → corresponding points

not on horopter → noncorresponding points

- distance between points on each retina is called the ______ of disparity

- the farther from the horopter, the greater the disparity

- _______ ________ Area (PFA): points in space whose images fall on noncorresponding retinal areas, but which lie within the PFA, are fused into a single image (Panum, 1858)

[Figure: Panum’s Fusional Area]
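The claim that disparity grows with distance from the horopter can be checked numerically. This is a minimal sketch under stated assumptions: objects lie straight ahead on the midline, the interocular distance is 63 mm, and disparity is taken as the difference between the convergence angle needed for the object and the one needed for the fixation point (positive for nearer objects; that sign convention is an assumption).

```python
import math

def convergence_angle(distance_m, ipd_m=0.063):
    """Convergence angle (degrees) for a point straight ahead at distance_m."""
    return math.degrees(2 * math.atan((ipd_m / 2) / distance_m))

def disparity_deg(object_m, fixation_m, ipd_m=0.063):
    """Disparity (degrees) of a midline object while fixating at fixation_m.
    Positive values mean the object is nearer than fixation (assumed convention)."""
    return convergence_angle(object_m, ipd_m) - convergence_angle(fixation_m, ipd_m)

fixation = 1.0
for obj in (1.0, 0.9, 0.8, 1.2):
    print(obj, disparity_deg(obj, fixation))
```

The fixated distance gives zero disparity, and the magnitude increases as the object moves away from the fixation distance in either direction.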

Evidence:

Hubel & Wiesel (1970):

- found cells in macaque visual cortex (area 18) that responded most to certain differences in degree of _________ (excited by objects not on the horopter)

- cells had no response when:

• an object was presented to one eye alone, or

• objects were presented with 0° _________

Poggio & Poggio (1984):

- ________ different disparity-sensitive binocular neurons found

- connections were excitatory or inhibitory

- brain must be determining disparity for some reason

__________: perception of depth based on retinal disparity alone

Hermann von Helmholtz (1909):

- believed that monocular form perception precedes stereopsis

- no evidence for this

Béla Julesz (1964):

- developed (computer-generated) random-dot ___________

[Figure: random-dot stereograms]

- these eliminate monocular depth cues

- each eye sees a patch of random dots, which contain no apparent global shape

- in each patch, there is a square-shaped region that is shifted horizontally in one eye’s image relative to the other

- this creates retinal disparity: the shifted region is perceived as a square floating in depth, defined only by the difference between the two images

- demonstrated that stereopsis is not dependent on detection of form occurring first
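The construction of a random-dot stereogram can be sketched in a few lines. This is an illustrative reconstruction, not Julesz’s original program; the function name, image size, square size, and shift are all assumptions.

```python
import random

def make_stereogram(size=20, square=8, shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots.
    The central square of the left image is shifted leftward by `shift`
    columns in the right image. Requires shift <= (size - square) // 2."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]          # start as an identical copy
    start = (size - square) // 2
    stop = start + square
    for r in range(start, stop):
        # Copy the square region into its shifted position.
        for c in range(start, stop):
            right[r][c - shift] = left[r][c]
        # Fill the strip uncovered by the shift with fresh random dots.
        for c in range(stop - shift, stop):
            right[r][c] = rng.randint(0, 1)
    return left, right

left, right = make_stereogram()
for l_row, r_row in zip(left, right):
    print("".join(".#"[v] for v in l_row), "  ", "".join(".#"[v] for v in r_row))
```

Each image on its own is just noise; the square exists only in the relation between the two images. With the right image shown to the right eye, a leftward shift like this one corresponds to crossed disparity, so the square should appear in front of the background.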

The Correspondence Problem

• how is it determined which elements in the two images match (and fuse into a 3-D percept)?

• _______-based methods make a match based on extracted image structure

• ___________-based methods are based on grey-level descriptions; look for statistically likely match

Julesz (1971):

1) point-by-point _________ comparison between images

- black pixel in left image vs. white pixel in right image → no match (i.e., no correspondence)

- black pixel in left image vs. black pixel in right image → potential match

2) ______-_______ ________ ______ model makes global match:

- to induce fusion, both images must lie on the horopter

- only neurons responding to 0° disparity are involved

- after images have fused, you can move them off the horopter without losing fusion within the PFA (hysteresis effect)

evidence: people do retain stereo fusion of images moved off the horopter within PFA

but: the model only works with random-dot stereograms; it fails with natural images

Potential solution:

1. particular ________ on a surface in a scene is selected from one image

2. ____ location must be identified in other image

3. _________ in the two corresponding image points is then determined

Problem:

- the same location cannot be identified in both images beyond a doubt, which creates _________ in correspondence:

[Figure: ambiguity in depth]

- _______ (black squares) are matchable elements

- each of the four points in one eye’s view could match any of the four projections in the other eye’s view

- only 4 of 16 possible matches are correct (filled circles); 12 are “_____ targets” (unfilled circles)

- to resolve ambiguities, the global display must be considered
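The 4-of-16 count can be reproduced by enumerating every left/right pairing. This sketch assumes the elements are identical in appearance and that the correct pairing matches the i-th element in one eye to the i-th in the other; the function name is hypothetical.

```python
from itertools import product

def candidate_matches(n_points):
    """Enumerate all (left_index, right_index) pairings for n identical
    elements, split into correct matches and false targets."""
    pairs = list(product(range(n_points), repeat=2))
    correct = [(l, r) for l, r in pairs if l == r]
    false_targets = [(l, r) for l, r in pairs if l != r]
    return pairs, correct, false_targets

pairs, correct, false_targets = candidate_matches(4)
print(len(pairs), len(correct), len(false_targets))
```

In general n elements give n × n candidate matches, of which only n are correct, so the proportion of false targets rises as the display gets denser.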

Cooperative algorithms:

- disparity-detecting neurons tuned to the same disparity mutually facilitate (excite) each other, but those tuned to different disparities inhibit each other

- mathematically, this yields only one depth solution, so only one global ____________ view is perceived

- obtains stereopsis in the absence of shapes or forms

Marr & Poggio (1976):

- elaborated on the idea of cooperating neurons

- assumptions:

1) both eyes are looking at the same object

2) any point can occupy only one place at any time

3) matter is cohesive; objects tend to be uniform

- natural constraints:

1) _____________ (or similarity): to be matched, points on each retina must be physically similar

e.g., dark features in one correspond to dark features in the other

- people are unable to fuse an image with the same image of reversed contrast (Julesz, 1971)

2) __________ (or opacity): a feature on one retina should correspond uniquely to one feature on the other retina (called “epipolar geometry” in the reading)

- violated in the case of transparent surfaces, when an image feature is a combination of points from two physical surfaces

3) __________: disparity should vary smoothly (if two matched features are close together in the images, then typically their disparities will be similar, because the environment is made of continuous surfaces separated by boundaries)

- abrupt changes in disparity should be rare, occurring only at surface boundaries

- images project onto retina via lines of sight:

[Figure: correspondence problem]

• intersections of lines of sight represent possible target positions in space

• false targets (unfilled circles) are intersections of lines of sight from different targets

• targets (filled circles) are corresponding matchable elements

- two-stage model:

1. all possible matches are established; represented in a neural network of connections (via _____________ constraint)

2. false matches are inactivated by competition from true targets:

• __________ constraint produces mutual inhibition of matches along a line of sight

• __________ constraint produces mutual excitation of matches having same/similar depth

- example network:

[Figure: cooperative network]

• axes represent left and right _______

• compatibility: possible _______ represented by circles

• dashed lines are lines of sight; uniqueness produces mutual __________

• diagonals are “correlators,” corresponding to a particular disparity; continuity produces mutual __________

• targets are black circles, false matches are open circles
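The two-stage scheme can be sketched as a small network over one row of dots. This is a simplified one-dimensional illustration, not Marr & Poggio’s published implementation: the neighbourhood radius, inhibition weight (epsilon), threshold (theta), iteration count, and all function names are assumptions chosen for the example.

```python
import random

def initial_matches(left, right, disparities):
    """Stage 1 (compatibility): node (x, d) is on when the dot at left[x]
    has the same value as the dot at right[x - d]."""
    n = len(left)
    return [[1 if 0 <= x - d < n and left[x] == right[x - d] else 0
             for d in disparities] for x in range(n)]

def cooperative_step(state, start, disparities, radius, epsilon, theta):
    """Stage 2: one update of every node from excitation and inhibition."""
    n, m = len(state), len(disparities)
    new = [[0] * m for _ in range(n)]
    for x in range(n):
        for i, d in enumerate(disparities):
            # Continuity: excitation from neighbours with the same disparity.
            excite = sum(state[x2][i]
                         for x2 in range(max(0, x - radius), min(n, x + radius + 1))
                         if x2 != x)
            inhibit = 0
            for j, d2 in enumerate(disparities):
                if j == i:
                    continue
                # Uniqueness: rivals on the left eye's line of sight (same x).
                inhibit += state[x][j]
                # Uniqueness: rivals on the right eye's line of sight (same x - d).
                x2 = x - d + d2
                if 0 <= x2 < n:
                    inhibit += state[x2][j]
            total = excite - epsilon * inhibit + start[x][i]
            new[x][i] = 1 if total >= theta else 0
    return new

def run_network(left, right, disparities=(0, 1, 2), radius=5,
                epsilon=1.5, theta=5.0, iterations=6):
    start = initial_matches(left, right, disparities)
    state = start
    for _ in range(iterations):
        state = cooperative_step(state, start, disparities, radius, epsilon, theta)
    return start, state

rng = random.Random(1)
left = [rng.randint(0, 1) for _ in range(60)]
right = left[1:] + [rng.randint(0, 1)]   # every dot shifted by one: true disparity is 1
start, final = run_network(left, right)
for name, grid in (("initial", start), ("final", final)):
    print(name, [sum(row[i] for row in grid) for i in range(3)])
```

The printed counts give the number of active nodes per disparity before and after the iterations. Initially every disparity has many active nodes, because random dots match by chance about half the time; the intent of the inhibition and excitation is to leave mainly the nodes at the true disparity active.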

Pros & Cons:

Pro: biologically plausible; solved random-dot stereograms

Con: the resulting neural network model didn’t match known findings (blurring one “eye” of the network shouldn’t destroy stereopsis, but it did; it also produced “binocular _______” between images, not “fusion”)

Marr & Poggio (1979):

- based on matching edges, not points/pixels

- problem: how to obtain stereopsis despite blur?

(________ = removing higher spatial frequencies; lower ones unaffected)

[Figure: teddy image at blur levels 0, 1, 2, and 3]

- solution: use ________ spatial frequency channels:

• get zero-crossings from largest spatial filter

• determine matches

• hold in memory (i.e., 2½-D sketch: a dynamic buffer)

• compare results with successively _______ spatial filters: match pairs of zero-crossings or terminations, for a range of disparities

- result: 2½-D sketch

Marr’s 2½-D Sketch

- not a full 3-D model of the world; mostly represents surface ___________

- based on:

• full primal sketch, which contains contour, texture, shading, and occlusion depth cues

• stereopsis (e.g., retinal disparity)

• shape from ______ (e.g., rotation of object)

- represented by vectors:

• give ___________ of the surface at a given point

• also represent ______ of slant (with respect to a reference plane)

[Figure: 2½-D sketch]
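One way to make the vector representation concrete is to describe a local surface patch by its slant and tilt relative to the line of sight. This is a sketch under assumptions: the viewer looks along the z-axis, the surface normal points toward the viewer (positive z component), and the function name is hypothetical.

```python
import math

def slant_tilt(normal):
    """Return (slant, tilt) in degrees for a surface normal (nx, ny, nz).
    Slant: angle between the normal and the line of sight (z-axis).
    Tilt: direction in the image plane in which the surface slants away."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    slant = math.degrees(math.acos(nz / length))
    # Tilt is undefined for a patch facing the viewer head-on; report 0 then.
    tilt = math.degrees(math.atan2(ny, nx)) if slant > 0 else 0.0
    return slant, tilt

print(slant_tilt((0, 0, 1)))   # patch facing the viewer
print(slant_tilt((1, 0, 1)))   # patch turned about the vertical axis
print(slant_tilt((0, 1, 1)))   # patch turned about the horizontal axis
```

Because both values are measured against the viewer’s line of sight, the description changes whenever the viewer moves, which is why the representation depends on the observer’s location.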

- if there is not enough information to create a 2½-D sketch, certain natural constraints must be applied:

e.g., Ponzo illusion (1912):

[Figure: Ponzo illusion]

• converging lines serve as depth cue

• according to Marr, this depth mechanism is an artefact of processing at the 2½-D sketch stage

2½-D sketch characteristics:

• contains contour, texture, and shading information

• incorporates _____: via linear perspective, retinal disparity, etc.

• basic figure/ground separation begun (relative depth, not absolute)

• representation is “______ _______” (depends on your location)

Visual Illusions

- aftereffects are __________ _________: require adapting phase

- most visual illusions are simultaneous: interacting stimuli cause the effect

Pylyshyn’s (1989; 1999) cognitive penetrability criterion for the functional architecture (FA):

1) test perceptual phenomenon

2) change a ______ related to the effect

3) retest

- if the effect has _______, the underlying processes are NOT part of the FA

Why are there illusions?

• (misapplied) ____ _________

Gregory (1963):

- as your distance from a given object increases, the retinal image becomes smaller, yet we do not perceive the object as getting smaller (size constancy)

- distance is taken into account when perceiving size; this is called ____-________ _______

e.g., Ponzo illusion:

[Figure: Ponzo illusion]

- converging lines are a depth cue that activates size constancy; 2-D image treated like 3-D

- visual angles subtended by both horizontal bars are identical

- due to size-distance scaling, the upper bar seems farther away, and thus larger
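The size-constancy account can be put into a short calculation: for a fixed visual angle, the size attributed to an object grows with its assumed distance. The sketch below is illustrative only; the function name and the example angle and distances are assumptions, not measurements.

```python
import math

def perceived_size(visual_angle_deg, perceived_distance_m):
    """Physical size (metres) that would subtend the given visual angle
    at the given distance."""
    return 2 * perceived_distance_m * math.tan(math.radians(visual_angle_deg) / 2)

angle = 2.0                            # both bars subtend the same visual angle
lower = perceived_size(angle, 5.0)     # lower bar treated as nearer
upper = perceived_size(angle, 6.0)     # upper bar treated as farther
print(upper > lower)
print(upper / lower)
```

With identical retinal images, the bar assigned the greater distance is assigned the greater size, in proportion to the ratio of the two assumed distances.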

• visual __________:

Gregory (1968, 1990): Müller-Lyer illusion

[Figure: Müller-Lyer illusion]

- illusion is weaker in children, and in people living in dome-shaped huts

- “___________ _____” hypothesis:
(a) like inside corner of a room, (b) like outside edge of a building

- depth cues indicate (a) is relatively far away, (b) is relatively close

- due to size constancy, (a) appears ______, (b) appears _______

Gandhi et al. (2015): contrary evidence

- tested congenitally _____ children, ages 8-16, born with dense bilateral cataracts

- received cataract removal surgery and implantation of intraocular lenses

- 48 hours after surgery, they were shown the Ponzo and Müller-Lyer illusions

- they were susceptible to the illusions, to a degree comparable to children of equivalent ages and socio-economic status

• both size constancy and visual experience:

McGraw (2003): size constancy explanation of Poggendorff illusion

[Figure: Poggendorff illusion]

- consider Poggendorff illusion as consisting of two separate components:

[Figure: Poggendorff illusion, two components]

- each component is similar to the Müller-Lyer illusion:

[Figure: Poggendorff/Müller-Lyer illusion]

- upper segment appears _______ than it is; lower segment appears ______ than it is

- result: segment (a) appears _____ than it is; segment (b) appears ______ than it is

• cortical area

Schwarzkopf, Song, & Rees (2011):

- the surface area of V1 can differ by up to a factor of 3 between individuals

- showed observers Ebbinghaus illusion and Ponzo illusion

[Figure: Ebbinghaus illusion; Ponzo illusion]

- measured magnitude of illusions; mapped surface area of V1 with fMRI

- the magnitudes of both illusions were negatively correlated with the surface area of V1 (Ebbinghaus r = -.38, Ponzo r = -.48)

- that is, the larger V1, the weaker the illusion

- this relationship may arise during development, or be due to adult neural __________