Perceptual Organization

Learning Outcomes

1. What is the basis for the Gestalt approach to perception?

2. What are some Gestalt laws of grouping?

3. Describe the pros and cons of the Gestalt approach.

4. What are some artificial intelligence approaches to organization?

5. How is the full primal sketch obtained?

6. How does Marr’s approach relate to the Gestalt approach?

Perceptual Organization

- so far, we’ve seen that the visual system breaks down stimuli into component parts (e.g., via simple cells)

- but our experience is ________, not a myriad of separate pieces

- how are the pieces organized into a meaningful whole?

Gestalt Psychology

- Ehrenfels (1890): often, groups of stimuli take on a _______-____ quality: Gestaltqualität

- approach started by Max Wertheimer (1912); also Wolfgang Köhler (1920), Kurt Koffka (1935)

- basis was that ______ could be produced by a succession of stationary stimuli

- guiding principle: “the whole is _________ than the sum of its parts”

e.g., . . . . . . . . . . . . . . . . . . . . . . . . . . . .

e.g., symbol

“It has been said: The whole is more than the sum of its parts. It is more correct to say that the whole is something else than the sum of its parts, because summing up is a meaningless procedure, whereas the whole-part relationship is meaningful.” (Koffka, 1935, p.176)

- perception is an organized, dynamic process:

• _______ _________: physical systems tend to settle into equilibria involving minimum energy or surfaces, etc.; Gestaltists looked for parallels in perception

e.g., _______ minimize surface area for a given volume

• ______________ ___________: (assumed) correlation between psychological experiences and physiological events in the CNS

e.g., if you see a tree, there must be a “tree-shaped” pattern of neurons active

e.g., Necker cube (1832):

Necker cube

- this percept is reversible (“________”); but the stimulus doesn’t change

Necker cube

- the percept depends on the way you want to see it

(top-down process?)

How does this happen?

Figure/ground segregation: How are figures separated from the background?

• relative ____ (or “area”): the smaller of two areas tends to be seen as a figure

relative size/area

• ______________: if one area surrounds another, it tends to be seen as the background

surroundedness

• ___________: horizontal or vertical objects tend to be seen as figures

orientation

• ________: symmetrical forms tend to be considered figures; non-symmetrical (or repeated) areas seen as background

symmetry

• other important factors include contrast, convexity, and parallel contours

Laws (or principles) of grouping: How are many small parts organized into wholes?

- Helson (1933) cataloged 114 Gestalt laws; over 700 have been proposed; Boring (1942) narrowed them down to 14

• law of _________ (or nearness): things that are near to each other tend to be grouped together

proximity

• law of __________: similar things tend to be grouped together

similarity

• law of ____ ____________ (or continuity): a) points that, if connected, would result in either straight or smoothly curving lines, are seen as belonging together; or b) lines tend to be seen in such a way as to follow the smoothest path

continuation

• law of _______: a space enclosed by a contour (real or illusory) tends to appear as a figure

closure; illusory contour

• law of ______ ____: things that are moving in the same direction tend to be grouped together

common fate

• law of ______________ (or familiarity): things tend to form groups if items appear meaningful or familiar:

meaningfulness

• law of ________ (“good figure” or “simplicity”): every stimulus pattern tends to be seen in such a way that the resulting structure is as simple as possible (minimum principle)

Prägnanz
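The law of proximity can be illustrated with a minimal sketch. It assumes the elements are dots on a line and that "near" means a gap no larger than a chosen threshold. The function name `group_by_proximity` and the parameter `max_gap` are hypothetical and do not come from the Gestalt literature.

```python
def group_by_proximity(positions, max_gap):
    """Split 1-D positions into groups wherever the gap exceeds max_gap.

    max_gap is an assumed threshold; the Gestalt law itself only says
    that nearer things "tend to" be grouped together.
    """
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous dot: same group
        else:
            groups.append([x])     # large gap: start a new group
    return groups


dots = [0, 1, 2, 6, 7, 8, 12, 13, 14]
print(group_by_proximity(dots, max_gap=1.5))
# [[0, 1, 2], [6, 7, 8], [12, 13, 14]]
```

A fixed threshold is a simplification: the same row of dots regroups when `max_gap` changes, which is one way to see why the laws are stated as tendencies.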

More recent laws:

• common region: elements tend to be grouped together if they are located within the same closed region (Palmer, 1992)

common region

• element connectedness: elements tend to be grouped together if they are connected by other elements (Palmer & Rock, 1994)

element connectedness

Problem: how are these terms defined (e.g., “good”)?

Hochberg & Brooks (1960): what is “goodness” of shape?

goodness of shape

- increasing perception of object as 3-D with increasing:

• number of interior angles (“__________”)

• number of continuous lines (“_____________”)

• average number of different angles (“_________”)

- it is possible to obtain ___________ definitions

Problem: is __________ necessary?

Weisstein & Wong (1986):

- used reversible ____/_____ picture:

Rubin figure

- on each trial, observers selected either vase or faces as figure

- tilted line presented on either vase or faces

- task: which way was line tilted?

- accuracy was 3× better when the line appeared on the ______ chosen by the observer than when it appeared on the ______

- evidence for perceptual processing: perception doesn’t always “just happen”

- problematic for Gestaltists: where does __________ (e.g., attention) fit in?

Pros & Cons:

identified important _________ (still studied today)

is often ________ (Prägnanz, i.e., simplicity)

poor definitions (laws “tend to”?)

needs explanations, not just post hoc descriptions

perception doesn’t always obey the _______ _________; is often probabilistic

AI/Robotic Vision Approach

• seeing (in the real world) is ____

• researchers at MIT simplified the stimuli: “______ _____”

• analyzed lines in images produced by edges

Guzmán (1968): SEE program

- _________ are places where lines in an image meet, including L, K, peak, fork, X, multi, arrow, and T junctions

• arrow junction (3 lines meeting at a point, with one of the angles greater than 180°) denotes edges of the same object

• T junction (3 concurrent lines, 2 of them collinear) denotes object segregation

junctions

- proposed a set of positive and negative cues to the connectivity of objects:

positive/negative cues

• positive (solid) links suggest that the regions in question correspond to faces of the same object

• negative (dotted) links suggest that the regions belong to different objects

- algorithm based on ____ __________: two regions are parts of the same object, triggered by positive cues from a junction, only in the absence of negative evidence from the junction at the other end
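The linking rule just described can be sketched as follows. This is a simplified illustration, not Guzmán's SEE program: it assumes the junction analysis has already been done and that each candidate link arrives as a tuple `(region_a, region_b, positive, negative)`. The function name `merge_regions` and that tuple layout are hypothetical.

```python
def merge_regions(regions, links):
    """Group regions into bodies from positive links that are not vetoed.

    links: iterable of (region_a, region_b, positive, negative), where
    `positive` is a same-object cue from the junction at one end of the
    shared line and `negative` is contrary evidence from the other end.
    (This input format is an assumption made for the sketch.)
    """
    parent = {r: r for r in regions}

    def find(r):
        # Follow parent pointers to the representative of r's group.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b, positive, negative in links:
        if positive and not negative:      # accept only unvetoed positive links
            parent[find(a)] = find(b)

    bodies = {}
    for r in regions:
        bodies.setdefault(find(r), []).append(r)
    return sorted(sorted(body) for body in bodies.values())


regions = ["A", "B", "C", "D", "E"]
links = [
    ("A", "B", True, False),
    ("B", "C", True, False),
    ("C", "D", True, True),    # positive cue vetoed by negative evidence
    ("D", "E", True, False),
]
print(merge_regions(regions, links))
# [['A', 'B', 'C'], ['D', 'E']]
```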

- pros & cons:

___________ attractive

but _______ (only junctions considered)

__________ (fails to find a possible interpretation for some objects) and _______ (attributes impossible interpretations to some objects):

cube?

(algorithm classifies this as a cube with a strange floating object above it)

Clowes (1971): OBSCENE program

- richer information about each junction is necessary; adding information makes the problem easier to solve

- categorized edges by line labeling:

• ________ edges: convex (+) or concave (-)

• ________ or occluding edges (→): to its right is the body for which the arrow line provides an edge; on its left is space (or another body)

e.g., edges

- each junction has a limited number of possible interpretations (constraints)

• arrow junctions have three interpretations:

arrow junctions

• Ls have six:

L junctions

• Ts have four:

T junctions

• Ys have five:

Y junctions

- pros & cons:

could reject __________ objects

impossible object

Waltz (1972):

- edge consistency constraint: an edge must be given the same line label at both ends (e.g., convex edge cannot become concave at another junction)

edge consistency constraint
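The edge consistency constraint can be turned into a pruning procedure, sketched below under several assumptions. Each junction starts with a list of candidate labelings, every labeling of a junction covers the same edges, and labels are written in one scene-wide convention so that the two ends of an edge can be compared directly. The candidate sets in the example are toy values, not the actual junction catalogue, and the names `waltz_filter` and `_supported` are hypothetical.

```python
def _supported(j, labeling, remaining, edges):
    """True if every neighbouring junction still has a labeling that
    gives the same label to each edge it shares with junction j."""
    for k in remaining:
        shared = edges[j] & edges[k]
        if k == j or not shared:
            continue
        if not any(all(other[e] == labeling[e] for e in shared)
                   for other in remaining[k]):
            return False
    return True


def waltz_filter(candidates):
    """Repeatedly discard junction labelings that no neighbour can match.

    candidates: dict junction -> non-empty list of {edge: label} dicts.
    Returns the surviving labelings per junction.
    """
    edges = {j: set(labs[0]) for j, labs in candidates.items()}
    remaining = {j: list(labs) for j, labs in candidates.items()}
    changed = True
    while changed:                      # loop until nothing more is pruned
        changed = False
        for j in remaining:
            kept = [lab for lab in remaining[j]
                    if _supported(j, lab, remaining, edges)]
            if len(kept) != len(remaining[j]):
                remaining[j] = kept
                changed = True
    return remaining


# Toy scene: three junctions joined by two edges (labels are illustrative).
toy = {
    "J1": [{"e12": "+"}, {"e12": "-"}],
    "J2": [{"e12": "+", "e23": "-"}, {"e12": ">", "e23": "+"}],
    "J3": [{"e23": "-"}, {"e23": ">"}],
}
print(waltz_filter(toy))
# {'J1': [{'e12': '+'}], 'J2': [{'e12': '+', 'e23': '-'}], 'J3': [{'e23': '-'}]}
```

If any junction ends up with an empty list, no consistent labeling exists under the supplied candidates. The converse does not hold in general: pairwise filtering of this kind can leave candidates that still fail to combine into one global labeling, so a search over the survivors may be needed.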

- pros & cons:

could handle cracks, and scenes having _______

crack; shadows

required “a good deal of ______” to explicitly code every possible edge and junction

Blocks World problems:

real world is not so “____”: not composed of straight lines

objects have _______

based on the general-_________ assumption: small shifts in the ________ of the viewer do not affect the configuration of the line drawing

- this rules out the possibility of the accidental alignment of image features into a ________ (“accidental”) junction

general vs. accidental viewpoints

- however, viewing an object from different viewpoints can lead to very different interpretations

general vs. accidental viewpoints

Structural Description Approach

Instead of processing artefacts found in the (proximal) image, describe the structure of the (distal) object.

Marr (1982):

• Blocks World: too ______________

• applied program to everyday objects

e.g., chairs, plant, teddy bear

First: get raw primal sketch

• __________: edges, bars, blobs, terminations

• attributes: contrast, length, width, position, orientation

teddy bear raw primal sketch
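The idea of primitives that carry attributes can be illustrated with a toy sketch. It is not Marr's procedure: it assumes a single row of intensity values, marks an edge wherever neighbouring samples differ by at least an assumed threshold, and pairs opposite-sign edges into bars. The names `edge_tokens`, `bar_tokens`, and `min_contrast` are hypothetical.

```python
def edge_tokens(profile, min_contrast):
    """Return edge tokens (position, signed contrast) for a 1-D profile."""
    tokens = []
    for i in range(len(profile) - 1):
        contrast = profile[i + 1] - profile[i]
        if abs(contrast) >= min_contrast:
            # The edge lies between samples i and i + 1.
            tokens.append({"type": "edge", "position": i + 0.5,
                           "contrast": contrast})
    return tokens


def bar_tokens(edges):
    """Pair consecutive edges of opposite sign into bar tokens."""
    bars = []
    for left, right in zip(edges, edges[1:]):
        if left["contrast"] * right["contrast"] < 0:
            bars.append({"type": "bar",
                         "position": (left["position"] + right["position"]) / 2,
                         "width": right["position"] - left["position"]})
    return bars


row = [10, 10, 11, 60, 61, 60, 12, 10]      # a bright bar on a dark ground
edges = edge_tokens(row, min_contrast=20)
print(edges)
# [{'type': 'edge', 'position': 2.5, 'contrast': 49},
#  {'type': 'edge', 'position': 5.5, 'contrast': -48}]
print(bar_tokens(edges))
# [{'type': 'bar', 'position': 4.0, 'width': 3.0}]
```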

What about whole objects? (not just parts)

- arrange the components, via

• _____ ______: neighbouring components are assigned the same location (like Gestalt Law of Proximity)

• ___________: adjacent place tokens are clustered/grouped according to texture (like Similarity)

or curvilinearly, according to the orientation of the elements (like Good Continuation)

e.g., - - - - - - -----------

or _____ ___________, which differs from the intrinsic orientation of the features (like Closure?)

theta aggregation
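The grouping steps above can be sketched by linking tokens that are both adjacent and similar. This is an illustration only: it assumes each token is an `(x, y, orientation_in_degrees)` tuple, uses orientation as the similarity measure, and treats the two thresholds as free parameters. The function name `group_tokens` and its parameters are hypothetical.

```python
from math import hypot


def group_tokens(tokens, max_dist, max_angle_diff):
    """Cluster tokens that are near each other and similarly oriented.

    tokens: list of (x, y, orientation_deg); orientation is taken
    modulo 180 because a line at 0 degrees equals one at 180 degrees.
    Returns a list of groups, each a sorted list of token indices.
    """
    def linked(a, b):
        (xa, ya, oa), (xb, yb, ob) = tokens[a], tokens[b]
        diff = abs(oa - ob) % 180
        return (hypot(xa - xb, ya - yb) <= max_dist
                and min(diff, 180 - diff) <= max_angle_diff)

    unvisited = set(range(len(tokens)))
    groups = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        group, frontier = [seed], [seed]
        while frontier:                       # grow the group outward
            current = frontier.pop()
            neighbours = [t for t in unvisited if linked(current, t)]
            for t in neighbours:
                unvisited.remove(t)
                group.append(t)
                frontier.append(t)
        groups.append(sorted(group))
    return groups


tokens = [(0, 0, 0), (1, 0, 5), (2, 0, 0),    # a roughly horizontal run
          (2, 1, 90), (2, 2, 85),             # a roughly vertical run
          (10, 10, 0)]                        # an isolated token
print(group_tokens(tokens, max_dist=1.5, max_angle_diff=10))
# [[0, 1, 2], [3, 4], [5]]
```

Tokens 2 and 3 are adjacent but differ in orientation by 90°, so they fall into different groups, which mirrors proximity and similarity acting together.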

This is done for many different levels of detail

- representations created from small features to more global properties (bottom-up/data-driven)

- _______ ___________ are applied:

in general, things that are adjacent to each other and/or similar to each other tend to belong together

Result: full primal sketch:

full primal sketch

What about exceptions to the rule? (ambiguous cases)

e.g., teddy bear with three black dots (3 ____?)

ambiguity

- sketch must be interpreted (object recognition/identification)

- top-down process (conceptually driven) kicks in

e.g., 2 eyes and a ____

top-down

More recent programs process regions by texture to obtain object boundaries.

texture segmentation