Perceptual Organization

 

Learning Outcomes

1. What is the basis for the Gestalt approach to perception?

2. What are some Gestalt laws of grouping?

3. Describe the pros and cons of the Gestalt approach.

4. What are some artificial intelligence approaches to organization?

5. How is the full primal sketch obtained?

6. How does Marr’s approach relate to the Gestalt approach?

 


 

Perceptual Organization

 

- so far, we’ve seen that the visual system breaks down stimuli into component parts (e.g., via simple cells)

- but our experience is ________, not a myriad of separate pieces

- how are the pieces organized into a meaningful whole?

 

Gestalt Psychology

 

- Ehrenfels (1890): often, groups of stimuli take on a pattern-like quality: Gestaltqualität

- approach started by Max Wertheimer (1912); also Wolfgang Köhler (1920), Kurt Koffka (1935)

- basis was that ______ could be produced by a succession of stationary stimuli

apparent motion

- guiding principle: “the whole is _________ than the sum of its parts”

e.g., . . . . . . . . . . . . . . . . . . . . . . . . . . . .

e.g., symbol

 

“It has been said: The whole is more than the sum of its parts. It is more correct to say that the whole is something else than the sum of its parts, because summing up is a meaningless procedure, whereas the whole-part relationship is meaningful.” (Koffka, 1935, p.176)

 

- perception is an organized, dynamic process:

_______ principle: physical systems tend to settle into equilibria involving minimum energy or surfaces, etc.; Gestaltists looked for parallels in perception

e.g., bubbles minimize surface area for a given volume
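
The bubble example can be checked numerically. The following is a minimal sketch, assuming a sphere (the bubble) and a cube as the two shapes being compared; the function names are illustrative only. For the same enclosed volume, the sphere has the smaller surface area.

```python
from math import pi

def sphere_area(volume):
    """Surface area of a sphere that encloses the given volume."""
    radius = (3 * volume / (4 * pi)) ** (1 / 3)
    return 4 * pi * radius ** 2

def cube_area(volume):
    """Surface area of a cube that encloses the given volume."""
    side = volume ** (1 / 3)
    return 6 * side ** 2

# Same volume, two shapes: the sphere needs less surface.
print(round(sphere_area(1.0), 3))
print(round(cube_area(1.0), 3))
```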

 

psychophysical ___________: (assumed) correlation between psychological experiences and physiological events in the CNS

e.g., if you see a tree, there must be a “tree-shaped” pattern of neurons active

e.g., Necker cube (1832), a.k.a. ____________:

Necker cube

this percept is reversible (________); but the stimulus doesn’t change

Necker cube Necker cube

the percept depends on the way you want to see it

(→ top-down process?)

 

How does this happen?

Figure/ground segregation: How are figures separated from the background?

 

relative ____ (or area): the smaller of two areas tends to be seen as a figure

relative size/area

 

surroundedness: if one area surrounds another, the surrounding area tends to be seen as the background

surroundedness

 

___________: horizontal or vertical objects tend to be seen as figures

orientation

 

symmetry: symmetrical forms tend to be seen as figures; non-symmetrical (or repeated) areas tend to be seen as background

symmetry

 

• other important factors include contrast, convexity, and parallel contours

 

Laws (or Principles) of Grouping: How are many small parts organized into wholes?

• Helson (1933) cataloged 114 Gestalt laws; over 700 have been proposed; Boring (1942) narrowed them down to 14

 

law of _________ (or nearness): things that are near to each other tend to be grouped together

proximity

 

law of similarity: similar things tend to be grouped together

similarity

 

law of good ____________ (or continuity): a) points that, if connected, would result in either straight or smoothly curving lines, are seen as belonging together; or b) lines tend to be seen in such a way as to follow the smoothest path

continuation continuation

 

law of closure: a space enclosed by a contour (real or illusory) tends to appear as a figure

closure; illusory contour

 

law of common ____: things that are moving in the same direction tend to be grouped together

common fate

 

law of meaningfulness (or familiarity): things tend to form groups if items appear meaningful or familiar:

meaningfulness

 

law of ________ (“good figure” or “simplicity”): every stimulus pattern tends to be seen in such a way that the resulting structure is as simple as possible (minimum principle)

Prägnanz
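
The law of proximity can be given a rough computational reading. The sketch below is not a Gestalt-era procedure; it assumes that "near" means "within a fixed distance", and the threshold `max_gap` and the function name are hypothetical choices made for illustration. Points are chained into the same group whenever they are close to any existing member of that group.

```python
from math import dist

def group_by_proximity(points, max_gap):
    """Chain points into groups: a point joins every group that has a member within max_gap."""
    groups = []
    for p in points:
        near, far = [], []
        for g in groups:
            # A group is "near" if any of its members is within max_gap of p.
            (near if any(dist(p, q) <= max_gap for q in g) else far).append(g)
        # p fuses all nearby groups into one; distant groups are left alone.
        merged = [p] + [q for g in near for q in g]
        groups = far + [merged]
    return groups

# Two clusters of dots separated by a wide gap.
dots = [(0, 0), (1, 0), (2, 0), (6, 0), (7, 0), (8, 0)]
groups = group_by_proximity(dots, max_gap=1.5)
print(len(groups))
```

With these dots and this threshold, the six points fall into two groups of three, matching the grouping an observer would report. Choosing the threshold is the hard part, which is one way of seeing why the laws are stated as tendencies.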

 

More recent laws:

common region: elements tend to be grouped together if they are located within the same closed region (Palmer, 1992)

common region

 

element connectedness: elements tend to be grouped together if they are connected by other elements (Palmer & Rock, 1994)

element connectedness

 

Problem: how are these terms defined (e.g., “good”)?

Hochberg & Brooks (1960): what is “________” of shape?

goodness of shape

- increasing perception of object as 3-D with increasing:

• number of interior angles (__________)

• number of continuous lines (discontinuity)

• average number of different angles (_________)

- it is possible to obtain operational definitions

 

Problem: is processing necessary?

Weisstein & Wong (1986):

- used reversible vase/faces picture:

 

Rubin figure

 

- on each trial, observers selected either vase or faces as figure

- tilted line presented on either vase or faces

- task: which way was line tilted?

- accuracy was 3× better when the line appeared on the ______ chosen by the observer, than when it appeared on the ______

- evidence for perceptual processing: perception doesn’t always “just happen”

- problematic for Gestaltists: where does processing (e.g., attention) fit in?

 

Pros & Cons:

☑ identified important _________ (still studied today)

☒ is often circular (Prägnanz ↔ simplicity)

☒ poor definitions (laws “tend to”?)

☒ needs explanations, not just post hoc descriptions

☒ perception doesn’t always obey the _______ principle; is often probabilistic

 


 

AI/Robotic Vision Approach

 

• seeing (in the real world) is ____

• researchers at MIT simplified the stimuli: Blocks World

• analyzed lines in images produced by edges

 

Guzmán (1968): SEE program

- _________ are places where lines in an image meet, including L, K, peak, fork, X, multi, arrow, and T junctions

• arrow junction (3 lines meeting at a point, with one of the angles greater than 180°) denotes edges of the same object

• T junction (3 concurrent lines, 2 of them collinear) denotes object segregation

junctions
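
The arrow and T definitions above depend only on the angles between the three lines, so they can be sketched in code. This is an illustration, not Guzmán's SEE program: the input format (the direction of each line in degrees), the tolerance `tol`, and the function name are assumptions. Any three-line junction that is neither an arrow nor a T is reported here as a fork.

```python
def classify_junction(directions_deg, tol=1.0):
    """Classify a three-line junction from the directions (in degrees) of its lines."""
    if len(directions_deg) != 3:
        raise ValueError("this sketch only handles three-line junctions")
    angles = sorted(d % 360 for d in directions_deg)
    # Angular gaps between neighbouring lines, going once around the junction.
    gaps = [(angles[(i + 1) % 3] - angles[i]) % 360 for i in range(3)]
    largest = max(gaps)
    if abs(largest - 180) <= tol:
        return "T"      # two of the lines are collinear
    if largest > 180:
        return "arrow"  # one angle is greater than 180 degrees
    return "fork"       # all three angles are less than 180 degrees

print(classify_junction([60, 90, 120]))
print(classify_junction([0, 180, 270]))
print(classify_junction([90, 210, 330]))
```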

 

- proposed a set of positive and negative cues to the connectivity of objects:

positive/negative cues

 

• positive (solid) links suggest that the regions in question correspond to faces of the same object

• negative (dotted) links suggest that the regions belong to different objects

- algorithm based on ____ __________: two regions are parts of the same object, triggered by positive cues from a junction, only in the absence of negative evidence from the junction at the other end

 

- pros & cons:

☑ intuitively attractive

☒ but _______ (only junctions considered)

☒ __________ (fails to find a possible interpretation for some objects) and unsound (attributes impossible interpretations to some objects):

cube?

(algorithm classifies this as a cube with a strange floating object above it)
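
The linking rule in SEE can be sketched as a region-merging step. The version below is a simplification made for illustration: a positive link merges two regions unless there is any negative link between the same pair, whereas the rule described above looks specifically at the junction at the other end. The data format and all names are hypothetical.

```python
def group_regions(regions, links):
    """Merge regions joined by positive links that are not vetoed by a negative link."""
    parent = {r: r for r in regions}

    def find(r):
        # Follow parent pointers to the representative of r's group.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    # Pairs of regions with negative evidence are never merged directly.
    vetoed = {frozenset((a, b)) for a, b, kind in links if kind == "negative"}
    for a, b, kind in links:
        if kind == "positive" and frozenset((a, b)) not in vetoed:
            parent[find(a)] = find(b)

    objects = {}
    for r in regions:
        objects.setdefault(find(r), set()).add(r)
    return sorted(sorted(o) for o in objects.values())

links = [
    ("A", "B", "positive"),
    ("B", "C", "positive"),
    ("C", "D", "positive"),
    ("C", "D", "negative"),  # contradicts the positive link above
    ("D", "E", "positive"),
]
objects = group_regions(["A", "B", "C", "D", "E"], links)
print(objects)
```

Here regions A, B, and C end up as one object and D and E as another, because the only link between the two sets is vetoed. Since only junction cues are used, a sketch like this inherits the weaknesses listed above.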

 

Clowes (1971): OBSCENE program

- richer information about each junction is necessary; adding information makes the problem easier to solve

- categorized edges by line labeling:

• ________ edges: convex (+) or concave (-)

• ________ or occluding edges (→): to its right is the body for which the arrow line provides an edge; on its left is space (or another body)

e.g., edges

- each junction has a limited number of possible interpretations (constraints)

• arrow junctions have three interpretations:

arrow junctions

• Ls have six:

L junctions

• Ts have four:

T junctions

• Ys have five:

Y junctions

 

- pros & cons:

☑ could reject __________ objects

impossible object

☒ uses a brute-force approach, systematically trying all possible combinations of potential solutions: a poor model of human vision

 

Waltz (1972):

- edge ___________ constraint: an edge must be given the same line label at both ends (e.g., convex edge cannot become concave at another junction)

edge consistency constraint

 

- pros & cons:

☑ could handle cracks, and scenes having shadows

cracks; shadows

☒ required “a good deal of ______” (p.300) to explicitly code every possible edge and junction
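
The edge consistency constraint lends itself to a constraint-propagation sketch: instead of trying every combination, discard any junction interpretation that gives an edge a label the junction at its other end cannot support, and repeat until nothing changes. The code below illustrates that idea only. The three-junction scene and its candidate labelings are a made-up toy, not the real junction catalogue, and the data format and names are assumptions.

```python
def filter_labelings(candidates):
    """candidates maps each junction to a non-empty list of {edge: label} dicts."""
    # Work out which junctions each edge touches.
    ends = {}
    for junction, labelings in candidates.items():
        for edge in labelings[0]:
            ends.setdefault(edge, []).append(junction)

    current = {j: list(labs) for j, labs in candidates.items()}
    changed = True
    while changed:
        changed = False
        for j in current:
            # Keep a labeling only if every edge label is also possible
            # at the junction on the other end of that edge.
            kept = [
                lab for lab in current[j]
                if all(
                    any(other[edge] == label for other in current[k])
                    for edge, label in lab.items()
                    for k in ends[edge] if k != j
                )
            ]
            if len(kept) != len(current[j]):
                current[j] = kept
                changed = True
    return current

# Toy scene: junctions P, Q, R joined by edges "PQ" and "QR".
toy = {
    "P": [{"PQ": "+"}],
    "Q": [{"PQ": "+", "QR": "-"}, {"PQ": "-", "QR": "+"}, {"PQ": "-", "QR": "-"}],
    "R": [{"QR": "+"}, {"QR": "-"}],
}
result = filter_labelings(toy)
print(result)
```

In the toy scene, the single interpretation at P rules out two of the three interpretations at Q, which in turn rules out one at R, so each junction is left with exactly one labeling without enumerating combinations.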

 

Blocks World problems:

☒ real world is not so “____”: not composed of straight lines

☒ objects have texture

☒ based on the general _________ assumption: small shifts in the position of the viewer do not affect the configuration of the line drawing

- this rules out the possibility of the accidental alignment of image features into a ________ (“accidental”) junction

general vs. accidental viewpoints

 

- however, viewing an object from different viewpoints can lead to very different interpretations

general vs. accidental viewpoints

 


 

Structural Description Approach

 

Instead of processing artefacts found in the (proximal) image, describe the structure of the (distal) object.

 

Marr (1982):

• Blocks World: too ______________

• applied program to everyday objects

e.g., chairs, plant, teddy bear

 

First: get raw primal sketch

• __________: edges, bars, blobs, terminations

• attributes: contrast, length, width, position, orientation

teddy bear image raw primal sketch

 

What about whole objects? (not just parts)

- arrange the components, via

_____ tokens: neighbouring components are assigned the same location (like Gestalt Law of Proximity)

___________: adjacent place tokens are clustered/grouped according to texture (like Similarity)

or curvilinearly, according to the orientation of the elements (like Good Continuation)

e.g., - - - - - -   →   –––––––––

or _____ aggregation: elements are grouped along a direction that differs from the intrinsic orientation of the features (like Closure?)

theta aggregation
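
The dashes-to-line example above can be sketched as a merging step. This is an illustration only, not Marr's algorithm: it assumes the dashes are already known to be collinear, represents each dash as a (start, end) interval along that line, and uses a hypothetical gap threshold `max_gap`.

```python
def aggregate_dashes(segments, max_gap):
    """Merge collinear (start, end) segments whose gaps are at most max_gap."""
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # Too far away: start a new line token.
            merged.append((start, end))
    return merged

dashes = [(0, 1), (2, 3), (4, 5), (6, 7), (12, 13)]
lines = aggregate_dashes(dashes, max_gap=1)
print(lines)
```

The first four dashes become one long token, while the distant fifth dash stays separate. Running the same step with larger thresholds would give coarser groupings.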

 

This is done for many different levels of detail

- representations created from small features to more global properties (bottom-up/data-driven)

- _______ ___________ are applied:

in general, things that are adjacent to each other and/or similar to each other tend to belong together

 

Result: full primal sketch:

full primal sketch

 

What about exceptions to the rule? (ambiguous cases)

e.g., teddy bear with three black dots (3 ____?)

ambiguity

 

- sketch must be interpreted (object recognition/identification)

- top-down process (conceptually driven) kicks in

e.g., 2 eyes and a ____

 

 top-down

 

More recent programs process regions by texture to obtain object boundaries.

texture segmentation