Chapter 5
Object and Background Perception
Fig. 5.1 Computer perception system
The Defense Advanced Research Projects Agency (DARPA)
The March 2004 race (142 miles across the Mojave Desert): $1 million prize
The October 2005 race (132 miles): $2 million prize, claimed by the winner
“…Now we need to teach them how to drive in traffic.” -- Gary Bradski, Intel Corporation, as quoted in the October 17, 2005 issue of the EE Times
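Object perception is actually not simple
Two basic problems
Perceptual organization: how does the visual system organize complex environmental stimulation into a set of "objects"?
Figure-ground: how does the visual system assign one part of that stimulation to the "ground" and another part to the "figure"?
Fig. 5.2 Four challenges
The pattern of stimulation on the retina may not unambiguously represent the environmental stimulus (3-D → 2-D)
The inverse projection problem
Objects may be partly occluded, or the image may be blurred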
Fig. 5-4, p. 96
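The image of the same object differs from one viewing angle to another; the ability to recognize images taken from different viewing angles as the same object is called viewpoint invariance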
Fig. 5-7, p. 97
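Which two faces are the same person?
The cause of a change in brightness within an image often cannot be determined
Different materials
Presence or absence of shadows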
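The inverse projection problem listed above can be made concrete with a small sketch. It assumes a simple pinhole camera model (an assumption for illustration, not something stated in the slides) and shows that several different 3-D points produce exactly the same 2-D image point, so the image alone cannot tell us which one was out there.

```python
# Pinhole projection (assumed model): a 3-D point (x, y, z) lands at
# (f*x/z, f*y/z) on the image plane, where f is the focal length.
def project(point, focal_length=1.0):
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

# Three different 3-D points that lie on the same line of sight.
candidates = [(1.0, 2.0, 4.0), (2.0, 4.0, 8.0), (0.5, 1.0, 2.0)]
images = [project(p) for p in candidates]

# All three candidates collapse onto one image point.
print(images)
print(len(set(images)) == 1)
```

Going from 3-D to 2-D is a well-defined computation; going back from the 2-D point to the 3-D scene has many valid answers.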
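The Gestalt approach to perceptual organization
A reaction against structuralism
Structuralism was established by Wundt and others (late 19th to early 20th century)
Perception is formed by combining elementary sensations: "mental chemistry"
Max Wertheimer felt that apparent movement refuted structuralism, so he pursued Gestalt psychology together with K. Koffka and W. Köhler
Structuralism also has difficulty explaining illusory contours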
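If the black circles are seen as holes in a wall → the illusory contour disappears
The Gestalt school therefore rejected structuralism (perception as the sum of sensations), argued instead that the whole is not the same as the sum of its parts, and turned its attention to the problem of perceptual organization
Gestalt laws of perceptual organization
Law of Prägnanz (= law of good figure, law of simplicity): a stimulus pattern is perceived in the way that yields the simplest possible structure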
good figure
Law of similarity: similar objects are grouped together
Fig. 5-15, p. 100
Fig. 5-16, p. 100
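Law of good continuation: points that would form a straight line or a smooth curve when connected tend to be linked together, producing lines that follow the smoothest path
Law of proximity: objects that are near each other in space are grouped together
Common fate: objects moving in the same direction are grouped together
Familiarity: image components that together form a familiar pattern are grouped together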
13 faces
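As a rough illustration of how the law of proximity could be turned into a procedure, the sketch below groups dots whenever they lie within a chosen distance of each other. The function name `group_by_proximity` and the threshold `max_gap` are hypothetical, and this single-linkage rule is only one possible reading of the law, not a model proposed by the Gestalt psychologists.

```python
import math

def group_by_proximity(points, max_gap):
    """Single-linkage grouping: dots within max_gap of a group join that group."""
    groups = []
    for p in points:
        # Find every existing group that has at least one member near p.
        near = [g for g in groups if any(math.dist(p, q) <= max_gap for q in g)]
        merged = [p]
        for g in near:
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)
    return groups

# Two clusters of three dots separated by a wide gap.
dots = [(0, 0), (1, 0), (2, 0), (6, 0), (7, 0), (8, 0)]
groups = group_by_proximity(dots, max_gap=1.5)
print(len(groups))
```

With `max_gap=1.5` the six dots fall into two groups. Raising the threshold above 4 would merge them into one, which shows how much the outcome depends on a parameter the law itself does not specify.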
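Principles of perceptual organization beyond the Gestalt laws (Palmer & Rock)
Common region: elements that fall within the same region are grouped together
Element connectedness: objects that are connected are grouped together
Synchrony: visual events that occur at the same time are grouped together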
(a) proximity
(b) common region
(c) connectedness
(d) synchrony
(e) common fate
What is the status of these Gestalt laws?
Law vs. principle vs. heuristic
Heuristic vs. algorithm
They are best-guess rules that do not work every time. But, when they do, they work very fast.
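Figure-ground segregation
Gestalt school: reversible figures
Properties that determine figure and ground
The figure is more "thing-like" and appears in front of the ground
Symmetric regions are more likely to be seen as figure
Regions with a smaller area are more likely to be seen as figure
Regions oriented horizontally or vertically are more likely to be seen as figure
Meaningful objects are more likely to be seen as figure
Lower regions are more likely to be seen as figure, with no left-right difference (Vecera et al., 2002)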
Figure 5.24 A version of Rubin’s reversible face-vase figure.
Figure 5.27 (a) Stimuli from Vecera et al. (2002). (b) Percentage of trials on which lower or left areas were seen as figure.
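Vecera et al. used two methods:
1) Judging which side is the figure
2) Over a 30-second period, pressing a key to indicate which region is currently perceived as figure (not ground); the lower region was perceived as figure 84% of the time
Must a figure be separated from the ground before it can be perceived?
Observers tend to see the meaningful region as the figure → this suggests that the order in which "recognition" and "figure-ground segregation" occur is …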
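How are objects recognized from different viewing angles?
Structural description models
Objects are represented as "parts" plus the "spatial relations" among the parts (D. Marr, 1982)
The "parts" are cylindrical, volumetric units
Recognition by components (RBC)
The parts are geons
A small number of geons (and the spatial relations among them) is enough to represent a large number of objects
The most important property of geons is that they can be identified regardless of viewing angle (view invariant)
For example, the parallel edges of a rectangular solid are visible from most viewing angles: a non-accidental property (NAP)
From a few viewing angles, the 2-D image contains properties that are not actually present in the 3-D object: accidental properties
As long as the key features that define the geons are preserved, recognition is largely unaffected by noise
The theory can represent many types of objects, but it cannot explain how people distinguish objects that differ only in detail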
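A minimal sketch of what a structural description might look like as a data structure, assuming hypothetical geon labels (`cylinder`, `curved_handle`) and relation names (`side_of`, `top_of`). It follows the familiar cup versus pail illustration of RBC: the same geons in a different spatial arrangement give a different object.

```python
# Each object is a set of parts (geons) plus the spatial relations among them.
# All labels here are made up for illustration.
cup = {
    "geons": {"cylinder", "curved_handle"},
    "relations": {("curved_handle", "side_of", "cylinder")},
}
pail = {
    "geons": {"cylinder", "curved_handle"},
    "relations": {("curved_handle", "top_of", "cylinder")},
}
# A second cup that differs from the first only in fine detail (size, pattern).
other_cup = {
    "geons": {"cylinder", "curved_handle"},
    "relations": {("curved_handle", "side_of", "cylinder")},
}

def same_structure(a, b):
    """Two descriptions match only if both the parts and the relations agree."""
    return a["geons"] == b["geons"] and a["relations"] == b["relations"]

print(cup["geons"] == pail["geons"])   # same parts
print(same_structure(cup, pail))       # but a different arrangement
print(same_structure(cup, other_cup))  # fine detail is invisible at this level
```

The last comparison mirrors the limitation noted above: two cups that differ only in detail receive identical descriptions, so this level of representation cannot tell them apart.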
Image description models
View invariance does not necessarily hold, so the recognition process compares the image with stored representations of multiple viewing angles
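The comparison step in an image description model can be sketched as a nearest-stored-view lookup. Everything below is an assumption made for illustration: the three-number feature vectors, the object names, and the use of Euclidean distance as the similarity measure.

```python
import math

# Hypothetical memory: object name -> feature vectors, one per learned view.
stored_views = {
    "mug":  [(1.0, 0.2, 0.1), (0.8, 0.6, 0.1)],
    "shoe": [(0.1, 0.9, 0.7), (0.3, 0.7, 0.9)],
}

def recognize(image_features):
    """Return (distance, label) for the stored view closest to the input image."""
    return min(
        (math.dist(image_features, view), label)
        for label, views in stored_views.items()
        for view in views
    )

distance, label = recognize((0.9, 0.5, 0.1))
print(label, round(distance, 3))
```

Under this scheme an image taken from an unfamiliar angle lies farther from every stored view, which is one way to account for recognition that depends on viewpoint.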
Perceiving Scenes
What is a scene?
Object: compact and acted upon
Scene: extended and acted within
Ex. Walking down (acting within) the street (scene) and mailing a letter (acting upon) in the mailbox (object)
Perceiving the gist of the scene rapidly
“Global image features”: information about scenes; these features are holistic and perceived rapidly.
Examples (high vs. low):
Degree of naturalness: forest vs. street
Degree of openness: beach vs. forest
Degree of roughness: forest vs. beach
Degree of expansion: railroad vs. street
Color: blue sky; green forest
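The modern view
The contribution of Gestalt psychology was mainly to "describe" perceptual phenomena
The modern view emphasizes "measurement" and "mechanism"
Why does the visual system respond especially well to certain types of visual stimuli?
The human perceptual system tries to capture the properties of the environment, so it often responds on the basis of environmental regularities
Regularities are properties of the environment that occur regularly across many situations
For example, some Gestalt heuristics:
"Good continuation" reflects the many straight and smoothly curving contours in our surroundings
"Uniform connectedness" reflects the fact that the parts of an object often share the same color, texture, and so on, so regions with uniform properties usually belong to the same object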
Perceiving Scenes and Objects in Scenes
Regularities in the environment
1. Physical regularities
Ex1. Light-from-above heuristic
Ex2. Oblique effect
Figure 5.46 (a) Some of these discs are perceived as jutting out, and some are perceived as indentations. (b) Light coming from above will illuminate the top of a shape that is jutting out, and (c) the bottom of an indentation.
Shape from shading
Light-from-above heuristics
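The light-from-above heuristic described in the caption for Figure 5.46 can be written as a tiny decision rule. This is a simplified sketch: the function name `interpret_disc` and the use of two brightness numbers per disc are assumptions, not a model of the actual visual computation.

```python
def interpret_disc(top_brightness, bottom_brightness):
    """Apply the light-from-above assumption to a shaded disc."""
    if top_brightness > bottom_brightness:
        return "bump"         # light from above hits the top of a convex shape
    if top_brightness < bottom_brightness:
        return "indentation"  # light from above hits the bottom of a concave shape
    return "ambiguous"        # no shading gradient, so the heuristic gives no answer

disc = (0.9, 0.2)  # bright on top, dark on the bottom
print(interpret_disc(*disc))
# Turning the picture upside down swaps top and bottom and flips the percept.
print(interpret_disc(*reversed(disc)))
```

Because the rule relies on an assumption about the world rather than on the image alone, it gives the wrong answer whenever the light actually comes from below.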
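For example, the oblique effect (the perceptual system is especially sensitive to vertical and horizontal stimuli) may arise because our natural environment is full of vertical and horizontal contours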
Figure 5.47 Why does (a) look like indentations in the sand and (b) look like mounds of sand? See text for explanation.
2. Semantic regularities
Ex. context
Figure 5.45 Stimuli used in Palmer’s (1975) experiment. The scene at the left is presented first, and the observer is then asked to identify one of the objects on the right.
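Why does human object perception so far outperform that of machines?
Perceptual intelligence
von Helmholtz: theory of unconscious inference
Some perceptual experiences arise from unconscious assumptions we make about the environment
Likelihood principle: the object we perceive is the one most likely to have produced the 2-D pattern on the retina, out of all the possibilities
The perceptual process resembles problem solving, but we apply perceptual intelligence "automatically" to solve perceptual problems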
Figure 5.44 The display in (a) looks like (b) -- a blue rectangle in front of a red rectangle -- but it could be (c), a blue rectangle and an appropriately positioned 6-sided red figure.
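One common way to formalize the likelihood principle is Bayes' rule, sketched below for the two interpretations in this figure. The framing is an assumption added for illustration, and the priors and likelihoods are invented numbers: the six-sided figure gets a low likelihood because it would produce this image only from a very specific, accidental alignment.

```python
# Hypothetical numbers, chosen only to illustrate the computation.
# "likelihood" = how readily each scene would produce the image in (a).
hypotheses = {
    "rectangle behind rectangle": {"prior": 0.5, "likelihood": 0.9},
    "aligned six-sided figure":   {"prior": 0.5, "likelihood": 0.01},
}

# Bayes' rule: posterior is proportional to prior times likelihood.
evidence = sum(h["prior"] * h["likelihood"] for h in hypotheses.values())
posterior = {
    name: h["prior"] * h["likelihood"] / evidence
    for name, h in hypotheses.items()
}

# The percept is taken to be the most probable interpretation.
percept = max(posterior, key=posterior.get)
print(percept)
```

Both scenes are physically possible, but the inference settles on the one that would most often give rise to the image.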
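Machine vision systems need to incorporate perceptual intelligence in order to simulate the way humans rapidly solve perceptual problems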
Physiology of Object and Scene Perception
Fig. 5.43 Properties of neurons that respond to perceptual organization
Figure 5.44 How a neuron in V1 responds to stimuli presented to its receptive field (green rectangle). (a) The neuron responded when the stimulus on the receptive field is figure. (b) There is no response when the same pattern on the receptive field is not figure (Adapted from Lamme et al., 1995.)
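Response pattern of V1 neurons that respond to figure/ground
The response pattern matches perceptual experience rather than the physical properties of the stimulus
Why does this appear in V1 neurons? Possibly because of contextual modulation
Feedback from higher-level visual processing
How does the brain process information about objects?
Sheinberg & Logothetis (1997)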
Monkey trained to pull lever in response to particular pattern
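The neuron in IT fires only when the monkey perceives a particular stimulus
The object stimuli the monkey receives are always the same; the change in perceptual awareness takes place in the brain
Grill-Spector et al. (2004)
Used the ROI (region of interest) method to locate each person's FFA
Used masking to present face pictures briefly
Found that the level of FFA activation matched the participant's subjective perceptual judgment (rather than the physical stimulus)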
Fig. 5-39, p. 112
House vs. face
Binocular rivalry
PPA vs. FFA
Figure 5.40 Time-course of brain activation for trials in which Harrison Ford’s face was presented. (Grill-Spector, et al., 2004)
Freedman et al. (2003)
Used morphing to create a graded series of stimuli
Used a delayed matching to sample procedure to measure the monkeys' recognition performance
Figure 5.43 (a) Response of a monkey IT neuron that responds better to a 100-percent dog stimulus (red line) than to a 100-percent cat stimulus (blue) during the “sample” period of the delayed-matching-to-sample task. Other combinations of dog and cat fell between these two extremes. (b) Response of PF neurons to the same stimuli. For this neuron, the response to dog is greater during the delay and test periods. (From Freedman, D. J. et al., (2003). A comparison of primate prefrontal and inferior temporal cortices during visual categorization. Journal of Neuroscience, 23, 5235-5246.)
IT and PF neurons show different response patterns
Models of Brain Activity & Perception
Image → Brain → Measured voxel activity pattern
Image ← Decoder ← Measured voxel activity pattern
Fig. 5.50
An orientation decoder was used to analyze the voxel activity.
- The decoder could accurately predict which orientation had been presented.
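To give a feel for what a decoder does, here is a toy simulation. It is not the method used in the study behind Fig. 5.50: the four "voxels", the two orientations, the noise range, and the nearest-average-pattern rule are all assumptions chosen to keep the sketch short.

```python
import random

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical mean voxel responses for two grating orientations (4 voxels).
true_patterns = {45: [1.0, 0.0, 1.0, 0.0], 135: [0.0, 1.0, 0.0, 1.0]}

def measure(orientation):
    """Simulate one noisy voxel activity pattern for a presented orientation."""
    return [v + rng.uniform(-0.2, 0.2) for v in true_patterns[orientation]]

def mean_pattern(patterns):
    """Average a list of patterns voxel by voxel."""
    return [sum(col) / len(col) for col in zip(*patterns)]

# "Training": average several measured patterns for each orientation.
centroids = {o: mean_pattern([measure(o) for _ in range(10)]) for o in true_patterns}

def decode(pattern):
    """Predict the orientation whose average pattern is closest to this one."""
    def sq_dist(o):
        return sum((a - b) ** 2 for a, b in zip(pattern, centroids[o]))
    return min(centroids, key=sq_dist)

# "Testing": decode new measurements without telling the decoder the answer.
trials = [45, 135, 135, 45, 45, 135]
predictions = [decode(measure(o)) for o in trials]
print(predictions == trials)
```

The decoder never sees the image itself. It only learns which activity pattern tends to go with which orientation, then works backward from a new pattern to the most likely stimulus.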
End
Figure 5.29 How a neuron in V1 responds to stimuli presented to its receptive field (green rectangle). (a) The neuron responded when the stimulus on the receptive field is figure. (b) There is no response when the same pattern on the receptive field is not figure (Adapted from Lamme et al., 1995.)
Fig. 5-49, p. 117