Gözde Özbal Carlo Strapparava FBK-irst Trento, Italy Daniele Pighin Google Inc. Zürich, Switzerland ACL 2013 BRAINSUP Brainstorming Support for Creative Sentence Generation


Page 1

BRAINSUP: Brainstorming Support for Creative Sentence Generation

Gözde Özbal, Carlo Strapparava
FBK-irst, Trento, Italy

Daniele Pighin
Google Inc., Zürich, Switzerland

ACL 2013

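Page 2

Introduction

In the real world, creative writing is very time-consuming and labor-intensive.

Advertising slogans should be punchy, catchy and memorable.

Earlier work has studied similar problems, but none of it proposed a unified framework.

The authors propose BRAINSUP, an extensible framework in which the user can control all the parameters involved in the creative process, so that the output better fits the user's needs.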

Page 3

Domain | Keywords | BRAINSUP output examples
coffee | waking, cup | Between waking and doing there is a wondrous cup.
coke | drink, exhaustion | The physical exhaustion wants the dark drink.
health | day, juice, sunshine | With juice and cereal the normal day becomes a summer sunshine.
beauty | kiss, lips | Passionate kiss, perfect lips. Lips and eyes want the kiss.
mascara | drama, lash | Lash your drama to the stage. A mighty drama, a biting lash.
pickle | crunch, bite | Crunch your bite to the top. Crunch of a savage bite. A large bite may crunch a little attention.
soap | skin, love, touch | A touch of love is worth a fortune of skin. The touch of froth is the skin of love. A skin of water is worth a touch of love.

Page 4

Architecture of BRAINSUP

First, the user can choose the target words that must appear in the sentence. The user can also choose:

a specific semantic domain: sports, blankets, ...
a specific emotion: joy, anger, or negative emotions
a specific color: red, blue, ...
phonetic properties of the words: rhymes, alliterations and plosives

The user input is U = <t, d, c, e, p, w>.

For target and domain words, the user can specify which parts of speech to consider, e.g. "drink/verb" or "drink/verb, noun".
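The user input can be pictured as a small record type. The sketch below is only an illustration: the class name `UserSpec`, its field names and the helper `parse_word` are hypothetical. The slides do not define `w`; treating it as a table of per-feature weights is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


def parse_word(spec: str) -> tuple[str, frozenset[str]]:
    """Parse 'drink/verb, noun' into ('drink', frozenset({'verb', 'noun'}))."""
    word, _, pos = spec.partition("/")
    tags = frozenset(tag.strip() for tag in pos.split(",") if tag.strip())
    return word.strip(), tags


@dataclass
class UserSpec:
    """Hypothetical container for U = <t, d, c, e, p, w>."""
    target_words: list[tuple[str, frozenset[str]]]                              # t
    domain_words: list[tuple[str, frozenset[str]]] = field(default_factory=list)  # d
    color: Optional[str] = None                                                 # c
    emotion: Optional[str] = None                                               # e
    phonetic: set[str] = field(default_factory=set)                             # p
    weights: dict[str, float] = field(default_factory=dict)                     # w (assumed meaning)


spec = UserSpec(
    target_words=[parse_word("drink/verb, noun")],
    color="red",
    phonetic={"alliteration"},
)
```

An empty set of part-of-speech tags (as in `parse_word("cup")`) is used here to mean "no restriction", which is a design choice of this sketch.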

Page 5

Architecture of BRAINSUP

Pattern selection
Searching the solution space
Filler selection and solution scoring

Page 6

Architecture of BRAINSUP

Algorithm 1 SentenceGeneration(U, Θ, P, L): U is the user specification, Θ is a set of meta-parameters; P and L are two dependency treebanks.
    O ← ∅
    for all p ∈ CompatiblePatterns_Θ(U, P) do
        while NotEnoughSolutions_Θ(O) do
            O ← O ∪ FillInPattern_Θ(U, p, L)
    return SelectBestSolutions(O)

U: the user input <t, d, c, e, p, w>.

Θ: the set of meta-parameters, e.g. the maximum/minimum number of sentences to generate, the maximum number of patterns to consider, the maximum sentence length.

CompatiblePatterns: selects from corpus P the patterns that are frequent and match the user's requirements.

FillInPattern: given the user input U, selects from treebank L the best fillers for pattern p.
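The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the helper steps are passed in as callables, all parameter names are hypothetical, and the `max_attempts` guard is an addition of this sketch (the slides do not say how the inner loop terminates when a pattern yields nothing new).

```python
from typing import Callable, Hashable, Iterable


def generate_sentences(
    spec,
    patterns: Iterable,
    fill_in_pattern: Callable[[object, object], Iterable[Hashable]],
    not_enough: Callable[[set], bool],
    select_best: Callable[[set], list],
    max_attempts: int = 100,
) -> list:
    """Outer loop of Algorithm 1: collect solutions pattern by pattern."""
    solutions: set = set()                      # O <- empty set
    for pattern in patterns:                    # for all p in CompatiblePatterns(U, P)
        attempts = 0
        # while NotEnoughSolutions(O); the attempt cap is an assumption
        while not_enough(solutions) and attempts < max_attempts:
            solutions |= set(fill_in_pattern(spec, pattern))  # O <- O U FillInPattern(U, p, L)
            attempts += 1
    return select_best(solutions)               # return SelectBestSolutions(O)
```

In this form the treebank L and the meta-parameters Θ would be captured inside the callables, for example through closures.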

Page 7

Architecture of BRAINSUP: Pattern selection

Morpho-syntactic patterns are selected from corpus P.

First: choose the corpus. Different corpora produce sentences in different styles.

Second: parse the sentences in the corpus with the Stanford parser, then remove the content words to obtain patterns, and record how many times each pattern occurs in the corpus.
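The pattern-extraction step can be illustrated with a simplified sketch. It assumes the sentences are already tokenized and tagged with Penn Treebank part-of-speech tags (the parsing itself is not shown), it treats every noun, verb, adjective and adverb tag as a content word, and it keeps only part-of-speech constraints, not the dependency relations that the real patterns also carry. The names `to_pattern` and `count_patterns` and the `_/TAG` slot notation are hypothetical.

```python
from collections import Counter

# Assumption: tags with these prefixes mark content words.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def to_pattern(tagged_sentence):
    """Replace each content word by an empty slot that keeps only its POS tag."""
    return tuple(
        f"_/{tag}" if tag.startswith(CONTENT_TAG_PREFIXES) else token.lower()
        for token, tag in tagged_sentence
    )


def count_patterns(tagged_corpus):
    """Count how often each pattern occurs in the corpus."""
    return Counter(to_pattern(sentence) for sentence in tagged_corpus)


corpus = [
    [("A", "DT"), ("touch", "NN"), ("of", "IN"), ("love", "NN")],
    [("A", "DT"), ("skin", "NN"), ("of", "IN"), ("water", "NN")],
]
pattern_counts = count_patterns(corpus)
```

Both toy sentences collapse to the same pattern, so its count is 2.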

Page 8

Architecture of BRAINSUP: Pattern selection

Can the target words chosen by the user be placed in the empty slots?

t = [heading/VBG, edge/NN]: ✗ (not compatible)

t = [heading/NN, edge/NN]: ✓ (compatible)

Page 9

Architecture of BRAINSUP: Pattern selection

The number of empty slots must be greater than the number of target words: CompatiblePatterns(.) requires slots > |t|. The maximum/minimum number of slots is controlled in Θ. In addition, to avoid producing identical results for identical inputs, a random component is added to the sort algorithm (also controlled in Θ).

Finally, CompatiblePatterns(.) returns the patterns ordered by their number of occurrences.
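A rough sketch of this filtering and ordering step is shown below. It assumes patterns are tuples in which empty slots are written `_/TAG`, and it models the random component as multiplicative jitter on the frequency, which is only one possible reading of the slide. The part-of-speech compatibility check between target words and slots is omitted. All names and default values are hypothetical.

```python
import random


def count_slots(pattern):
    """Number of empty slots in a pattern (slots are written '_/TAG')."""
    return sum(1 for item in pattern if item.startswith("_/"))


def compatible_patterns(pattern_counts, n_targets, min_slots=1, max_slots=6,
                        jitter=0.0, rng=None):
    """Keep patterns with more slots than target words, most frequent first."""
    rng = rng or random.Random()
    candidates = [
        (pattern, count)
        for pattern, count in pattern_counts.items()
        if count_slots(pattern) > n_targets
        and min_slots <= count_slots(pattern) <= max_slots
    ]

    def noisy_frequency(item):
        _, count = item
        # jitter = 0 gives a deterministic frequency ordering
        return count * (1.0 + rng.uniform(-jitter, jitter))

    ranked = sorted(candidates, key=noisy_frequency, reverse=True)
    return [pattern for pattern, _ in ranked]
```

With `jitter=0.0` the result depends only on the counts; a small positive value lets near-ties swap places between runs.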

Page 10

Architecture of BRAINSUP: Searching the solution space

Once the patterns have been selected, the next step is to choose which words to fill into each empty slot, starting from the slot with the largest number of dependencies.

A pattern contains only stop words, syntactic relations and morphological constraints (POS tags).

Page 11

Architecture of BRAINSUP: Searching the solution space

A large corpus L of parsed sentences is analyzed, and the number of occurrences of each head-relation-modifier (<h, r, m>) dependency relation is recorded (operator τ_r(h)).

Examples (h = head, m = modifier): τ_amod(smoke), τ⁻¹_nsubj(fires)
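One way to store these counts is a pair of lookup tables, one indexed by head and one by modifier. The sketch below assumes that τ_r(h) returns the modifiers seen with head h under relation r, and that τ⁻¹_r(m) returns the heads seen with modifier m; this reading is inferred from the notation and the examples, not spelled out on the slide. The class name and the toy triples are made up for illustration.

```python
from collections import Counter, defaultdict


class DependencyCounts:
    """Occurrence counts of <head, relation, modifier> triples."""

    def __init__(self, triples):
        self._by_head = defaultdict(Counter)
        self._by_modifier = defaultdict(Counter)
        for head, relation, modifier in triples:
            self._by_head[(relation, head)][modifier] += 1
            self._by_modifier[(relation, modifier)][head] += 1

    def tau(self, relation, head):
        """Modifiers observed with `head` under `relation`, with counts."""
        return Counter(self._by_head.get((relation, head), {}))

    def tau_inv(self, relation, modifier):
        """Heads observed with `modifier` under `relation`, with counts."""
        return Counter(self._by_modifier.get((relation, modifier), {}))


# Toy data, not taken from any real treebank.
counts = DependencyCounts([
    ("smoke", "amod", "thick"),
    ("smoke", "amod", "thick"),
    ("smoke", "amod", "black"),
    ("burn", "nsubj", "fires"),
    ("rage", "nsubj", "fires"),
])
```

Candidate fillers for a slot could then be read off `counts.tau("amod", "smoke")` or `counts.tau_inv("nsubj", "fires")`.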

Page 12

Architecture of BRAINSUP: Searching the solution space

τ⁻¹_dobj(smoke), τ⁻¹_nsubj(fires), τ⁻¹_prep(in)

Page 13

Architecture of BRAINSUP: Filler selection and solution scoring

Once the lists of candidate words are available, the next step is to choose the fillers that score highest and match the user's requirements.

Algorithm 2 RankCandidates(U, f, c1, c2, s, X): c1 and c2 are two candidate fillers for the slot X in the sentence s = [s0, ..., sn]; f is the set of feature functions; U is the user specification.
    s_c1 ← s, s_c2 ← s, s_c1[X] ← c1, s_c2[X] ← c2
    for all f ∈ SortFeatureFunctions(U, f) do
        if f(s_c1, U) > f(s_c2, U) then
            return c1 > c2
        else if f(s_c1, U) < f(s_c2, U) then
            return c1 < c2
    return c1 ≡ c2
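Algorithm 2 is a lexicographic comparison: the first feature function that separates the two candidates decides. A minimal Python sketch follows, assuming the feature functions are already in priority order and each takes `(sentence, spec)` and returns a number. The function names and the `sort_fillers` wrapper are hypothetical.

```python
from functools import cmp_to_key


def rank_candidates(spec, features, c1, c2, sentence, slot):
    """Return 1 if c1 ranks above c2, -1 if below, 0 if no feature separates them."""
    s1 = list(sentence)
    s2 = list(sentence)
    s1[slot] = c1          # s_c1[X] <- c1
    s2[slot] = c2          # s_c2[X] <- c2
    for feature in features:            # assumed sorted by priority
        v1, v2 = feature(s1, spec), feature(s2, spec)
        if v1 > v2:
            return 1
        if v1 < v2:
            return -1
    return 0


def sort_fillers(spec, features, candidates, sentence, slot):
    """Order candidate fillers for one slot, best first."""
    key = cmp_to_key(
        lambda a, b: rank_candidates(spec, features, a, b, sentence, slot)
    )
    return sorted(candidates, key=key, reverse=True)
```

Because later features are consulted only on ties, a lower-priority feature can never override a higher-priority one.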

Page 14

Architecture of BRAINSUP: Filler selection and solution scoring

12 feature functions:

Chromatic and emotional connotation: c is the color selected by the user, and s_i is the i-th word of the sentence.

Domain relatedness: d is the domain selected by the user, and s_i is the i-th word of the sentence.

Page 15

Architecture of BRAINSUP: Filler selection and solution scoring

Semantic cohesion: the same as domain relatedness, with the domain d replaced by the target words t.

Target-words scorer: forces the target words t to appear in the sentence.

Page 16

Architecture of BRAINSUP: Filler selection and solution scoring

Phonetic features (plosives, alliteration and rhyme):

plosives: the proportion of plosives in a sentence.
alliteration: recorded with a trie, where c_i is the number of times node i is traversed.
rhyme: the same as alliteration, except that each word is reversed before it is inserted into the trie.
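The trie bookkeeping can be sketched as follows. This is an approximation under stated assumptions: it works on letters rather than phonemes, it treats the letters p, b, t, d, k and g as plosives, and it only reports which trie paths are shared by more than one word. The slides do not give the final scoring formulas, so none is attempted here; all names are hypothetical.

```python
# Assumption: these letters stand in for plosive sounds.
PLOSIVE_LETTERS = set("pbtdkg")


def plosive_ratio(words):
    """Share of letters in the sentence that are (assumed) plosives."""
    letters = [ch for word in words for ch in word.lower() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in PLOSIVE_LETTERS for ch in letters) / len(letters)


def build_trie(words, reverse=False):
    """Insert words into a trie; each node counts how often it is traversed."""
    root = {"count": 0, "children": {}}
    for word in words:
        chars = word.lower()[::-1] if reverse else word.lower()
        node = root
        for ch in chars:
            node = node["children"].setdefault(ch, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def shared_paths(trie, prefix=""):
    """Map each path traversed more than once to its traversal count."""
    found = {}
    for ch, child in trie["children"].items():
        if child["count"] > 1:
            found[prefix + ch] = child["count"]
            found.update(shared_paths(child, prefix + ch))
    return found


alliteration = shared_paths(build_trie(["the", "dark", "drink"]))
rhyme = shared_paths(build_trie(["kiss", "lips"], reverse=True))
```

For the rhyme case the keys are reversed suffixes, since the words are reversed before insertion.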

Page 17

Architecture of BRAINSUP: Filler selection and solution scoring

Variety scorer: calculated as the number of distinct words in the sentence over the size of the sentence.

Unusual-words scorer: c_i is the number of times each word s_i ∈ s is observed in another corpus V.
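The variety scorer follows directly from its definition. For the unusual-words scorer the slide only defines the counts c_i, so the sketch below stops at looking them up and does not guess how they are combined into a score. Lowercasing the words and the function names are assumptions of this sketch.

```python
def variety_score(words):
    """Number of distinct words divided by the sentence length."""
    if not words:
        return 0.0
    lowered = [word.lower() for word in words]
    return len(set(lowered)) / len(lowered)


def reference_counts(words, corpus_counts):
    """Look up c_i, the count of each word s_i in a reference corpus V."""
    return [corpus_counts.get(word.lower(), 0) for word in words]


sentence = "A touch of love is worth a fortune of skin".split()
score = variety_score(sentence)   # 8 distinct words out of 10
```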

Page 18

Architecture of BRAINSUP: Filler selection and solution scoring

N-gram likelihood

Dependency likelihood

Page 19

Evaluation

Five experienced annotators were asked to rate 432 creative sentences:

1) Catchiness: is the sentence attractive, catchy or memorable? [Yes/No]
2) Humor: is the sentence witty or humorous? [Yes/No]
3) Relatedness: is the sentence semantically related to the target domain? [Yes/No]
4) Correctness: is the sentence grammatically correct? [Ungrammatical/Slightly disfluent/Fluent]
5) Success: could the sentence be a good slogan for the target domain? [As it is/With minor editing/No]

Page 20

Evaluation

Randomly selected a subset of these slogans and for each of them generated an input specification U.

Page 21

Evaluation

t: 2 to 3 words chosen at random from the slogan
d: the commercial domain
e: positive
c: a color strongly associated with the domain if one exists, otherwise a randomly chosen color

This yields 10 tuples <t, d, c, e, p>, each of which is combined with 5 different feature combinations.

Page 22

Evaluation

Base: Target-word scorer + N-gram likelihood + Dependency likelihood + Variety scorer + Unusual-words scorer + Semantic cohesion
Base + D: Base + Domain relatedness
Base + D + C: Base + D + Chromatic connotation
Base + D + E: Base + D + Emotional connotation
Base + D + P: Base + D + Phonetic features

Up to 10 sentences were generated for each of the 50 inputs, giving 432 sentences in total.

Page 23

Evaluation

Weights (set heuristically):

Target-word scorer: 1.0
Variety and Unusual-words scorers: 0.99
Phonetic features, Chromatic/Emotional connotation and Semantic cohesion scorers: 0.98
Domain, N-gram and Dependency likelihood scorers: 0.97

Patterns: a corpus of 16,000 proverbs
Dependency operators: British National Corpus

Only sentences of at most 20 words in which every word can be found in WordNet are considered.
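The settings above can be written down as a small configuration. Two points in the sketch are assumptions rather than facts from the slides: the phonetic features are expanded into three entries (plosives, alliteration, rhyme) so that the table has 12 feature functions, and the weights are used to order the features by priority. The key names, the alphabetical tie-break and the `vocabulary` argument (a plain set standing in for a WordNet lookup) are hypothetical.

```python
FEATURE_WEIGHTS = {
    "target_words": 1.0,
    "variety": 0.99,
    "unusual_words": 0.99,
    "plosives": 0.98,
    "alliteration": 0.98,
    "rhyme": 0.98,
    "chromatic_connotation": 0.98,
    "emotional_connotation": 0.98,
    "semantic_cohesion": 0.98,
    "domain_relatedness": 0.97,
    "ngram_likelihood": 0.97,
    "dependency_likelihood": 0.97,
}


def sort_feature_names(weights):
    """Order features by decreasing weight; ties are broken alphabetically."""
    return sorted(weights, key=lambda name: (-weights[name], name))


def keep_sentence(words, vocabulary, max_len=20):
    """Keep sentences of at most max_len words whose words are all in the vocabulary."""
    return len(words) <= max_len and all(w.lower() in vocabulary for w in words)


priority = sort_feature_names(FEATURE_WEIGHTS)
```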

Page 24

Evaluation - result

Page 25

Evaluation - result

Page 26

Evaluation - result

In 63 cases the sentence was marked YES on every dimension; the examples in Table 1 are taken from these. Besides correctness, many rhetorical devices can be observed:

metaphor: a summer sunshine
pun: lash your drama
personification: lips and eyes want

Use of phonetic properties:

plosives: passionate kiss, perfect lips
alliteration: the dark drink
rhyme: lips and eyes want the kiss

Page 27

Domain | Keywords | BRAINSUP output examples
pleasure | wine, tasting | A pleasant tasting, a heady wine. A fruity tasting may drink a sparkling wine.
coke | calorie, taste, good | A sixth calorie may taste an own good.
healthy | day, juice, sunshine | Drink juice of your sunshine, and your weight will choose day of you. A same sunshine is fewer than a juice of day.
cigarette | doctors, smoke | Unscrupulous doctors smoke armored units. Doctors smoke no arrow.
mascara | drama, lash | The such drama is the lash.
coffee | waking, cup | You cannot cup hands without waking some fats.
soap | skin, love, touch | The touch of skin is the love of cacophony. You love an own skin for a first touch.

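Page 28

Conclusion

The paper proposes BRAINSUP, an extensible framework whose parameters users can define according to their own needs.

The system makes heavy use of dependency-parsed data to ensure that the generated sentences are grammatical.

Although the generated sentences may not fully meet the user's needs, they can at least serve as inspiration for the user.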

Page 29

Conclusion

It is wiser to believe in science than in everlasting love.