Weekly Report Semantic Web Research Center 진두현 2011-2-25


Contents

Progress
Work flow
    Work environment
    Work flow
Experiment
    Time measured
    Problems
The instructions
    Instructions
Plan

Progress

1. Selecting vocabulary (~ 2/11)
2. Extracting sentences (~ 2/18)
3. Workflow modeling (~ 2/24)
    Calculating time duration
    Making instructions
4. Developing tools
5. Constructing
6. Evaluation

Work environment

Tools: CoreNet Browser, text editor, spreadsheet
Checking sentences in the text editor, searching concepts in the CoreNet Browser, and typing case frames into the spreadsheet.

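Work flow

A workflow for one headword:

1. Select a headword.
2. Determine the word sense.
3. Read a sentence from a file.
4. Work on the sentence:
    a. Parsing: partial parsing, grammaticality check, and assigning NPs (distinguishing complements from adjuncts).
    b. Comparing the NPs to the NPs the annotator has in mind.
    c. Listing a case frame, discarding the sentence, or listing it under another category.
    d. Assigning CoreNet concepts to the arguments: search CoreNet with a word, find the appropriate concept, and insert the information.

Example:

Select the word "뭉치다".
For the headword "뭉치다", decide the word sense for which case frames will be built: "뭉치다" (결합, "combining").
Sentence: '온몸에 솜털이 나고 아주 작은 꽃들이 뭉쳐서 피지.' ("It grows downy hair all over, and very small flowers bloom in clusters.")
The annotator partially parses the sentence mentally: NP_subject(꽃, "flower") + Verb(뭉치다).
The annotator's selectional restriction: (something that is not an animate being such as a human) + 뭉치다.
(꽃)이 + 뭉치다: matches the restriction.
The three branches at step c: matches the restriction; no arguments or an ungrammatical sentence; matches a different word sense.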
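The branching at step c can be sketched in a few lines of Python. This is only an illustration of the manual decision, not a tool used in the work: the names `Outcome` and `decide` are hypothetical, the three flags stand for the annotator's own judgments, and treating "does not match the restriction" as "belongs to another word sense" is a simplifying assumption.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results of step c for one sentence (hypothetical names)."""
    CASE_FRAME = "list a case frame"
    DISCARD = "discard"
    OTHER_SENSE = "list under another category"


def decide(has_arguments: bool, grammatical: bool, matches_restriction: bool) -> Outcome:
    """Map the annotator's three judgments to an outcome of step c."""
    # No arguments, or an ungrammatical sentence: the sentence is discarded.
    if not has_arguments or not grammatical:
        return Outcome.DISCARD
    # The NP satisfies the selectional restriction of the target word sense.
    if matches_restriction:
        return Outcome.CASE_FRAME
    # Assumption: anything else is filed under a different word sense.
    return Outcome.OTHER_SENSE


# The example sentence: 꽃 is an explicit subject, the sentence is grammatical,
# and 꽃 is not an animate being, so the restriction is satisfied.
print(decide(has_arguments=True, grammatical=True, matches_restriction=True).value)
```

For the example sentence this prints "list a case frame".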

Experiment

5 headwords (7 word senses): "끼우다", "묶다", "뭉치다 (intransitive, transitive)", "통틀다", "합하다 (intransitive, transitive)"

All words belong to the concept "결합" in CoreNet.
Time duration was measured.
The instructions were written with a supervisor (Dr. Lee).
The coherence of the constructed data is not guaranteed.

Time measured

Word     Test sentences   Target sentences (WS match)   Not constructed   Extracted   Discarded   Case frames   Time duration   Time per case frame (min)
끼우다   503              110                           0                 86          24          86            136 min         1.58
묶다     976              367                           287               70          10          70            100 min         1.42
뭉치다   295              136                           0                 98          31          98            60 min          0.61
통틀다   145              145                           0                 109         36          112           53 min          0.47
합하다   158              158                           0                 115         43          118           84 min          0.71
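The last column appears to be the time duration divided by the number of case frames, in minutes. The short sketch below recomputes it from the figures in the table under that assumption; the function name `minutes_per_case_frame` is made up for illustration.

```python
# (word, case frames, time duration in minutes), copied from the table.
rows = [
    ("끼우다", 86, 136),
    ("묶다", 70, 100),
    ("뭉치다", 98, 60),
    ("통틀다", 112, 53),
    ("합하다", 118, 84),
]


def minutes_per_case_frame(minutes: int, case_frames: int) -> float:
    """Average annotation time for one case frame, in minutes."""
    return minutes / case_frames


for word, case_frames, minutes in rows:
    rate = minutes_per_case_frame(minutes, case_frames)
    print(f"{word}: {rate:.2f} min per case frame")
```

The recomputed values agree with the table except for 묶다, where 100 / 70 rounds to 1.43 while the table shows 1.42, which suggests the reported figure was truncated and not rounded.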

Problems

Much of the time was wasted on:
    typing a word into the CoreNet Browser (switching from mouse to keyboard),
    finding the spreadsheet, and
    typing and checking the argument information.

The instructions

1. Discarding useless sentences
2. Syntactic parsing
3. Selecting arguments into case frame

1. Discarding useless sentences

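1. The sentence has no explicit arguments, e.g. 끼워팔기 ("tie-in sale") ..
2. The sentence is ungrammatical: heavily colloquial expressions.
3. No arguments can be found, for one of the reasons below:
    No concept can be found:
        Only "numeral + classifier" appears: 5개를 뭉쳐서 ("lumping five together")
        Bound nouns with no semantic category or an ambiguous one: 짱 박은 것까지 합해서 ("adding in even what was stashed away")
        Nominalized clause forms: ~ㅁ, ~기
        Quoted material whose semantic category cannot be determined: '너에게'를 끼워 넣어 ("inserting '너에게' ('to you')")
    The word cannot be found anywhere: 삼적(?)을 합하여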

2. Syntactic parsing

Omitted subject
    Many verbs in the text have no explicit subject.
    Solution: find the subject in the context and set the most probable concept; if there is no clue, set the sentence aside.

Noun + numeral + classifier: 나무 한 그루 ("one tree")
    The noun is the argument (head).

Relative clause form: 뭉친 거품 ("clumped foam")
    The subject of '뭉치다' is '거품'.

Passive problem with "Verb + '지다'"
    There is no absolute measure: 먹어지다.
    Follow the context.

3. Selecting arguments into case frame

Coordinated NP
    Make a case frame for each argument.
    Ex> "골프(스포츠)와 승마(스포츠)를 끼워" (골프 "golf", 승마 "horse riding", concept 스포츠 "sport")
        Case frame 1: 끼우 ; arg1: 골프 (스포츠)
        Case frame 2: 끼우 ; arg1: 승마 (스포츠)

Unknown words
    Proper noun: "중국과 동남아국가연합을 묶는…" (중국 "China", 동남아국가연합 "ASEAN")
        Make a case frame but exclude the 'word form' from it.
        Make a proper noun list.
    Unknown in CoreNet but familiar: "펀드는 …" (펀드 "fund")
        Make a new word list and assign the word to an appropriate concept.
        Make a case frame.
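The coordinated-NP rule can be illustrated with a small Python sketch. The `CaseFrame` record and the `expand_coordinated` helper are hypothetical names introduced here; the slides do not specify how case frames are stored, so the field layout is an assumption based on the "Case frame 1 / Case frame 2" example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CaseFrame:
    """One case frame with a single argument slot (assumed layout)."""
    predicate: str
    arg1_word: str
    arg1_concept: str


def expand_coordinated(predicate: str, conjuncts: List[Tuple[str, str]]) -> List[CaseFrame]:
    """Make one case frame per conjunct of a coordinated NP."""
    return [CaseFrame(predicate, word, concept) for word, concept in conjuncts]


# "골프(스포츠)와 승마(스포츠)를 끼워": two conjuncts, both under the concept 스포츠.
frames = expand_coordinated("끼우", [("골프", "스포츠"), ("승마", "스포츠")])
for i, frame in enumerate(frames, start=1):
    print(f"Case frame {i}: {frame.predicate} ; arg1: {frame.arg1_word} ({frame.arg1_concept})")
```

Under the proper-noun rule the same record could be written with the word form left out, but that variant is not shown here.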

Plan

1. Selecting vocabulary (~ 2/11)
2. Extracting sentences (~ 2/18)
3. Workflow modeling (~ 2/24)
    Calculating time duration
    Making instructions
//up to the comments..
4. Developing tools (~ 3/4)
5. Constructing (3/?? ~ )
6. Evaluation