CONCORDANCE LINE AND CORPUS LINGUISTICS

윤 언 근

COLLOCATION AND CONCORDANCE LINES

Example: sentence meaning determined by a preposition

Example: meaning determined by the constituent words

To throw your lunch into the sink

To throw up your lunch into the sink

A green phone is a phone that is green

COLLOCATION AND CONCORDANCE LINES

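Collocation: words that occur together more often than chance would predict

Ex) throw up (O), blue moon (O), green phone (X)

P(w1 w2) > P(w1) P(w2): the probability of the words occurring together as a collocation exceeds the probability expected if the two words were independent

The words of a collocation can be separated

He threw his hands up into the air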
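The probability criterion above can be sketched in Python. This is a minimal illustration, not the method used in the slides: the function name `collocation_ratio` and the toy sentence are invented here, and the probabilities are plain relative frequencies from a single token list.

```python
from collections import Counter

def collocation_ratio(tokens, w1, w2):
    """Return P(w1 w2) / (P(w1) * P(w2)), estimated from raw counts.

    A ratio well above 1 suggests the pair occurs together more often
    than two independent words would.
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent pairs only
    p_w1 = unigrams[w1] / n
    p_w2 = unigrams[w2] / n
    if p_w1 == 0 or p_w2 == 0:
        return 0.0
    p_pair = bigrams[(w1, w2)] / (n - 1)
    return p_pair / (p_w1 * p_w2)

# Toy text, invented for illustration only.
tokens = "he had to throw up and then throw the ball up the hill".split()
ratio = collocation_ratio(tokens, "throw", "up")
print(f"throw up: {ratio:.2f}")
```

Because only adjacent pairs are counted, a separated use such as "throw the ball up" is not counted as a co-occurrence; handling separated collocations would need a wider window.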

MORE WAYS TO SORT CONCORDANCE LINES

Sentence processing steps

Ex) $radius = 20

Set the value of $radius

removePunctuation

split the result

pick the last word

Extent of $radius

MORE WAYS TO SORT CONCORDANCE LINES

Word[-1] is the last entry

In this experiment, Ordinal is set to 1
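The processing steps above (take a context of $radius characters, remove punctuation, split, pick the last word) can be sketched as follows. This is a Python approximation under assumptions, not the code behind the slides: the function names `concordance` and `word_before` and the sample sentence are hypothetical, and `ordinal=1` selects the word immediately before the match, mirroring Word[-1].

```python
import re
import string

def concordance(text, target, radius=20):
    """Return (left, match, right) context strings for each hit of target."""
    pattern = r"\b" + re.escape(target) + r"\b"
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - radius):m.start()]
        right = text[m.end():m.end() + radius]
        lines.append((left, m.group(0), right))
    return lines

def word_before(left, ordinal=1):
    """Strip punctuation, split on whitespace, pick the ordinal-th word from the end."""
    cleaned = left.translate(str.maketrans("", "", string.punctuation))
    words = cleaned.lower().split()
    return words[-ordinal] if len(words) >= ordinal else ""

# Sample sentence invented for illustration.
sample = "Buck sprang up. The dog looked up, and the man stood up."
lines = sorted(concordance(sample, "up"), key=lambda line: word_before(line[0]))
for left, match, right in lines:
    print(f"{left:>20} [{match}] {right}")
```

Sorting on the word before the match groups lines such as "sprang up" and "stood up" by their verb, which is what makes phrasal verbs easy to spot in the output.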

RESULT OF MORE WAYS TO SORT CONCORDANCE LINES & APPLICATION: PHRASAL VERBS

Ten concordance lines from The Call of the Wild

APPLICATION: PHRASAL VERBS IN THE CALL OF THE WILD

The wrong result

APPLICATION: PHRASAL VERBS

Is "sprang upon" a collocation?

Ex) The Call of the Wild

Ex) Frankenstein

GROUPING WORDS: COLORS IN THE CALL OF THE WILD

MULTIVARIATE TECHNIQUES WITH TEXT

BASIC STATISTICS

Sample mean

Sample variance: a measure of the variability of the data. The sum of squared deviations from the mean is divided by n-1. It is zero when every data value is the same.
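A short Python sketch of the two statistics, assuming the usual definitions (mean as the sum divided by n, variance as the sum of squared deviations divided by n-1). The word lengths are made-up values, not data from the slides.

```python
def sample_mean(values):
    """Arithmetic mean of the data."""
    return sum(values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Invented word lengths, for illustration only.
word_lengths = [3, 4, 4, 5, 9]
print(sample_mean(word_lengths), sample_variance(word_lengths))

# When every value is the same, the variance is zero.
print(sample_variance([4, 4, 4]))
```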

Z-SCORES APPLIED TO POE

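Z-score: a way to compute how a data value compares to a data set

z = (x - μ) / σ

x: raw value

μ: population mean

σ: population standard deviation (the square root of the variance)

Result

File name, length, mean, variance, standard deviation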

Z-SCORES APPLIED TO POE

Ex) z-score of a four-letter word

z = (4 - 4.305) / 2.477 = -0.123

Ex) z-score of a three-letter word

z = (3 - 4.305) / 2.477 = -0.527

Bell-shaped
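The arithmetic can be checked with a few lines of Python. The mean 4.305 and standard deviation 2.477 are the values quoted on the slide; the function name `z_score` is only illustrative. Both word lengths lie below the mean, so both z-scores come out negative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

MEAN_LENGTH = 4.305  # mean word length quoted on the slide
SD_LENGTH = 2.477    # standard deviation quoted on the slide

z4 = z_score(4, MEAN_LENGTH, SD_LENGTH)
z3 = z_score(3, MEAN_LENGTH, SD_LENGTH)
print(round(z4, 3), round(z3, 3))
```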

WORD CORRELATION AMONG POE’S SHORT STORIES

How should a physicist compare masses and times?

Solution: first, convert both variables to z-scores; second, compute the sample correlation coefficient.

The sample correlation coefficient r is the mean of the products of the paired z-scores.

r = 1: positive slope; r = -1: negative slope; r = 0: no linear relationship
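The two-step recipe can be sketched in Python. It assumes the sample standard deviation (divisor n-1) for the z-scores and the same n-1 divisor for the "mean" of the products, which is the convention that makes r exactly 1 or -1 for perfectly linear data. The data are invented.

```python
import math

def z_scores(values):
    """Convert values to z-scores using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def correlation(xs, ys):
    """Sample correlation: sum of products of paired z-scores over n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Invented data: ys rises with xs, ys_neg falls with xs.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
ys_neg = [8, 6, 4, 2]
print(correlation(xs, ys), correlation(xs, ys_neg))
```

Because both variables are standardized first, their original units (such as masses and times) no longer matter.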

The correlation result

CORRELATIONS AND COSINES

If X and Y both have mean equal to zero, then equation 7.9 holds true

Computing the cosine

Correlation of the columns of the term-document matrix
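The link between the two quantities in this slide's title can be checked numerically: for vectors whose mean is zero, the cosine of the angle between them coincides with the Pearson correlation. The sketch below uses two invented columns of counts, not the term-document data from the slides, and the function names are illustrative.

```python
import math

def cosine(xs, ys):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(xs, ys))
    norm_x = math.sqrt(sum(a * a for a in xs))
    norm_y = math.sqrt(sum(b * b for b in ys))
    return dot / (norm_x * norm_y)

def center(values):
    """Subtract the mean so the result has mean zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pearson(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(xs)
    cx, cy = center(xs), center(ys)
    cov = sum(a * b for a, b in zip(cx, cy)) / (n - 1)
    sd_x = math.sqrt(sum(a * a for a in cx) / (n - 1))
    sd_y = math.sqrt(sum(b * b for b in cy) / (n - 1))
    return cov / (sd_x * sd_y)

# Invented word counts for two documents (two columns of a small matrix).
doc_a = [2, 0, 1, 3]
doc_b = [1, 1, 0, 4]
print(cosine(center(doc_a), center(doc_b)), pearson(doc_a, doc_b))
print(cosine(doc_a, doc_b))  # uncentered cosine generally differs
```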

CORRELATIONS AND COVARIANCES

Covariance is built from the products of the deviations of X and Y from their means.

If Y = X, the covariance is the same as the variance.

Experiment result
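A small Python check of the Y = X case, assuming the sample covariance with divisor n-1 (the slides do not state the divisor). The data are invented.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def variance(xs):
    """Sample variance: sum of squared deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    return sum((x - mean_x) ** 2 for x in xs) / (n - 1)

# Invented data, for illustration only.
xs = [3, 4, 4, 5, 9]
ys = [1, 0, 2, 2, 5]
print(covariance(xs, xs), variance(xs))  # the two values agree
print(covariance(xs, ys))
```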
