A study of inflectional morpheme development in English-speaking children using CHILDES Corpus

Myung Sook Min1, Sun-Young Lee2, Jong-Sup Jun1

1 Hankuk University of Foreign Language & 2 Cyber Hankuk University of Foreign Language

26 JUNE, 2013

The International Conference on Corpus Linguistics CORPORA-2013

Research Goal

Using the CHILDES (Child Language Data Exchange System) database, this study investigated the order of acquisition of inflectional morphemes and the overregularization found in English-speaking children’s L1 acquisition.

1. Introduction

Background
Children’s L1 development proceeds by regularizing the linguistic knowledge acquired through diverse input from caregivers.
In English, language development can be measured by the use of inflectional morphemes such as -ing and -(e)d.
Brown (1973) proposed the mean order of acquisition of 14 morphemes, and Marcus et al. (1992) confirmed the U-shape development in the acquisition of the English irregular past tense.

Research purpose
Using the whole CHILDES database, this study tests the findings of previous studies on inflectional morpheme development in child language, which examined only a limited number of subjects.

2. Literature Review

2.1 Acquisition order of inflectional morphemes

Berko (1958)
Studied the acquisition of morphemes in 4-7 year old American children using the WUG Test, which investigates children’s ability to apply inflectional morphemes to nonsense words.
Order of acquisition of inflectional morphemes:
(1) Present progressive (-ing)
(2) Past regular (-ed)
(3) Third person regular (-s)
(4) Possessive (-’s)

Brown (1973)
Studied the acquisition of grammatical morphemes by analyzing the spontaneous utterances produced by 3 children.
Order of acquisition of inflectional morphemes:
(1) Present progressive
(2) Plural
(3) Past irregular
(4) Possessive
(5) Past regular
(6) Third person regular
(7) Third person irregular

2.2 Overregularization

Marcus et al. (1992): past tense (-ed)
Studied the overregularization of the past tense morpheme in the spontaneous utterances produced by 83 subjects.
The overregularization rate was not high, but the tendency existed.
Overregularization errors were found from the age of 2 until the beginning of school age.
U-shape development was confirmed.

Kuczaj (1977): present progressive (-ing)
Studied the overregularization of the present progressive morpheme in the spontaneous utterances produced by 15 subjects.
Overregularization was rarely found.
Claimed that this is because irregular verbs have no irregular present progressive form.

2.3 Research questions

Limitation of previous studies
The results of previous studies are insufficient for generalizing about children’s language development because of the limited number of participants.

Research questions
1) Do children apply the inflectional morphemes to diverse verbs as they get older?
2) Are overregularization errors found? And is the U-shape developmental pattern found in children’s language acquisition?
3) Related to questions 1-2 above, is there a difference between the UK and the USA children’s language development? If so, is it due to mothers’ input?

2.4 Research Method

Method
The number of inflectional word types, their token frequency, the type-token ratio (TTR), and D, a measure of lexical diversity, were calculated to trace the development of inflectional morphemes by age.

D indicates lexical diversity estimated from randomly selected samples. The higher D is, the more diverse the words to which the children apply the inflectional morphemes.

D was calculated with the VocD command in CLAN on CHILDES Corpus texts of different lengths.
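The two measures can be sketched in a few lines. This is a simplified illustration, not CLAN's implementation: `ttr` is the plain type-token ratio, and `estimate_d` fits the curve TTR = (D/N)(sqrt(1 + 2N/D) - 1) described by Malvern et al. (2004) to the mean TTR of random samples. The sample sizes (35 to 50 tokens), the 100 trials per size, the grid search, and all function names are assumptions made for this sketch.

```python
import math
import random


def ttr(tokens):
    """Type-token ratio: number of distinct forms divided by number of tokens."""
    return len(set(tokens)) / len(tokens)


def model_ttr(d, n):
    """TTR predicted by the D curve for diversity d and sample size n."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


def estimate_d(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Estimate D by fitting model_ttr to empirical mean TTRs (assumed settings).

    Requires at least max(sizes) tokens, because samples are drawn
    without replacement.
    """
    rng = random.Random(seed)
    # Empirical curve: mean TTR over random samples at each sample size.
    mean_ttr = {
        n: sum(ttr(rng.sample(tokens, n)) for _ in range(trials)) / trials
        for n in sizes
    }

    def squared_error(d):
        return sum((model_ttr(d, n) - t) ** 2 for n, t in mean_ttr.items())

    # Coarse grid search over d = 0.1 .. 200.0; a real tool would refine this.
    candidates = [x / 10 for x in range(1, 2001)]
    return min(candidates, key=squared_error)


if __name__ == "__main__":
    repetitive = ["go", "see"] * 50                   # 2 types, 100 tokens
    varied = [f"w{i % 60}" for i in range(120)]       # 60 types, 120 tokens
    print(ttr(repetitive), estimate_d(repetitive))
    print(ttr(varied), estimate_d(varied))
```

Because the curve is fitted across several sample sizes, the estimate depends less on text length than raw TTR does, which is why the study reports D alongside TTR.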

3. Corpus Study

3.1 CHILDES Corpus

The CHILDES Corpus is one of the most frequently used resources for research on language acquisition and on the influence of caregivers’ input.

We rearranged the entire CHILDES Corpus for ease of analysis and focused on the data from ages 1 to 7, which account for 97% of the entire CHILDES Corpus.

This produced 7,841 files: 2,272 files from 275 UK children and 5,569 files from 1,355 USA children.

Using the FREQ command in CLAN, we extracted 35,130 word types with 1,937,624 tokens from the UK children and 63,705 word types with 2,771,312 tokens from the USA children.

3.3 Analysis

First, the 4,700,000 words were classified by regular inflectional morphemes such as -(e)d; irregular inflected forms such as ‘wore’ were then extracted and merged with the regular inflected words.
(1) Present progressive (-ing)
(2) Regular and irregular past tense (-(e)d, irr)
(3) Comparative and superlative (-er, -est, irr)
(4) Third person singular present / plural (-(e)s, irr)
(5) Possessive singular and plural (-’s, -s’)
(6) Pronoun

Type, token, and TTR were calculated with the FREQ command in CLAN.
- Command: freq +t*CHI +u +f @ file

D was calculated with the VocD command in CLAN.
- Command: vocd +t"CHI" +r6 +s"@C:\CHILDES\CLAN\lib\17133_ ed_d_irr_2556.cut" +u +f @ file
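The classification step can be illustrated with a toy sketch. This is not the authors' procedure, which used CLAN with prepared word lists. It only shows the general idea of checking an irregular-form list first and falling back to suffix matching. The category labels, the three-word irregular list, and the length thresholds are all hypothetical.

```python
from collections import Counter

# Illustrative subset only; a real analysis needs a full list of irregular forms.
IRREGULAR_PAST = {"went", "fell", "wore"}


def classify(word):
    """Assign a word form to one category: irregular lookup first, then suffix."""
    w = word.lower()
    if w in IRREGULAR_PAST:
        return "past (irregular)"
    if w.endswith(("'s", "s'")):
        return "possessive"
    if w.endswith("ing") and len(w) > 4:
        return "present progressive"
    if w.endswith("ed") and len(w) > 3:
        return "past (regular)"
    return "other"


def tally(tokens):
    """Count tokens per category."""
    return Counter(classify(t) for t in tokens)


if __name__ == "__main__":
    print(tally(["went", "walked", "falling", "mummy's", "dog", "goed"]))
```

Pure suffix matching over-generates (for example, it would count "thing" as a progressive), so a word list of attested inflected forms, like the .cut file passed to vocd above, is the safer basis for real counts.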

3.4. Results

In total, 13,528 inflectional word types and 1,221,916 tokens were extracted.

TTR and D of inflectional morphemes by country

Inflectional      UK                                  USA
morphemes         Type   Token    TTR    D            Type   Token    TTR    D
-ing              1,229  39,759   0.031  33.26        1,084  47,458   0.023  27.38
-d_ed_irr(V)      1,006  82,474   0.012  10.78        1,472  132,405  0.011  19.23
-er_-est_irr(A)   217    11,499   0.008  0.77         198    13,978   0.014  1.57
-es_-s_irr(N)     4,245  114,524  0.037  18.18        3,905  172,631  0.023  30.78
pronoun           52     165,778  0.000  2.64         51     321,019  0.000  1.82
-s'_-'s           1,359  49,219   0.028  4.48         1,904  71,172   0.027  3.77
Total             8,108  463,253  0.019  11.685       8,614  758,663  0.016  14.09

3.4.1 Present progressive (-ing)

TTR and D

Age    UK                               USA
       Type   Token   TTR    D          Type   Token   TTR    D
1      97     719     0.135  13.83      290    3,378   0.086  27.60
2      1,158  33,264  0.035  33.27      637    15,904  0.040  28.14
3      256    3,071   0.083  20.18      558    10,924  0.051  23.93
4      131    642     0.204  21.84      569    11,343  0.050  26.67
5      154    1,009   0.153  27.07      383    3,740   0.102  28.95
6      83     264     0.314  27.07      244    1,210   0.202  36.05
7      118    790     0.149  22.48      200    959     0.209  27.23
Total  1,997  39,759  0.153  23.68      2,881  47,458  0.106  28.37

No difference in D was found between the UK and the USA children.

The correlations between D and the children’s age were not significant, which seems to indicate that children already apply the present progressive morpheme to diverse verbs from the age of 1.
- UK children: r = 0.025, p > .05 / USA children: r = 0.385, p > .05

Of the 50 most frequently used words in the children’s speech, 80-90% were also among the 50 most frequently used words in the mothers’ speech.

Overregularization errors were rarely found.
- Noun + -ing (tennising, swording, appetizing): once or twice each
- Adjective + -ing (noticeabling): only once
- However, because the present progressive and the gerund share the same form, further study reviewing their usage is needed.

3.4.2 Past tense (-(e)d_irr(V))

TTR and D

Age    UK                               USA
       Type   Token   TTR    D          Type   Token    TTR    D
1      94     1,800   0.052  5.24       223    5,145    0.043  16.34
2      913    68,022  0.013  11.45      726    33,190   0.022  19.92
3      262    6,474   0.040  12.63      757    33,020   0.023  21.24
4      166    1,600   0.104  15.68      820    38,482   0.021  21.00
5      181    2,327   0.078  14.82      547    13,106   0.042  22.60
6      108    610     0.177  12.34      352    4,755    0.074  24.52
7      162    1,641   0.099  13.39      321    4,707    0.068  20.05
Total  1,886  82,474  0.080  12.22      3,746  132,405  0.042  20.81

The D of past tense increased with age up to age 5 or 6 and decreased at age 7 in both the UK and the USA.

A marginal correlation was found between D and the children’s age (the critical value of the correlation coefficient for significance was 0.68). This means that children tended to apply past tense morphemes to more diverse verbs as they got older.
- UK children: r = 0.643, p > .05 / USA children: r = 0.66, p > .05

In all age groups, the D of the USA children was higher than that of the UK children.

The D of past tense was lower than that of the present progressive, which confirms the grammatical morpheme developmental order proposed by Brown (1973).
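The age-by-D correlation can be recomputed from the past tense TTR and D table in this section. The sketch below assumes the reported r is a plain Pearson coefficient over the seven age groups. The slides do not state the exact procedure, and the function and variable names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


ages = [1, 2, 3, 4, 5, 6, 7]
# D values per age group, copied from the past tense table (3.4.2).
d_uk_past = [5.24, 11.45, 12.63, 15.68, 14.82, 12.34, 13.39]
d_usa_past = [16.34, 19.92, 21.24, 21.00, 22.60, 24.52, 20.05]

if __name__ == "__main__":
    print("UK :", pearson_r(ages, d_uk_past))
    print("USA:", pearson_r(ages, d_usa_past))
```

Run on the table values, both coefficients come out close to the figures reported above, which suggests the correlations were computed over the per-age D values rather than over individual files.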

The highest-frequency words were mostly irregular verbs, which occurred four times more often than regular verbs in both countries. The 50 most frequent words break down as follows:

- UK: 25 irregular verbs, 8 irregular verbs whose bare form is identical to the past and the past participle, 7 auxiliary verbs, 6 regular verbs, 4 words with regular past tense morphemes but probably used as adjectives

- USA: 25 irregular verbs, 9 irregular verbs whose bare form is identical to the past and the past participle (such as put), 5 auxiliary verbs, 5 regular verbs, 6 words with regular past tense morphemes but probably used as adjectives

‘go’ and ‘fall’ were the irregular verbs most often overregularized with the regular past tense morpheme ‘-(e)d’.

Overregularization error type and frequency of irregular verb ‘go’

UK
Age   went   gone    Correct    goed   goned   wented   Overreg.   Total
                     subtotal                           subtotal
1     -      580     580        1      -       -        1          581
2     784    4,978   5,762      17     3       -        20         5,782
3     73     192     265        3      -       -        3          268
4     23     21      44         -      -       -        -          44
5     33     22      55         -      -       -        -          55
6     24     2       26         -      -       -        -          26
7     100    15      115        -      -       -        -          115

USA
Age   went   gone    Correct    goed   goned   wented   Overreg.   Total
                     subtotal                           subtotal
1     29     239     268        -      -       -        -          268
2     572    627     1,199      38     2       1        41         1,240
3     675    142     817        52     -       -        52         869
4     860    109     969        4      -       -        4          973
5     286    30      316        -      -       -        -          316
6     158    6       164        -      -       -        -          164
7     74     22      96         -      -       -        -          96
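The per-age rates behind the U-shape can be derived from the ‘go’ counts. The slides do not give the formula, so this sketch assumes the rate of correct use is correct forms divided by correct plus overregularized forms. The dictionary layout and names are illustrative.

```python
# (correct subtotal, overregularization subtotal) per age, from the 'go' table.
GO_COUNTS = {
    "UK": {1: (580, 1), 2: (5762, 20), 3: (265, 3), 4: (44, 0),
           5: (55, 0), 6: (26, 0), 7: (115, 0)},
    "USA": {1: (268, 0), 2: (1199, 41), 3: (817, 52), 4: (969, 4),
            5: (316, 0), 6: (164, 0), 7: (96, 0)},
}


def correct_rate(correct, overregularized):
    """Share of correct forms among all past forms of the verb (assumed formula)."""
    return correct / (correct + overregularized)


def rates_by_age(counts):
    """Map each age to its correct-use rate."""
    return {age: correct_rate(c, o) for age, (c, o) in counts.items()}


if __name__ == "__main__":
    for country, counts in GO_COUNTS.items():
        for age, rate in rates_by_age(counts).items():
            print(f"{country} age {age}: {rate:.1%}")
```

Under this assumed formula, a dip in the rate at ages 2 to 3 followed by a return to 100% is what a U-shape developmental pattern looks like.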

Overregularization errors were not found at the age of 1, appeared between the ages of 2 and 3, and began to disappear from the age of 4 or 5.

The overregularization rate of ‘go’ differed significantly between the UK and the USA (Pearson chi-square).

[Figure: U-shape developmental pattern of irregular verb ‘go’. x-axis: age (1-7); y-axis: 91%-100%; series: UK, USA]
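A Pearson chi-square comparison of the two countries can be sketched as a 2x2 test. The slides do not say which counts entered the test, so pooling the ‘go’ counts over ages 1 to 7 is an assumption here, as are the function name and the use of the df = 1 critical value 3.841 at the .05 level.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Pooled over ages 1-7 from the 'go' table (assumed pooling):
# UK:  6,847 correct forms, 24 overregularized forms
# USA: 3,829 correct forms, 97 overregularized forms
UK_CORRECT, UK_OVERREG = 6847, 24
USA_CORRECT, USA_OVERREG = 3829, 97

CRITICAL_05_DF1 = 3.841  # chi-square critical value for df = 1, alpha = .05

if __name__ == "__main__":
    stat = chi_square_2x2(UK_CORRECT, UK_OVERREG, USA_CORRECT, USA_OVERREG)
    print("chi-square:", stat, "significant:", stat > CRITICAL_05_DF1)
```

With pooled counts this large, even a small difference in error rate yields a large statistic, so a per-age comparison would be a useful complement.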

Overregularization error type and frequency of irregular verb ‘fall’

More overregularization error types of ‘fall’ were found in the UK children.

UK
Age   fell   fallen   Correct    falled   felled   fallened   Overreg.   Total
                      subtotal                                subtotal
1     3      -        3          -        -        -          -          3
2     315    290      605        57       6        4          67         672
3     37     16       53         4        -        -          4          57
4     15     2        17         1        -        -          1          18
5     16     2        18         -        -        -          -          18
6     7      -        7          1        -        -          1          8
7     5      1        6          -        -        -          -          6

USA
Age   fell   fallen   Correct    falled   felled   fallened   Overreg.   Total
                      subtotal                                subtotal
1     73     -        73         -        1        -          1          74
2     462    2        464        94       5        -          99         563
3     324    5        329        34       2        -          36         365
4     264    1        265        8        1        -          9          274
5     94     -        94         2        1        -          3          97
6     32     1        33         -        -        -          -          33
7     15     1        16         1        1        -          2          18

Overregularization errors in the irregular past tense tended to appear at the age of 2, began to decrease from the age of 3, and disappeared at the age of 4 or 5.

The overregularization rate of ‘fall’ did not differ significantly between the UK and the USA (Pearson chi-square).

[Figure: U-shape developmental pattern of irregular verb ‘fall’. x-axis: age (1-7); y-axis: 80%-100%; series: UK, USA]

3.4.3 Comparative and Superlative (-er_-est_irr(A))

TTR and D

Age    UK                               USA
       Type   Token   TTR    D          Type   Token   TTR    D
1      11     661     0.017  0.10       19     1,660   0.011  0.26
2      76     9,940   0.008  0.74       70     4,242   0.017  0.92
3      26     420     0.062  1.24       99     2,758   0.036  2.03
4      21     126     0.167  2.89       122    3,349   0.036  2.70
5      28     194     0.144  2.92       84     1,211   0.069  3.44
6      20     57      0.351  5.82       47     376     0.125  4.06
7      17     101     0.168  2.54       41     382     0.107  2.17
Total  199    11,499  0.131  2.32       482    13,978  0.057  2.23

The D of comparative -er and superlative -est increased with age up to age 6 and decreased slightly at age 7 in both the UK and the USA. This confirms that children applied comparative and superlative forms to more diverse adjectives as they got older.

Strong correlations were found between D and the children’s age.
- UK: r = 0.779, p < .05 / USA: r = 0.776, p < .05

The Ds of the UK and the USA children did not differ noticeably.

The most frequent words were more and better, followed by last, bigger, and higher for the UK children and by cleaner, higher, bigger, and later for the USA children.

Overregularization error type and frequency of ‘little’

Age   Correct        Overregularization              Total
      less           littler        littlest
      UK    USA      UK    USA      UK    USA        UK    USA
1     0     0        0     0        0     0          0     0
2     3     1        1     7        0     5          4     12
3     0     3        2     6        1     9          3     18
4     0     6        0     6        1     4          1     16
5     0     7        0     4        1     0          1     11
6     0     0        1     4        1     3          2     7
7     0     1        0     0        0     0          0     1

Overregularization errors were found until the age of 6 but still showed the U-shape developmental pattern.

The overregularization rate of ‘little’ differed significantly between the UK and the USA (Pearson chi-square).

[Figure: U-shape developmental pattern of ‘little’. x-axis: age (1-7); y-axis: 0%-100%; series: UK_littler, UK_littlest, USA_littler, USA_littlest]

In 4 files, littler was found in both the child’s and the mother’s speech. In these files, the children produced it 8 times and the mothers 17 times.

This finding points to a possible influence of mothers’ input on child language.

4. Discussion

1) Do children apply the inflectional morphemes to diverse verbs as they get older?

D: Present progressive (23.68~28.37) > Past tense (12.22~20.81) > Comparative/Superlative (2.32~2.23)
- D confirms the grammatical morpheme developmental order proposed by Brown (1973).

The developmental patterns of the individual inflectional morphemes differed as children got older.

Irregular verbs were more than 4 times as frequent as regular verbs among the 50 most frequently used verbs, which supports Brown’s (1973) claim that children acquire irregular verbs earlier than regular verbs.

2) Are overregularization errors found? And is the U-shape developmental pattern found in children’s language acquisition?

Overregularization errors were found, and the U-shape developmental pattern claimed in previous studies such as Brown (1973) and Marcus et al. (1992) was confirmed on a large scale in the CHILDES Corpus.

Overregularization errors were most frequent in the past tense and rare in the present progressive.

3) Related to questions 1-2 above, is there a difference between the UK and the USA children’s language development? If so, is it due to mothers’ input?

Similarities
(1) As children get older, they apply the inflectional morphemes to more diverse words.
(2) U-shape developmental patterns were found in both the UK and the USA.

Differences
(1) The overregularization error rate of the UK children was lower than that of the USA children.
(2) The data suggest a possible influence of mothers’ input on children’s language.

5. Conclusion

This study investigated inflectional morpheme development in child language using CHILDES Corpus data from children aged 1 to 7.

Our findings are:

1) Children tended to apply the inflectional morphemes to more diverse words as they got older.

2) The U-shape developmental pattern was confirmed.

3) Overregularization errors were found when children applied the inflectional morphemes to words.

4) With the D values, this study supports the grammatical developmental order proposed by Brown (1973).

5) The different developmental patterns of the UK and the USA children point to a possible influence of mothers’ input on children’s language.

References

[1] Berko, Jean (1958), The child’s learning of English morphology. Word 14, p. 47-56
[2] Brown, Roger (1973), A First Language: The Early Stages. Harvard University Press
[3] CHILDES (http://childes.psy.cmu.edu/)
[4] Johansson, Victoria (2008), Lexical diversity and lexical density in speech and writing: a developmental perspective. Lund University, Dept. of Linguistics and Phonetics, Working Papers 53, p. 61-79
[5] Kuczaj, Stan A. (1977), Why do children fail to overgeneralize the progressive inflection? Journal of Child Language 5, p. 167-171
[6] MacWhinney, B. & Snow, C. E. (2000), The Child Language Data Exchange System: An Update. Journal of Child Language 17, p. 457-472
[7] Marcus, Gary F.; Pinker, Steven; Ullman, Michael; Hollander, Michelle; Rosen, T. John; and Xu, Fei (1992), Overregularization in Language Acquisition. Monographs of the Society for Research in Child Development, Serial No. 228, Vol. 57
[8] Malvern, David; Brian Richards; Ngoni Chipere & Pilar Duran (2004), Lexical diversity and language development: quantification and assessment. New York: Palgrave Macmillan
[9] Maslen, Robert J C; Theakston, Anna L; Lieven, Elena V M; Tomasello, Michael (2004), A Dense Corpus Study of Past Tense and Plural Overregularization in English. Journal of Speech, Language, and Hearing Research 47.6, p. 1319-1333
[10] McCarthy, Philip M. & Jarvis S (2004), vocd: A theoretical and empirical evaluation. Language Testing 24.4, p. 459-488
[11] Richards, Brian J. & David Malvern (1997), Quantifying lexical diversity in the study of language development. Reading: Faculty of Education and Community Studies
[12] Templin, M.C. (1957), Certain language skills in children. Minneapolis: University of Minnesota Press

Contact Info.

Myung Sook Min ([email protected])

Sun-Young Lee ([email protected])

Jong-Sup Jun ([email protected])