Multiword units lead to errors of commission in children's spontaneous production: "What corpus data can tell us?*"

McCauley, Stewart M, Bannard, Colin ORCID: 0000-0001-5579-5830, Theakston, Anna, Davis, Michelle, Cameron-Faulkner, Thea and Ambridge, Ben ORCID: 0000-0003-2389-8477
(2021) Multiword units lead to errors of commission in children's spontaneous production: "What corpus data can tell us?*". Developmental Science, 24 (6). e13125.

Psycholinguistic research over the past decade has suggested that children's linguistic knowledge includes dedicated representations for frequently-encountered multiword sequences. Important evidence for this comes from studies of children's production: it has been repeatedly demonstrated that children's rate of speech errors is greater for word sequences that are infrequent and thus unfamiliar to them than for those that are frequent. In this study, we investigate whether children's knowledge of multiword sequences can explain a phenomenon that has long represented a key theoretical fault line in the study of language development: errors of subject-auxiliary non-inversion in question production (e.g., "why we can't go outside?*"). In doing so we consider a type of error that has been ignored in discussion of multiword sequences to date. Previous work has focused on errors of omission - an absence of accurate productions for infrequent phrases. However, if children make use of dedicated representations for frequent sequences of words in their productions, we might also expect to see errors of commission - the appearance of frequent phrases in children's speech even when such phrases are not appropriate. Through a series of corpus analyses, we provide the first evidence that the global input frequency of multiword sequences (e.g., "she is going" as it appears in declarative utterances) is a valuable predictor of their errorful appearance (e.g., the uninverted question "what she is going to do?*") in naturalistic speech. This finding, we argue, constitutes powerful evidence that multiword sequences can be represented as linguistic units in their own right.

Item Type: Article
Uncontrolled Keywords: chunking, corpus analysis, language acquisition, questions
Divisions: Faculty of Health and Life Sciences
Faculty of Health and Life Sciences > Institute of Population Health
Depositing User: Symplectic Admin
Date Deposited: 14 Dec 2021 10:23
Last Modified: 18 Jan 2023 21:19
DOI: 10.1111/desc.13125