# Statistical approaches to linguistic pattern learning

> **NIH R01** · GEORGETOWN UNIVERSITY · 2022 · $562,312

## Abstract

The long-term aim of the proposed research is to provide an account of how children learn the grammatical
structure of their native language from distributional information in linguistic input, and also how these learning
mechanisms may differ from those of adult learners. Distributional information is the patterning of elements in a
large corpus of sentences. We hypothesize that learners acquire aspects of language structure from the
statistics arising from this distributional information, such as which elements co-occur, what positions they
regularly occupy in a word or sentence, and with what neighboring elements they frequently occur. Our
program of research to date has focused on word segmentation (how learners determine which sound
sequences form words) and on word categories (how learners determine which words form grammatical
categories such as noun and verb). This work has documented the power and robustness of infants’,
children’s, and adults’ ability to use complex distributional information to discover these aspects of language.
We now propose to extend our research in new directions, to examine two crucial aspects of learning higher-level
linguistic structure. In Part 1 we study the factors that lead learners to generalize a novel inflectional
morpheme (like -s for noun plurals) to novel words. In Part 2 we examine how learners acquire phrases and
simple hierarchical structure in sentences, and we ask what leads learners to prefer the types of phrase and
hierarchical structures that are most common in natural languages.
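
One distributional statistic commonly discussed for word segmentation is the transitional probability between adjacent syllables, which tends to be high inside a word and lower across a word boundary. The sketch below is only an illustration of that idea, not the project's own analysis: the syllables, the three made-up words, and the function name `transitional_probabilities` are all hypothetical.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for each adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor in the stream.
    first_counts = Counter(syllables[:-1])
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }


# Hypothetical miniature language: three made-up trisyllabic words.
words = {
    "A": ["bi", "da", "ku"],
    "B": ["pa", "do", "ti"],
    "C": ["go", "la", "bu"],
}
# Hypothetical word order for a short, unbroken syllable stream.
order = "ABCACBABCBCA"
stream = [syllable for w in order for syllable in words[w]]

tps = transitional_probabilities(stream)

# Within-word pairs (e.g. "bi" -> "da") always co-occur in this stream,
# while pairs spanning a word boundary (e.g. "ku" -> "pa") vary.
within_word = tps[("bi", "da")]
across_boundary = tps[("ku", "pa")]
```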

In our proposed studies we test our hypotheses using miniature artificial language paradigms that afford
control over the distributional cues in the input, something that is virtually impossible using only data from
natural language learning. In each experiment, participants listen to utterances in a miniature language and
then produce their own utterances or make judgments about their acceptability. Crucially, during the learning
phase they hear only a sample of the possible utterances that are legal in the artificial language; some are
withheld for use in a later post-test, to determine whether learners generalize what they have observed to
novel instances (and if so, to which types of novel instances). We have developed highly successful paradigms
for engaging young children in miniature language studies, and we have demonstrated important differences
between child and adult language learners in these studies. We will also present children and adults with
comparable learning paradigms in the visual-motor domain, to assess the time-course of the learning process
and the specificity or generality of the results obtained with auditory linguistic materials. Taken together, the results of
these studies will document the key variables that enable a distributional learning mechanism to acquire the
structure of words (inflectional morphology) and sentences (phrase and hierarchical structure) and will highlight
th...

## Key facts

- **NIH application ID:** 10348131
- **Project number:** 5R01HD037082-19
- **Recipient organization:** GEORGETOWN UNIVERSITY
- **Principal Investigator:** Richard N. Aslin
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $562,312
- **Award type:** 5
- **Project period:** 1999-02-01 → 2024-11-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10348131

## Citation

> US National Institutes of Health, RePORTER application 10348131, Statistical approaches to linguistic pattern learning (5R01HD037082-19). Retrieved via AI Analytics 2026-05-21 from https://api.ai-analytics.org/grant/nih/10348131. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
