# Multi-Utterance Language Production

> **NIH R01** · UNIVERSITY OF CALIFORNIA AT DAVIS · 2020 · $318,462

## Abstract

To tell a story, give directions, or describe the layout of a house, speakers must generate multiple utterances in
sequence. Because most psycholinguistic research on speaking is based on paradigms designed to elicit
single utterances, little is known about multi-utterance language production in children and adults. Linked to
this empirical focus on single utterances is the widespread use of a method in which subjects describe simple
visual images containing just a handful of objects shown in straightforward spatial arrangements. In contrast,
real world scenes contain multiple objects in various relationships that can be described in numerous ways,
and so the attentional and language systems face a challenging set of decisions concerning where to begin the
description, how to cluster objects and capture their relations, and what information to include or omit. This
project uses complex, real-world scenes as a tool to examine the linearization of complex thoughts into
sequenced utterances, focusing on adults at this investigative stage in order to establish developmental
benchmarks. Image and semantic characteristics of complex scenes will be precisely quantified and used to
generate predictions about the allocation of attention as a scene is viewed and described. The project
examines the conditions under which scene image features exogenously draw the eyes to specific visual areas
which the linguistic system then describes, and under what conditions the cognitive system guides the eyes to
meaningful regions of the scene, allowing the language system to prepare a description even before the
relevant object or region has been fixated. On this latter view, the language and cognitive systems use scene
meaning interactively to formulate a linearization plan for coordinating the production of multiple utterances. To
address these theoretical issues, the project focuses on three Specific Aims: (1) To use computational tools
from the field of visual cognition to measure image and meaning properties of complex scenes, which will
permit the precise quantification of features controlling attention during speaking tasks as well as the selection
and sequencing of linguistic content. (2) To determine the extent to which viewers predict the presence of
objects and their locations and use those predictions to get a head-start on linguistic encoding even before an
object is attended. (3) To extend our approach to the production of utterances describing events by applying
the same methods for quantifying scene image and meaning properties that have been developed for non-event
scenes to scenes depicting events with and without animate agents. The project tests an innovative
theory of multi-utterance language production which assumes that speakers formulate a linearization plan to
guide the allocation of attention and linguistic decisions concerning inclusion and ordering of information. This
approach will lead to a deeper understanding of language production, which w...

## Key facts

- **NIH application ID:** 10050559
- **Project number:** 1R01HD100516-01A1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA AT DAVIS
- **Principal Investigator:** Fernanda Ferreira
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $318,462
- **Award type:** 1
- **Project period:** 2020-08-13 → 2025-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10050559

## Citation

> US National Institutes of Health, RePORTER application 10050559, Multi-Utterance Language Production (1R01HD100516-01A1). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10050559. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*