Social Media Mining for Pharmacovigilance

NIH RePORTER · NIH · R01 · $67,501

Abstract

Project Summary: In our parent grant, the overall goal is to develop novel NLP methods that leverage social media (SM) data for specific pharmacovigilance (PV) efforts that are hindered by known drawbacks of traditional PV approaches. We focus on methods to facilitate the use of longitudinal SM data for exploring (a) factors affecting medication adherence and persistence among the general population (Aim 1), and (b) possible associations between medications taken during pregnancy and pregnancy outcomes (Aim 2). We also address a major roadblock for case-control studies based on these data: finding controls (Aim 3). In general, the methodological focus of the parent grant is on advancing NLP methods that enable the large-scale integration of SM as a content-rich source of user-generated data in epidemiological studies, including relevant data posted over time.

We are seeking to extend the methods developed for Aims 1, 2, and 3 of the parent grant to focus on Alzheimer's Disease and Alzheimer's Disease Related Dementia (AD/ADRD). The objectives for this supplement are responsive to NOT-AG-20-034 and include the following.

Objective 1 is to develop and evaluate NLP methods to identify and characterize a cohort of Twitter users affected by an AD/ADRD diagnosis, specifically, Twitter users who declare that their direct relatives (parents, grandparents, or siblings), their significant others, or they themselves have been diagnosed with AD/ADRD. The collected timelines of users in the cohort will be further mined for demographic characterization (age, gender, ethnicity, and geographic location of residence), and specific methods will be developed to extract the familial relationship to the AD/ADRD patient and information relevant to two focus studies: 1) a study of early-onset AD (EOAD, also known as young-onset AD) and potential lifetime factors reflected in SM, and 2) a study of medications taken by users in the whole cohort, with specific attention to neuropsychiatric symptoms also associated with AD/ADRD.

Objective 2 is to develop and evaluate a linguistic model that can automatically classify patients' written productions in SM to determine potential cognitive impairment. The model will analyze SM postings over time by people diagnosed with EOAD for signs of cognitive impairment evident in the content, grammar, and form of their narrative text.

In addition to methods and systems already developed under the parent grant, the tasks leverage baseline approaches as well as our own collection of tweets. These resources put us in a unique position to complete, well within the year covered by the administrative supplement, research that would normally take several years, allowing us to quickly ramp up our collaborations and research for further proposal submissions to the NIA or NIGMH.

Key facts

NIH application ID: 10289130
Project number: 3R01LM011176-08S2
Recipient: UNIVERSITY OF PENNSYLVANIA
Principal Investigator: GRACIELA GONZALEZ HERNANDEZ
Activity code: R01
Funding institute: NIH
Fiscal year: 2021
Award amount: $67,501
Award type: 3
Project period: 2012-09-10 → 2022-05-31