Naomi Baes

NLP & Computational Social Science Researcher | Diachronic Semantic Change

About Me

I am an NLP and computational social science researcher studying diachronic lexical semantic change: how word meanings and psychologically important concepts shift across time and communicative contexts. My research develops interpretable methods for measuring changes in word meaning, concept use, and the ways social and psychological categories are described, evaluated, and represented in language, combining historical text corpora, contextual embeddings, lexical resources, large language models, and statistical modeling.

Substantively, my work focuses on mental health language, concept creep, and socially contested language, with empirical studies of concepts such as trauma, mental illness, schizophrenia, and introversion. I led the development of SIBling and LSC-Eval, frameworks for modeling and evaluating lexical semantic change with applications to psychology, media discourse, linguistics, and computational social science. My research has received support from the Australian Government Research Training Program Scholarship, ARC-funded Concept Creep research funding, and Change is Key!, funded by Riksbankens Jubileumsfond.

Interests

Lexical Semantic Change
Mental Health Concepts
Concept Creep
Computational Social Science
NLP Evaluation

Education

PhD, Natural Language Processing & Social Psychology
University of Melbourne (2023-26)
Graduate Diploma in Psychology (Advanced) with Honours
University of Melbourne

Research Overview

I study how word meanings change as concepts move across scientific writing, news media, and general language. My work focuses especially on language with social, cultural, or psychological significance, including mental health concepts and contested social terms. I ask how these terms extend into new contexts, shift in emotional intensity, and acquire more positive or negative associations. To examine these questions, I combine NLP methods with theory from social psychology, computational social science, and corpus linguistics, using historical text corpora, contextual embeddings, lexical resources, large language models, and statistical modeling.

Research streams

My work clusters around three overlapping streams: (1) diachronic lexical semantic change and evaluation, (2) conceptual change in psychology, and (3) computational social science projects on socially contested language.

Diachronic lexical semantic change and evaluation
I develop computational methods and evaluation resources for tracing different kinds of diachronic lexical semantic change in historical text corpora, with applications in psychology, computational social science, linguistics, and NLP method evaluation. Contributions include:
- SIBling, a multidimensional framework for modeling whether words become broader, more emotionally intense, or acquire more positive or negative connotations over time.
- LSC-Eval, an evaluation framework that uses LLM-generated historical datasets in experimental settings to test whether semantic change methods are sensitive to specific kinds of change.
- A threshold-calibrated sense tracking pipeline for estimating the prevalence of a word’s senses in historical corpora. It bears on the interpretability of lexical semantic change (LSC) scores by showing that a rising LSC score need not reflect a change in a word's senses.
- SenseRel, a sense-level benchmark for modeling semantic relations between word senses, connecting denotational semantic change types (e.g., metaphor, metonymy) with connotational dimensions of meaning (e.g., valence and arousal). We evaluate how well LLMs and fine-tuned models capture these relations.

Conceptual change in psychology and mental health language
I study how psychological and mental health concepts change in meaning, salience, severity, and use across academic psychology, news media, books, and general language corpora. This stream includes substantive case studies of concepts such as trauma, mental illness, schizophrenia, and introversion, as well as generic and emotion-related mental health terminology. Together, these studies examine when concepts become more culturally prominent, more widely used, more emotionally intense, or more evaluatively charged, interpreted through psychological theory, concept creep research, and corpus evidence.

Social meaning in contested language
I also contribute to collaborative projects that use NLP and computational social science methods to study socially important language, including dehumanization of women in incel discourse, mental health stigma detection in online communication, identity/person-first language for mental health conditions, and lay understandings of the common good. These projects extend my broader interest in how language reflects, organizes, and reshapes social and psychological categories.

Featured Publications

Lexical Semantics

SenseRel: A Sense-Level Benchmark for Denotational and Connotational Meaning Relations

A sense-level benchmark for testing how humans and language models represent denotational (antonymy, homonymy, metaphor, metonymy, taxonomical relations) and connotational (valence, arousal) meaning.

May 30, 2026

Diachronic Word Sense Disambiguation

Threshold-Calibrated Word Sense Disambiguation: Semantic Broadening Without Sense Redistribution in Schizophrenia

A sense-tracking pipeline. In the case of schizophrenia, we show that rising semantic change scores can reflect a word's use in a broader range of contexts while the word retains its core meaning.

Mar 28, 2026

LLM-Generated Synthetic Data

LSC-Eval: A General Framework to Evaluate Methods for Assessing Dimensions of Lexical Semantic Change Using LLM-Generated Synthetic Data

An evaluation framework that generates synthetic historical benchmark datasets for testing whether semantic change methods are sensitive to the kinds of change they claim to measure.

Mar 11, 2025

Lexical Semantic Change

A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications

A multidimensional framework for evaluating lexical semantic change that traces whether words become broader, more emotionally intense, or more positive or negative over time.

Aug 11, 2024

Highlights

ACL 2026: Joint-first author on SenseRel, a sense-level benchmark for denotational and connotational meaning relations, developed through my Change is Key! research internship.
Findings of ACL 2025: Lead author of LSC-Eval, a framework for evaluating methods for detecting dimensions of lexical semantic change using LLM-generated synthetic data.
ACL 2024: Lead author of SIBling, a multidimensional framework for modeling lexical semantic change with social science applications.
ICWSM 2026: Lead author of a computational social science paper on dehumanization of women and men in incel discourse.
Awards and research support: Australian Government Research Training Program Scholarship; selected for Change is Key! research support for an international research internship and collaboration; supported by Concept Creep research funding for conference travel and research dissemination.
Competitive team funding: Co-recipient of AUD $15,000 Hallmark Research Initiative seed funding for a 7-member team project on ‘Automatic evaluation of mental health stigma in online communication’, led by Yulia Otmakhova.
Research leadership and service: Australian English Language Co-Lead with Christine de Kock for the BLEnD SemEval-2026 Shared Task; Program Chair/PC for the LChange'26 Workshop, co-located with EACL 2026.
ACL Best Resource Paper Award: Contributor to BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages, awarded Best Resource Paper at ACL 2025; contributed Romanian-language gold-standard annotation (team lead: Daniela Teodorescu).
Invited talks: Presented PhD work on modeling semantic change in mental health concepts to international research groups at Utrecht University, the University of Gothenburg, and the National Research Council Canada.

Selected Presentations

Diachronic Word Sense Disambiguation

Threshold-Calibrated Word Sense Disambiguation: Semantic Broadening Without Sense Redistribution in Schizophrenia

A sense-tracking study presented at the LChange'26 Workshop (EACL), introducing a threshold-calibrated, prototype-based pipeline for tracking word sense prevalence in historical U.S. news articles.

Mar 28, 2026

Computational Social Science

Dimensions of Semantic Change: Validation and Application of the SIBling Framework

An invited talk at the National Research Council Canada on SIBling and LSC-Eval as complementary frameworks for modeling and evaluating semantic change across time.

Sep 24, 2025

Conceptual Change

Dimensions of Semantic Change: Validation and Application of the SIBling Framework

A research talk at the Change is Key! Conference (University of Gothenburg, Dept. of Philosophy, Linguistics & Theory of Science) on applying SIBling and LSC-Eval to trace semantic shifts in mental health concepts in historical corpora.

Sep 12, 2025

Lexical Semantic Change

A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications

The foundational SIBling presentation, introducing a multidimensional framework for modeling lexical semantic change across Sentiment, Intensity, and Breadth.

Aug 14, 2024

Research Ethos

My work is motivated by substantive questions about language, meaning, and mental health, and about how social and psychological categories are described, evaluated, and represented in language. I use computational methods and careful measurement to better understand how these categories change and to make clearer, more responsible claims about socially important questions. Theory becomes empirically useful when it lets us make clear, interpretable, and well-grounded claims about what we are measuring.