LLM-Generated Synthetic Data

LSC-Eval: A General Framework to Evaluate Methods for Assessing Dimensions of Lexical Semantic Change Using LLM-Generated Synthetic Data

We introduce a scalable, domain-general framework that creates diachronic, LLM-generated synthetic datasets to simulate theory-driven Lexical Semantic Change (LSC) and evaluates various methods for measuring kinds of LSC--using examples from psychology, we apply this framework to assess the sensitivity of a suite of methods in detecting artificially induced changes in dimensions of Sentiment, Intensity, and Breadth (SIB), ultimately identifying the most suitable approach for each dimension.

Mar 11, 2025