Threshold-Calibrated Sense Tracking Pipeline
Mar 28, 2026

A threshold-calibrated, prototype-based pipeline for estimating word sense prevalence in diachronic text corpora. Applied to the term "schizophrenia" in historical U.S. news, the pipeline combines sense inventories, generated prototype usages, target-aware embeddings, human-calibrated similarity thresholds, and estimation of sense prevalence over time. The repository includes a sample of expert-labeled U.S. news sentences containing the term "schizophrenia", each annotated with the Oxford English Dictionary sense it expresses.
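
The flow described above can be illustrated with a minimal sketch. It is not the repository's code: the function names, the placeholder sense labels (`sense_a`, `sense_b`), the two-dimensional toy vectors standing in for target-aware embeddings, and the threshold values are all assumptions made for illustration. The sketch assigns each usage to the sense whose prototype it most resembles, provided the similarity clears that sense's threshold, and then reports the share of each sense per year.

```python
from collections import Counter, defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_sense(usage_vec, prototypes, thresholds):
    """Return the most similar sense that clears its threshold, else None."""
    best_sense, best_score = None, -1.0
    for sense, proto_vecs in prototypes.items():
        # Score a sense by its closest prototype usage.
        score = max(cosine(usage_vec, p) for p in proto_vecs)
        if score >= thresholds[sense] and score > best_score:
            best_sense, best_score = sense, score
    return best_sense


def prevalence_by_year(usages, prototypes, thresholds):
    """Map each year to the share of usages assigned to each sense."""
    counts = defaultdict(Counter)
    for year, vec in usages:
        counts[year][assign_sense(vec, prototypes, thresholds)] += 1
    return {
        year: {sense: n / sum(c.values()) for sense, n in c.items()}
        for year, c in counts.items()
    }


# Hypothetical toy data: real embeddings would come from a language model,
# and real thresholds would be calibrated against human labels.
PROTOTYPES = {"sense_a": [[1.0, 0.0]], "sense_b": [[0.0, 1.0]]}
THRESHOLDS = {"sense_a": 0.8, "sense_b": 0.8}
USAGES = [
    (1950, [1.0, 0.1]),
    (1950, [0.9, 0.2]),
    (1990, [0.1, 1.0]),
    (1990, [1.0, 0.0]),
    (1990, [1.0, 1.0]),  # equally close to both senses, below both thresholds
]

prevalence = prevalence_by_year(USAGES, PROTOTYPES, THRESHOLDS)
for year in sorted(prevalence):
    print(year, prevalence[year])
```

Usages that clear no threshold are counted under `None` rather than forced into a sense, so the per-year shares still sum to one and the unassigned fraction stays visible. Whether the actual pipeline handles below-threshold usages this way is not stated in the description above.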