Designing and Validating Intervention Opportunities for Suicide Prevention with Language Model Assistants



The National Violent Death Reporting System (NVDRS) collects information, including free text narratives (e.g., circumstances surrounding a suicide), about violent deaths (e.g., suicides) in the United States. In a demanding public health data pipeline, annotators manually extract structured information from NVDRS narratives, following extensive guidelines developed painstakingly by experts, all aimed at informing prevention. To facilitate data-driven insights from this pipeline and support the development of novel suicide interventions, we investigate the effectiveness of language models (LMs) as efficient assistants to (a) data annotators and (b) experts. For data annotators, we find that LM assistants can predict the existing annotations about 85% of the time across 50 NVDRS variables. Furthermore, in a subset of cases where an LM disagrees with the annotation, further expert review finds that LM assistants can surface annotation discrepancies between free text narratives and structured data 38% of the time. For experts, we introduce an algorithm that efficiently refines guidelines for new variables by allowing experts to focus only on providing feedback for incorrect LM predictions. We apply our algorithm to a real-world case study of a new variable characterizing victim interactions with lawyers, and demonstrate that it achieves annotation quality comparable to a laborious manual approach. Our findings provide evidence that LMs can serve as effective assistants to public health researchers who handle sensitive data in high-stakes scenarios.
Pitch
We employ LM assistants to reduce the mental burden of annotating NVDRS narratives when a codebook is available (top) and to help experts develop codebooks for new variables where one is not available (bottom). When a codebook is available for a variable, we instruct an LM to leverage it to generate reasoning and identify the relevant span of text before predicting the label. For a new variable of interest for which a codebook is not available, we explore both a manual approach and an LM-assisted codebook development algorithm. With the latter, experts can efficiently develop a codebook by focusing solely on providing feedback for incorrect LM predictions, without sacrificing annotation quality.
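
To make the two workflows in the caption concrete, below is a minimal Python sketch. All names here (Prediction, annotate, refine_codebook, the prompt wording, and the feedback hook) are illustrative assumptions, not the paper's released implementation; the sketch only mirrors the caption's description of reasoning-then-span-then-label annotation and of folding expert feedback on incorrect predictions into the codebook.

    # Minimal sketch of the two LM-assisted workflows described above.
    # All names are illustrative assumptions, not the paper's actual code.
    from dataclasses import dataclass
    from typing import Callable, Iterable, Tuple

    @dataclass
    class Prediction:
        reasoning: str  # the LM's justification, generated first
        span: str       # the narrative excerpt grounding the label
        label: str      # the predicted value for the NVDRS variable

    # The LM is abstracted as a function from a prompt to a parsed Prediction.
    LM = Callable[[str], Prediction]

    def annotate(lm: LM, codebook: str, narrative: str) -> Prediction:
        """Codebook-guided annotation: reason and cite a span before labeling."""
        prompt = (
            f"Codebook:\n{codebook}\n\n"
            f"Narrative:\n{narrative}\n\n"
            "Explain your reasoning, quote the relevant span of text, "
            "then output the label."
        )
        return lm(prompt)

    def refine_codebook(
        lm: LM,
        codebook: str,
        reviewed: Iterable[Tuple[str, str]],  # (narrative, expert label) pairs
        get_feedback: Callable[[str, Prediction, str], str],
    ) -> str:
        """Experts give feedback only on incorrect LM predictions; each
        piece of feedback is folded into the codebook as a clarification."""
        for narrative, gold in reviewed:
            pred = annotate(lm, codebook, narrative)
            if pred.label != gold:
                codebook += "\nClarification: " + get_feedback(narrative, pred, gold)
        return codebook

Requiring the reasoning and span before the label keeps each prediction auditable against the source narrative, and the refinement loop concentrates expert effort exactly where the LM and the expert disagree.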

BibTeX


    @misc{ranjit2025placeholder,
      title={Designing and Validating Intervention Opportunities for Suicide Prevention with Language Model Assistants},
      author={Jaspreet Ranjit and Hyundong J. Cho and Claire J. Smerdon and Yoonsoo Nam and Myles Phung and Jonathan May and John R. Blosnich and Swabha Swayamdipta},
      year={2025},
      eprint={2406.14883},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={},
    }