@Article{info:doi/10.2196/68427, author="Reichenpfader, Daniel and Knupp, Jonas and von D{\"a}niken, Sandro Urs and Gaio, Roberto and Dennst{\"a}dt, Fabio and Cereghetti, Grazia Maria and Sander, Andr{\'e} and Hiltbrunner, Hans and Nairz, Knud and Denecke, Kerstin", title="Enhancing Bidirectional Encoder Representations From Transformers (BERT) With Frame Semantics to Extract Clinically Relevant Information From German Mammography Reports: Algorithm Development and Validation", journal="J Med Internet Res", year="2025", month="Apr", day="25", volume="27", pages="e68427", keywords="radiology; information extraction; mammography; large language models; structured reporting; template filling; annotation; quality control; natural language processing", abstract="Background: Structured reporting is essential for improving the clarity and accuracy of radiological information. Despite its benefits, the European Society of Radiology notes that it is not widely adopted. For example, while structured reporting frameworks such as the Breast Imaging Reporting and Data System provide standardized terminology and classification for mammography findings, radiology reports still mostly comprise free-text sections. This variability complicates the systematic extraction of key clinical data. Moreover, manual structuring of reports is time-consuming and prone to inconsistencies. Recent advancements in large language models have shown promise for clinical information extraction by enabling models to understand contextual nuances in medical text. However, challenges such as domain adaptation, privacy concerns, and generalizability remain. To address these limitations, frame semantics offers an approach to information extraction grounded in computational linguistics, allowing a structured representation of clinically relevant concepts. 
Objective: This study explores the combination of Bidirectional Encoder Representations from Transformers (BERT) architecture with the linguistic concept of frame semantics to extract and normalize information from free-text mammography reports. Methods: After creating an annotated corpus of 210 German reports for fine-tuning, we generate several BERT model variants by applying 3 pretraining strategies to hospital data. Afterward, a fact extraction pipeline is built, comprising an extractive question-answering model and a sequence labeling model. We quantitatively evaluate all model variants using common evaluation metrics (model perplexity, Stanford Question Answering Dataset 2.0 [SQuAD{\_}v2], seqeval) and perform a qualitative clinician evaluation of the entire pipeline on a manually generated synthetic dataset of 21 reports, as well as a comparison with a generative approach following best practice prompting techniques using the open-source Llama 3.3 model (Meta). Results: Our system is capable of extracting 14 fact types and 40 entities from the clinical findings section of mammography reports. Further pretraining on hospital data reduced model perplexity, although it did not significantly impact the 2 downstream tasks. We achieved average F1-scores of 90.4{\%} and 81{\%} for question answering and sequence labeling, respectively (best pretraining strategy). Qualitative evaluation of the pipeline based on synthetic data shows an overall precision of 96.1{\%} and 99.6{\%} for facts and entities, respectively. In contrast, generative extraction shows an overall precision of 91.2{\%} and 87.3{\%} for facts and entities, respectively. Hallucinations and extraction inconsistencies were observed. Conclusions: This study demonstrates that frame semantics provides a robust and interpretable framework for automating structured reporting. 
By leveraging frame semantics, the approach enables customizable information extraction and supports generalization to diverse radiological domains and clinical contexts with additional annotation efforts. Furthermore, the BERT-based model architecture allows for efficient, on-premise deployment, ensuring data privacy. Future research should focus on validating the model's generalizability across external datasets and different report types to ensure its broader applicability in clinical practice.", issn="1438-8871", doi="10.2196/68427", url="https://www.jmir.org/2025/1/e68427" }