@Article{info:doi/10.2196/63937, author="Parduzi, Qendresa and Wermelinger, Jonathan and Koller, Simon Domingo and Sariyar, Murat and Schneider, Ulf and Raabe, Andreas and Seidel, Kathleen", title="Explainable AI for Intraoperative Motor-Evoked Potential Muscle Classification in Neurosurgery: Bicentric Retrospective Study", journal="J Med Internet Res", year="2025", month="Mar", day="24", volume="27", pages="e63937", keywords="intraoperative neuromonitoring; motor evoked potential; artificial intelligence; machine learning; deep learning; random forest; convolutional neural network; explainability; medical informatics; personalized medicine; neurophysiological; monitoring; orthopedic; motor; neurosurgery", abstract="Background: Intraoperative neurophysiological monitoring (IONM) guides the surgeon in ensuring motor pathway integrity during high-risk neurosurgical and orthopedic procedures. Although motor-evoked potentials (MEPs) are valuable for predicting motor outcomes, the key features of predictive signals are not well understood, and standardized warning criteria are lacking. Developing a muscle identification prediction model could increase patient safety while allowing the exploration of relevant features for the task. Objective: The aim of this study is to expand the development of machine learning (ML) methods for muscle classification and evaluate them in a bicentric setup. Further, we aim to identify key features of MEP signals that contribute to accurate muscle classification using explainable artificial intelligence (XAI) techniques. Methods: This study used ML and deep learning models, specifically random forest (RF) classifiers and convolutional neural networks (CNNs), to classify MEP signals from routine supratentorial neurosurgical procedures from two medical centers according to muscle identity of four muscles (extensor digitorum, abductor pollicis brevis, tibialis anterior, and abductor hallucis). 
The algorithms were trained and validated on a total of 36,992 MEPs from 151 surgeries in one center, and they were tested on 24,298 MEPs from 58 surgeries from the other center. Depending on the algorithm, time-series, feature-engineered, and time-frequency representations of the MEP data were used. XAI techniques, specifically Shapley Additive Explanation (SHAP) values and gradient class activation maps (Grad-CAM), were implemented to identify important signal features. Results: High classification accuracy was achieved with the RF classifier, reaching 87.9{\%} accuracy on the validation set and 80{\%} accuracy on the test set. The 1D- and 2D-CNNs demonstrated comparably strong performance. Our XAI findings indicate that frequency components and peak latencies are crucial for accurate MEP classification, providing insights that could inform intraoperative warning criteria. Conclusions: This study demonstrates the effectiveness of ML techniques and the importance of XAI in enhancing trust in and reliability of artificial intelligence--driven IONM applications. Further, it may help to identify new intrinsic features of MEP signals so far overlooked in conventional warning criteria. By reducing the risk of muscle mislabeling and by providing the basis for possible new warning criteria, this study may help to increase patient safety during surgical procedures.", issn="1438-8871", doi="10.2196/63937", url="https://www.jmir.org/2025/1/e63937" }