Viewpoint
Abstract
Some models for mental disorders or behaviors (eg, suicide) have been successfully developed, allowing predictions at the population level. However, current demographic and clinical variables are neither sensitive nor specific enough for making individual actionable clinical predictions. A major hope of the “Decade of the Brain” was that biological measures (biomarkers) would solve these issues and lead to precision psychiatry. However, like models based on sociodemographic and clinical data, these biomarkers, even when they differ significantly between groups of patients and control participants, are still neither sensitive nor specific enough to be applied to individual patients. Technological advances over the past decade offer a promising approach based on new measures that may be essential for understanding mental disorders and predicting their trajectories. Several new tools allow us to continuously monitor objective behavioral measures (eg, hours of sleep) and densely sample subjective measures (eg, mood). The promise of this approach, referred to as digital phenotyping, was recognized almost a decade ago, with its potential impact on psychiatry being compared to the impact of the microscope on biological sciences. However, despite the intuitive belief that collecting densely sampled data (big data) improves clinical outcomes, recent clinical trials have not shown that incorporating digital phenotyping does so. This viewpoint provides a stepwise development and implementation approach, similar to the one that has been successful in the prediction and prevention of cardiovascular disease, to achieve clinically actionable predictions in psychiatry.
J Med Internet Res 2024;26:e59826; doi:10.2196/59826
“It is difficult to make predictions, especially about the future” [ ], as Yogi Berra stated.
Thirty years after the US National Institute of Mental Health (NIMH) declared the final decade of the 20th century to be “the Decade of the Brain,” Tom Insel, the NIMH director at that time, acknowledged,
I spent 13 years at NIMH really pushing on the neuroscience and genetics of mental disorders, and when I look back on that I realize that while I think I succeeded at getting lots of really cool papers published by cool scientists at fairly large costs—I think $20 billion—I don’t think we moved the needle in reducing suicide, reducing hospitalizations, improving recovery for the tens of millions of people who have mental illness.
[ ]
A challenge contributing to this issue is that day-to-day clinical interactions and decision-making processes in psychiatry remain fundamentally the same as they were 50 years ago. Decision-making is still based on binary (present/absent) diagnoses inferred from clinical symptoms assessed during an interview at a single time point. Once a diagnosis has been made, treatment focuses on managing acute symptoms, with simple long-term strategies (eg, “what gets you well, keeps you well”) and reactive handling of adverse outcomes (eg, readmission following a suicide attempt). Moreover, despite extraordinary empirical advances in psychopharmacology [ - ], each psychotropic medication helps only a subset of patients. Thus, most psychiatrists still follow a trial-and-error approach, which can take an inordinate amount of time [ ]. Once patients are stable, the traditional model of clinical monitoring typically involves monthly visits that are either too infrequent or too frequent given the labile nature of mental disorders.
Some models for mental disorders or behaviors (eg, suicide) have been successfully developed, and they allow predictions at the population level [ , ]. However, current demographic and clinical variables are neither sensitive nor specific enough for making individual actionable clinical predictions. Using suicide as an example, a recent meta-analysis concluded that predictive ability has not improved across 50 years of research [ ]. This is in part because these predictive models are still based solely on sociodemographic and static descriptive clinical variables. For instance, an older White man with depression and alcohol use may have a 100-fold higher likelihood of killing himself during the next year than somebody in the general population. Unfortunately, even this astounding relative risk (100) means that the likelihood that this single patient kills himself during the next year is only 10/1000 (1%), compared with 10/100,000 in the general population. Psychiatrists cannot hospitalize 100 patients for one year to save one life. Using larger datasets with more demographic and clinical variables may improve the precision of these population-based models, but they are unlikely to impact individual clinical outcomes.
A major hope of the “Decade of the Brain” was that biological measures (biomarkers) would solve these issues [ , ] and lead to precision psychiatry [ ]. A variety of biomarkers have now been reliably associated with mental disorders and their outcomes [ ]. However, like models based on sociodemographic and clinical data, these biomarkers, even when they differ significantly between groups of patients and control participants, are neither sensitive nor specific enough to be applied to individual patients [ - ]. Typically, a quarter to a third of the patients have normal values and a quarter to a third of the controls have pathologic values [ ]. New models that integrate sociodemographic, clinical, and biological data are being developed, but they have not yet been shown to improve clinical outcomes [ , , , ]. The field of preventive psychiatry still lacks an understanding of the complex mechanisms underlying mental disorders and their treatment, in contrast to the well-established pathophysiologic models in cardiology or oncology, which link risk factors to outcomes and have led to reductions in mortality related to heart disease, stroke, or cancer [ , ].
Looking back, we believe that our inability over two or three “decades of the brain” to bridge the gap between biology and clinical symptoms is due to the lack of an intermediate level of description. In the field of artificial intelligence (AI), the early attempts to create neural networks had to be abandoned because the early 2-layer networks could not process information usefully (eg, interpret images or translate languages). It took more than 30 years to understand that neural networks needed an intermediate layer to process information usefully, and to implement that layer [ , ]. The Research Domain Criteria initiative of the NIMH has attempted to bridge this gap with limited success to date [ ]. Technological advances over the past decade offer a promising approach based on new measures that may be essential for understanding mental disorders and predicting their trajectories. Several new tools allow us to continuously monitor objective behavioral measures (eg, hours of sleep) and densely sample subjective measures (eg, mood). The promise of this approach, referred to as digital phenotyping, was recognized almost a decade ago [ ], with its potential impact on psychiatry being compared to the impact of the microscope on biological sciences [ ].
However, simply gathering a large amount of informative data about a single patient is not helpful by itself. Just as a clinician struggles to synthesize the information from over 100 clinical notes and dozens of laboratory reports available in an electronic health record, the massive amount of data provided by digital phenotyping is useless unless these data can be properly analyzed in a clinical context and with the proper statistical tools. Machine learning, by extracting complex patterns from multiple sources of high-dimensional time-varying data [ ], is an ideal tool to address this problem [ - ]. Nonetheless, some challenges with machine learning still need to be addressed before it can be used to make actionable predictions in psychiatry. These challenges include unreliable inherent assumptions [ , ], model instability [ ], and lack of interpretability [ ] or explainability [ ] of results (the black box problem).
Despite the intuitive belief that collecting densely sampled data (big data) improves clinical outcomes, recent clinical trials have not shown that incorporating digital phenotyping does so [ - ]. This is an example of the so-called “AI chasm,” which refers to the gap between developing algorithms and their actual real-world implementation and clinical impact [ ]. As discussed above, some reasons for this chasm include the disconnect between building good predictive models for the broader population and making inferences about individual patients [ ]. Bayesian procedures offer a potential solution to link inferences and predictions [ ]. Other, simpler issues to address include the lack of expertise needed to implement tools into clinical practice [ ], poor data quality compromising the reliability and accuracy of models [ - ], and a lack of standardization [ ].
During the next decade, achieving clinically actionable predictions in psychiatry will require a stepwise development and implementation approach [ ], similar to the successful methods used in other medical fields, such as predicting and preventing cardiovascular disease [ ]. The first step will be to identify individual digital measures of objective behaviors and subjective mental states (digital markers) that can be integrated with sociodemographic data, clinical characteristics, and biomarkers to create multimodal signatures that predict clinical outcomes (akin to risk scores in other medical fields). These multimodal signatures will need to be reliably and accurately associated with individual clinical states and trajectories. Validation will require cohort studies with adequate sample sizes and sufficient duration to generate enough analyzable clinical events [ ]. We foresee that different multimodal signatures will be used for detection (diagnosis) and prediction (prognosis). Once the reliability, specificity, and sensitivity of these multimodal signatures are established, prospective randomized clinical trials (RCTs) with adequate sample sizes, complemented by real-world observational studies, will need to demonstrate that they can be used to tailor the treatment of individual patients and improve outcomes. Albeit costly, these RCTs are needed to fulfill the promise of clinically actionable predictions leading to individualized, timely treatment. We are also mindful of a recent review [ ] that emphasized methodological challenges in RCTs investigating smartphone-based treatment interventions for mental disorders, including lack of trial registrations, inappropriate comparators, lack of blinding, selection bias, and lack of generalizability.
In parallel, the incorporation of digital phenotyping, first in RCTs and later in clinical practice, will require addressing complex ethical issues raised by the intense monitoring of behavior and mental state, which some people may consider too invasive regardless of its potential benefits [ ]. This work can be informed by lessons learned from other fields [ ]. To prepare for the deployment of the new decision-making tools we foresee and to understand their potential and pitfalls [ ], we will need to start training in medical schools and continue training throughout our professional lives [ ].
In conclusion, we believe that technological advances, in the context of a more holistic approach that considers all determinants of health, will allow us to create individual multimodal signatures for early detection and personalized intervention for mental disorders. However, this potential transformation in psychiatry will require another decade of investment and effort to become a reality.
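As a worked illustration of the base-rate arithmetic behind the suicide-risk and biomarker examples above, the following minimal Python sketch converts a relative risk into an absolute risk and computes a positive predictive value. The function names are hypothetical, the 70% sensitivity and specificity are one reading of "a quarter to a third" misclassified, and the 5% prevalence is an assumption chosen only for illustration; none of these come from a validated model.

```python
import math


def absolute_risk(baseline_risk: float, relative_risk: float) -> float:
    """Absolute risk implied by a baseline risk and a relative risk."""
    return baseline_risk * relative_risk


def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Share of positive test results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures from the text: 10/100,000 baseline risk and a relative risk of 100.
risk = absolute_risk(10 / 100_000, 100)
print(f"1-year absolute risk: {risk:.1%}")
print(f"Patients hospitalized per event prevented: {round(1 / risk)}")

# Hypothetical biomarker: about 30% of patients and 30% of controls are
# misclassified; the 5% prevalence is an illustrative assumption.
ppv = positive_predictive_value(sensitivity=0.70, specificity=0.70,
                                prevalence=0.05)
print(f"Positive predictive value: {ppv:.1%}")
```

Under these assumptions, roughly 9 of 10 positive biomarker results would be false positives, which is one way to see why significant group differences do not translate into actionable predictions for an individual patient.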
Conflicts of Interest
None declared.
References
- Dickstein DP. Editorial: it's difficult to make predictions, especially about the future: risk calculators come of age in child psychiatry. J Am Acad Child Adolesc Psychiatry. Aug 2021;60(8):950-951.
- Troisi A. Biological psychiatry is dead, long live biological psychiatry! Clin Neuropsychiatry. Dec 2022;19(6):351-354.
- Cipriani A, Furukawa TA, Salanti G, Chaimani A, Atkinson LZ, Ogawa Y, et al. Comparative efficacy and acceptability of 21 antidepressant drugs for the acute treatment of adults with major depressive disorder: a systematic review and network meta-analysis. Lancet. Dec 07, 2018;391(10128):1357-1366.
- Rush AJ. STAR*D: what have we learned? Am J Psychiatry. Mar 2007;164(2):201-204.
- Huhn M, Nikolakopoulou A, Schneider-Thoma J, Krause M, Samara M, Peter N, et al. Comparative efficacy and tolerability of 32 oral antipsychotics for the acute treatment of adults with multi-episode schizophrenia: a systematic review and network meta-analysis. Lancet. Sep 14, 2019;394(10202):939-951.
- Gaynes BN, Warden D, Trivedi MH, Wisniewski SR, Fava M, Rush AJ. What did STAR*D teach us? Results from a large-scale, practical, clinical trial for patients with depression. Psychiatr Serv. Nov 2009;60(11):1439-1445.
- Kirkbride JB, Jackson D, Perez J, Fowler D, Winton F, Coid JW, et al. A population-level prediction tool for the incidence of first-episode psychosis: translational epidemiology based on cross-sectional data. BMJ Open. 2013;3(2):e001998.
- Moriarty AS, Meader N, Snell KI, Riley RD, Paton LW, Chew-Graham CA, et al. Prognostic models for predicting relapse or recurrence of major depressive disorder in adults. Cochrane Database Syst Rev. May 06, 2021;5(5):CD013491.
- Franklin JC, Ribeiro JD, Fox KR, Bentley KH, Kleiman EM, Huang X, et al. Risk factors for suicidal thoughts and behaviors: a meta-analysis of 50 years of research. Psychol Bull. Mar 2017;143(2):187-232.
- Walter H. The third wave of biological psychiatry. Front Psychol. 2013;4:582.
- Blumberger DM, Daskalakis ZJ, Mulsant BH. Biomarkers in geriatric psychiatry: searching for the holy grail? Curr Opin Psychiatry. Nov 2008;21(6):533-539.
- No Author listed. The right treatment for each patient: unlocking the potential of personalized psychiatry. Nat Mental Health. Sep 06, 2023;1(9):607-608.
- Hoy N, Lynch SJ, Waszczuk MA, Reppermund S, Mewton L. Transdiagnostic biomarkers of mental illness across the lifespan: a systematic review examining the genetic and neural correlates of latent transdiagnostic dimensions of psychopathology in the general population. Neurosci Biobehav Rev. Dec 2023;155:105431.
- Ioannidis JPA, Panagiotou OA. Comparison of effect sizes associated with biomarkers reported in highly cited individual articles and in subsequent meta-analyses. JAMA. Jun 01, 2011;305(21):2200-2210.
- Rost N, Dwyer DB, Gaffron S, Rechberger S, Maier D, Binder EB, et al. Multimodal predictions of treatment outcome in major depression: a comparison of data-driven predictors with importance ratings by clinicians. J Affect Disord. Apr 14, 2023;327:330-339.
- Sajjadian M, Uher R, Ho K, Hassel S, Milev R, Frey BN, et al. Prediction of depression treatment outcome from multimodal data: a CAN-BIND-1 report. Psychol Med. Sep 2023;53(12):5374-5384.
- Winter NR, Leenings R, Ernsting J, Sarink K, Fisch L, Emden D, et al. Quantifying deviations of brain structure and function in major depressive disorder across neuroimaging modalities. JAMA Psychiatry. Sep 01, 2022;79(9):879-888.
- Winter NR, Blanke J, Leenings R, Ernsting J, Fisch L, Sarink K, et al. A systematic evaluation of machine learning-based biomarkers for major depressive disorder. JAMA Psychiatry. Apr 01, 2024;81(4):386-395.
- Zierer C, Behrendt C, Lepach-Engelhardt AC. Digital biomarkers in depression: a systematic review and call for standardization and harmonization of feature engineering. J Affect Disord. Apr 05, 2024;356:438-449.
- Smith RA, Andrews KS, Brooks D, Fedewa SA, Manassaram-Baptiste D, Saslow D, et al. Cancer screening in the United States, 2017: a review of current American Cancer Society guidelines and current issues in cancer screening. CA Cancer J Clin. Mar 2017;67(2):100-121.
- Benjamin EJ, Virani SS, Callaway CW, Chamberlain AM, Chang AR, Cheng S, et al. Heart disease and stroke statistics-2018 update: a report from the American Heart Association. Circulation. Mar 20, 2018;137(12):e67-e492.
- Mulsant BH. A neural network as an approach to clinical diagnosis. MD Comput. 1990;7(1):25-36.
- Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature. Oct 1986;323(6088):533-536.
- Weinberger DR, Glick ID, Klein DF. Whither Research Domain Criteria (RDoC)?: The good, the bad, and the ugly. JAMA Psychiatry. Dec 2015;72(12):1161-1162.
- Torous J, Kiang MV, Lorme J, Onnela J. New tools for new research in psychiatry: a scalable and customizable platform to empower data driven smartphone research. JMIR Ment Health. 2016;3(2):e16.
- Onnela J, Rauch SL. Harnessing smartphone-based digital phenotyping to enhance behavioral and mental health. Neuropsychopharmacology. Dec 2016;41(7):1691-1696.
- Ortiz A, Bradler K, Hintze A. Episode forecasting in bipolar disorder: is energy better than mood? Bipolar Disord. Jan 22, 2018;20(5):470-476.
- Kessler RC, van Loo HM, Wardenaar KJ, Bossarte RM, Brenner LA, Cai T, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. Oct 2016;21(10):1366-1371.
- Hahn T, Kircher T, Straube B, Wittchen H, Konrad C, Ströhle A, et al. Predicting treatment response to cognitive behavioral therapy in panic disorder with agoraphobia by integrating local neural information. JAMA Psychiatry. Jan 2015;72(1):68-74.
- Khodayari-Rostamabad A, Hasey G, Maccrimmon DJ, Reilly J, de Bruin H. A pilot study to determine whether machine learning methodologies using pre-treatment electroencephalography can predict the symptomatic response to clozapine therapy. Clin Neurophysiol. Dec 2010;121(12):1998-2006.
- Goldstein BA, Navar AM, Pencina MJ, Ioannidis JPA. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review. J Am Med Inform Assoc. Jan 2017;24(1):198-208.
- Crowley RJ, Tan YJ, Ioannidis JPA. Empirical assessment of bias in machine learning diagnostic test accuracy studies. J Am Med Inform Assoc. Jul 01, 2020;27(7):1092-1101.
- Riley RD, Collins GS. Stability of clinical prediction models developed using statistical or machine learning methods. Biom J. Dec 2023;65(8):e2200302.
- Park SH, Han K. Methodologic guide for evaluating clinical performance and effect of artificial intelligence technology for medical diagnosis and prediction. Radiology. Dec 2018;286(3):800-809.
- Joyce DW, Kormilitzin A, Smith KA, Cipriani A. Explainable artificial intelligence for mental health through transparency and interpretability for understandability. NPJ Digit Med. Jan 18, 2023;6(1):6.
- Fedor S, Lewis R, Pedrelli P, Mischoulon D, Curtiss J, Picard RW. Wearable technology in clinical practice for depressive disorder. N Engl J Med. Dec 28, 2023;389(26):2457-2466.
- Faurholt-Jepsen M, Frost M, Christensen EM, Bardram JE, Vinberg M, Kessing LV. The effect of smartphone-based monitoring on illness activity in bipolar disorder: the MONARCA II randomized controlled single-blinded trial. Psychol Med. Apr 04, 2020;50(5):838-848.
- Vasey B, Ursprung S, Beddoe B, Taylor EH, Marlow N, Bilbro N, et al. Association of clinician diagnostic performance with machine learning-based decision support systems: a systematic review. JAMA Netw Open. Mar 01, 2021;4(3):e211276.
- Freeman K, Geppert J, Stinton C, Todkill D, Johnson S, Clarke A, et al. Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy. BMJ. Sep 01, 2021;374:n1872.
- Tønning ML, Faurholt-Jepsen M, Frost M, Martiny K, Tuxen N, Rosenberg N, et al. The effect of smartphone-based monitoring and treatment on the rate and duration of psychiatric readmission in patients with unipolar depressive disorder: the RADMIS randomized controlled trial. J Affect Disord. Mar 01, 2021;282:354-363.
- Keane PA, Topol EJ. With an eye to AI and autonomous diagnosis. NPJ Digit Med. 2018;1:40.
- Bzdok D, Ioannidis JP. Exploration, inference, and prediction in neuroscience and biomedicine. Trends Neurosci. Apr 2019;42(4):251-262.
- Hunter DJ, Holmes C. Where medical statistics meets artificial intelligence. N Engl J Med. Sep 28, 2023;389(13):1211-1219.
- Pinsky MR, Bedoya A, Bihorac A, Celi L, Churpek M, Economou-Zavlanos NJ, et al. Use of artificial intelligence in critical care: opportunities and obstacles. Crit Care. Apr 08, 2024;28(1):113.
- Cho S, Ensari I, Weng C, Kahn MG, Natarajan K. Factors affecting the quality of person-generated wearable device data and associated challenges: rapid systematic review. JMIR Mhealth Uhealth. Mar 19, 2021;9(3):e20738.
- Canali S, Schiaffonati V, Aliverti A. Challenges and recommendations for wearable devices in digital health: data quality, interoperability, health equity, fairness. PLOS Digit Health. Oct 2022;1(10):e0000104.
- Teno JM. Garbage in, garbage out-words of caution on big data and machine learning in medical practice. JAMA Health Forum. Mar 03, 2023;4(2):e230397.
- Khan SS, Matsushita K, Sang Y, Ballew SH, Grams ME, Surapaneni A, et al. Development and validation of the American Heart Association’s PREVENT equations. Circulation. Feb 06, 2024;149(6):430-449.
- Firth J, Torous J, Nicholas J, Carney R, Pratap A, Rosenbaum S, et al. The efficacy of smartphone-based mental health interventions for depressive symptoms: a meta-analysis of randomized controlled trials. World Psychiatry. Oct 2017;16(3):287-298.
- Tønning ML, Kessing LV, Bardram JE, Faurholt-Jepsen M. Methodological challenges in randomized controlled trials on smartphone-based treatment in psychiatry: systematic review. J Med Internet Res. Oct 27, 2019;21(10):e15362.
- Martinez-Martin N, Insel TR, Dagum P, Greely HT, Cho MK. Data mining for health: staking out the ethical territory of digital phenotyping. NPJ Digit Med. Dec 19, 2018;1(1):68.
- Cohen C. Ethical issues in mandatory drug testing. In: Rosner R, Weinstock R, editors. Ethical Practice in Psychiatry and the Law. Boston, MA. Springer; 1990:313-325.
- Alowais SA, Alghamdi SS, Alsuhebany N, Alqahtani T, Alshaya AI, Almohareb SN, et al. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med Educ. Sep 22, 2023;23(1):689.
- Crigger E, Reinbold K, Hanson C, Kao A, Blake K, Irons M. Trustworthy augmented intelligence in health care. J Med Syst. Jan 12, 2022;46(2):12.
Abbreviations
AI: artificial intelligence
NIMH: National Institute of Mental Health
RCT: randomized clinical trial
Edited by G Eysenbach; submitted 23.04.24; peer-reviewed by K Cho, R Philippe; comments to author 02.07.24; revised version received 09.07.24; accepted 16.07.24; published 05.08.24.
Copyright © Abigail Ortiz, Benoit H Mulsant. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 05.08.2024.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.