Review
Abstract
Background: Assessment of cognitive decline in the earliest stages of Alzheimer disease (AD) is important but challenging. AD is a neurodegenerative disease characterized by gradual cognitive decline. Disease stages range from preclinical AD, in which individuals are cognitively unimpaired, to mild cognitive impairment (MCI) and dementia. Digital technologies promise to enable detection of early, subtle cognitive changes. Although the field of digital cognitive biomarkers is rapidly evolving, a comprehensive overview of the reporting of psychometric properties (ie, validity, reliability, responsiveness, and clinical meaningfulness) is missing. Insight into the extent to which these properties are evaluated is needed to identify the validation steps toward implementation.
Objective: This scoping review aimed to identify the reporting on quality characteristics of smartphone- and tablet-based cognitive tools with potential for remote administration in individuals with preclinical AD or MCI. We focused on both psychometric properties and practical tool characteristics.
Methods: This scoping review was conducted following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. In total, 4 databases (PubMed, Embase, Web of Science, and PsycINFO) were systematically searched from January 1, 2008, to January 5, 2023. Studies were included that assessed the psychometric properties of cognitive smartphone- or tablet-based tools with potential for remote administration in individuals with preclinical AD or MCI. In total, 2 reviewers independently screened titles and abstracts in ASReview, a screening tool that combines manual and automatic screening using an active learning algorithm. Thereafter, we manually screened full texts in the web application Rayyan. For each included study, 2 reviewers independently explored the reported information on practical and psychometric properties. For each psychometric property, examples were provided narratively.
Results: In total, 11,300 deduplicated studies were identified in the search. After screening, 50 studies describing 37 different digital tools were included in this review. Average administration time was 13.8 (SD 10.1; range 1-32) minutes, but for 38% (14/37) of the tools, this was not described. Most tools (31/37, 84%) were examined in 1 language. The investigated populations were mainly individuals with MCI (34/37, 92%), and fewer tools were examined in individuals with preclinical AD (8/37, 22%). For almost all tools (36/37, 97%), construct validity was assessed through evaluation of clinical or biological associations or relevant group differences. Information on structural validity (3/37, 8%), test-retest reliability (12/37, 32%), and responsiveness (6/37, 16%) was reported for only a small number of tools, and clinical meaningfulness was not reported for any tool.
Conclusions: Numerous smartphone- and tablet-based tools to assess cognition in early AD are being developed, whereas studies concerning their psychometric properties are limited. Often, initial validation steps have been taken, yet further validation and careful selection of psychometrically valid outcome scores are required to demonstrate clinical usefulness with regard to the context of use, which is essential for implementation.
doi:10.2196/65297
Introduction
Background
Alzheimer disease (AD) is a neurodegenerative disease associated with gradual decline in cognition, where disease stages range from preclinical AD, in which cognitive decline is absent or only subtle, to mild cognitive impairment (MCI) and dementia [
, ]. Biologically, the disease is defined by pathological changes in amyloid accumulation and neurofibrillary tau protein tangles that appear years before the onset of cognitive symptoms [ - ]. However, it has been widely shown that, even in the preclinical stage, cognition does decline subtly [ - ]. With the increasing number of older adults, it is expected that AD will become a major societal problem, estimated to affect 131 million people in 2050 worldwide [ , ]. The emergence of AD biomarkers allows for the recognition of the disease in the preclinical stage, which has opened a window of opportunity for treatment studies, as interventions are likely to be most beneficial in early stages of the disease. Accordingly, disease-modifying treatments and nonpharmaceutical prevention trials have increasingly focused on preclinical stages of AD over the past years [ - ]. This shift toward the earliest AD stages highlights the need for new tools that provide outcome measures of cognition that are adequate for use in these early disease stages in the realm of early detection, diagnosis, disease monitoring, and evaluation of treatment effects [ - ]. In addition, to enable large-scale decentralized prevention, intervention, or disease-monitoring initiatives, tools are required that enable remote, time-efficient, and reliable assessment of cognition.

The current gold standard to assess cognition in AD is through neuropsychological testing, using paper-and-pencil tests that need to be supervised by a trained neuropsychologist. Importantly, such traditional cognitive tests are not suitable for the earliest AD stages as these tests often lack the sensitivity that is required to detect subtle cognitive changes [
, , ]. Thus, novel, stage-specific testing paradigms are needed, and digital tools hold promise to fill this gap [ , ]. A major advantage of digital cognitive tools is their suitability for unsupervised remote assessment, which enables highly frequent testing that may provide a more accurate reflection of cognition [ , ]. Given the intuitive person-device interaction of touch screen devices, smartphone- and tablet-based tools provide the optimal modality for unsupervised remote assessment [ , ]. Other advantages of such cognitive tools include reduced patient burden; time efficiency through automatic administration and scoring; high scalability; and the potential for rich data collection of precise measurements, including, for example, response times.

Over the last decade, numerous digital tools have been developed that have the potential for remote assessment of cognition in early AD stages [
]. However, the development of such tools has not yet led to large-scale implementation [ ]. To date, the measurement quality of digital tools to assess cognition is largely unknown, although it has widely been acknowledged that information on psychometric properties is important for the use and implementation of these tools [ , , - ]. For the evaluation of cognitive tests in early AD stages, a recommendation framework has been proposed based on the Consensus-Based Standards for the Selection of Health Measurement Instruments methodology [ , , ]. This framework highlights the importance of psychometric properties concerning structural and construct validity, responsiveness, and clinical meaningfulness. Hence, available information on psychometric properties is a prerequisite for the successful implementation of digital tools. Although previous reviews have described some validation aspects of digital cognitive assessments [ , ], a comprehensive overview of the reporting of crucial psychometric properties (ie, structural and construct validity, reliability, responsiveness, and clinical meaningfulness) of digital cognitive tools for use in preclinical AD or MCI is currently missing.
Objectives
This scoping review was conducted to identify the reporting on quality characteristics of smartphone- and tablet-based tools to assess cognition in individuals with preclinical AD or MCI. Our scope was limited to smartphone- and tablet-based cognitive tools with the potential for remote assessment. The primary focus was on the availability of information on psychometric properties, and the secondary focus was on the reporting of practical tool characteristics. The findings of this review will be used to formulate recommendations about future steps that are needed to facilitate the implementation of remote smartphone- and tablet-based tools to assess cognition.
Methods
Overview
This scoping review was conducted in line with a methodological framework for conducting scoping studies [
]. The reporting complies with the guidelines of the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist [ ], as shown in . The review was not registered, and no protocol was prepared.
Search Strategies
After several scoping searches, 4 bibliographic databases (PubMed, Embase [Elsevier], Web of Science Core Collection [Clarivate Analytics], and APA PsycINFO [EBSCO]) were searched for relevant literature from January 1, 2008 (ie, the year that the Google Play Store and Apple App Store, two of the biggest mobile app distribution platforms, were launched), to January 5, 2023. Searches were devised in collaboration with a medical information specialist (KAZ). Search terms including synonyms, closely related words, and keywords were used as free-text words (eg, “Alzheimer,” “digital,” and “cognition”). The search contained no methodological search filter that would limit results to specific study designs or languages. Duplicate studies were excluded using the R package ASYSD (R Foundation for Statistical Computing), an automated deduplication tool [
, ], which was followed by manual deduplication in EndNote (version X20.0.3; Clarivate Analytics). The full search strategy used for each database is detailed in .
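To illustrate the general idea behind automated deduplication of bibliographic records (a minimal sketch in Python, not the ASYSD implementation, which is an R package; the example records and the normalize helper are hypothetical), duplicate records can be flagged by matching normalized titles before any remaining duplicates are removed manually:

```python
# Minimal sketch of title-based deduplication of bibliographic records.
# Hypothetical data; real pipelines (eg, ASYSD) also compare authors, year, and DOI.
import re
import pandas as pd

records = pd.DataFrame({
    "title": [
        "A Digital Cognitive Test for Early Alzheimer Disease",
        "A digital cognitive test for early Alzheimer disease.",
        "Validation of a Tablet-Based Memory Task",
    ],
    "source": ["PubMed", "Embase", "Web of Science"],
})

def normalize(title: str) -> str:
    """Lowercase and strip non-alphanumeric characters so trivially differing titles match."""
    return re.sub(r"[^a-z0-9]", "", title.lower())

records["dedup_key"] = records["title"].map(normalize)
deduplicated = records.drop_duplicates(subset="dedup_key").drop(columns="dedup_key")
print(deduplicated)  # the near-duplicate second record is removed
```

Eligibility Criteria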
Published journal articles were included or excluded according to the criteria outlined in .
Inclusion criteria
- The following tool characteristics were met: (1) the digital tool aimed to measure cognition and (2) the tool was a performance-based measure.
- The study aim was to assess (one of the) psychometric properties as defined in the Consensus-Based Standards for the Selection of Health Measurement Instruments framework [ ] of a digital tool to assess cognition.
- The study sample comprised (1) individuals with preclinical Alzheimer disease (AD) or individuals with AD pathological change, defined by normal cognition but presence of abnormal amyloid and tau protein biomarkers or abnormal amyloid biomarkers, respectively (henceforth, preclinical AD) [ ]; or (2) individuals with mild cognitive impairment, which was diagnosed based on established clinical criteria, or it was specified that a clinical diagnosis was given in a memory clinic or (university) hospital or by a clinical or medical specialist.
- The following practical tool characteristics were met (to ensure potential for remote administration): (1) the digital tool was administered using a smartphone or tablet and (2) the digital tool was self-administered or had the potential to be self-administered, as indicated by the contextual information provided.
Exclusion criteria
- The studies involved self-reported diagnosis or classification solely based on AD risk factors (eg, Apolipoprotein E ε4 allele and family history of AD), or the investigated population only included individuals with dementia.
- The articles reported case studies, conference proceedings, conference abstracts, preprints, research protocols, qualitative studies, reviews, opinion papers, studies that used the digital tool as an outcome measurement instrument (eg, in randomized controlled trials), or studies that used the tool in a validation study of another instrument, or the articles were not fully available in English.
- The device of the digital tool was not specified.
- Additional equipment was required for using the digital tool (eg, stylus, digital pen, joystick, or virtual reality glasses).
- Self-administration of the digital tool was not possible, or the potential for self-administration was not indicated by contextual information.
- Automated scoring was not possible within the digital tool.
Screening Procedures
To minimize selection bias, 2 of the review authors (RLvdB and SMvdL) independently screened titles and abstracts based on the eligibility criteria (
) using ASReview (version 1.1) [ ] with default settings (naive Bayes, term frequency–inverse document frequency, and query strategy: max). Differences in judgment were resolved via a consensus procedure through discussion and, if necessary, by consulting a third reviewer (MJK). ASReview is an open-source screening tool using manual input in combination with an active learning algorithm to screen studies based on titles and abstracts. The algorithm was trained using 7 relevant and 7 irrelevant studies, where relevant studies were manually selected and irrelevant studies were randomly provided by the model and subsequently manually labeled as irrelevant. A data-driven stopping rule strategy was followed, which has been suggested previously and proposes to end the screening process after n consecutive irrelevant records are provided by the active learning algorithm [ ]. No consensus exists yet on the optimal n, and the number varies across studies (eg, 20-500) [ - ]. Importantly, it has previously been noted that the stopping rule should aim to find the trade-off between the costs of manually labeling additional records and the costs of erroneous exclusions of relevant records by the algorithm [ ]. We stopped the screening process after 100 consecutive irrelevant studies were provided by the ASReview algorithm. This number was based on an initial pilot screening round in which we observed that, with the stopping rule set to 50 consecutive irrelevant records, relevant records continued to appear after 45 to 49 records; the cutoff was therefore increased to 100 to include a higher certainty margin.

After screening of titles and abstracts, 4 of the review authors (RLvdB, SMvdL, MJK, and AMvG) conducted full-text screening according to the eligibility criteria ( ) using the web application Rayyan (Qatar Computing Research Institute) [ ]. Each study was screened independently by one of the reviewers. Uncertainties were resolved through discussion or, if necessary, by consulting a fifth reviewer.
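As a minimal illustration of the data-driven stopping rule described above (a hedged sketch; the should_stop_screening function and its threshold parameter are ours for illustration and are not part of ASReview), screening ends once the most recent consecutive records proposed by the algorithm have all been labeled irrelevant:

```python
# Minimal sketch of the consecutive-irrelevant-records stopping rule used during screening.
def should_stop_screening(labels, threshold=100):
    """labels: reviewer decisions in screening order (True = relevant, False = irrelevant).

    Returns True once the last `threshold` decisions are all irrelevant.
    """
    if len(labels) < threshold:
        return False
    return not any(labels[-threshold:])

# Example: screening continues while relevant records keep appearing and stops only
# after 100 consecutive irrelevant labels.
decisions = [True, False, True] + [False] * 100
print(should_stop_screening(decisions))  # True
```

Data Extraction and Data Synthesis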
Overview
Two review authors (RLvdB and MJK) independently extracted data from the final set of included studies such that each study was assessed by 2 authors. Uncertainties were discussed, and if needed, a third author was consulted (SMvdL, AMvG, MvD, SAMS, and CdB). Data were extracted about practical tool characteristics, study characteristics, and psychometric properties. Practical tool characteristics and psychometric properties were extracted from the included studies and summarized at the tool level. Summarized information was calculated based on available data, and missing data were excluded. Study characteristics were extracted and reported at the study level. In case of missing information, this was reported as such. Examples of practical characteristics and psychometric properties were provided narratively.
Practical tool characteristics concerned the cognitive domains assessed (single or multiple), active or passive task completion (ie, evaluating performance during an assessment vs assessing performance in everyday life activities, respectively), administration time, testing design (ie, single or repeated assessment), and language in which the tool was examined in the included studies. Study characteristics included the study population examined, total sample size, and investigative setting (on-site, eg, in the clinic, university, or day or community center; or remote, eg, in the at-home environment). In addition, we extracted information on the psychometric properties of the included digital tools, which we selected based on recommendations on how cognition should be measured as formulated in the “recommendation framework for evaluation of performance-based cognitive tests” [
], which addresses construct validity and interpretability and is based on the Consensus-Based Standards for the Selection of Health Measurement Instruments methodology. The psychometric properties assessed in this review were (1) structural validity (factor analysis), (2) construct validity (clinical and biological associations and relevant clinical group differences), (3) test-retest reliability and responsiveness, and (4) interpretability (clinical meaningfulness), which are defined and explained in more detail in the following sections. Specifically, for each digital tool, we evaluated whether information on each of these 4 psychometric properties was reported (✓) or not available in the included studies. The ratings of ✓ and “not available” were assigned regardless of the method used or the quality of the reported psychometric property.
Structural Validity
Structural validity refers to the degree to which scores are an adequate reflection of the dimensionality that is measured (ie, underlying structure of one or multiple cognitive domains) [
, ]. As such, it is recommended to conduct a factor analysis to demonstrate adequate use of outcome scores [ ]. The digital tool was given a rating of ✓ if a factor analysis was conducted.
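As a minimal sketch of such an analysis (illustrative only and not taken from any included study; the simulated subtest scores and the assumed two-factor structure are ours), an exploratory factor analysis can be run on a matrix of digital outcome scores to check whether the loadings support the intended grouping into cognitive domains:

```python
# Minimal sketch: exploratory factor analysis on simulated digital subtest scores.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 200
memory = rng.normal(size=n)   # latent memory ability
speed = rng.normal(size=n)    # latent processing speed
scores = np.column_stack([
    memory + rng.normal(scale=0.5, size=n),  # word list recall
    memory + rng.normal(scale=0.5, size=n),  # face-name recall
    memory + rng.normal(scale=0.5, size=n),  # story recall
    speed + rng.normal(scale=0.5, size=n),   # symbol matching
    speed + rng.normal(scale=0.5, size=n),   # trail making (reversed)
    speed + rng.normal(scale=0.5, size=n),   # reaction time (reversed)
])

fa = FactorAnalysis(n_components=2, rotation="varimax")
fa.fit(scores)
print(np.round(fa.components_, 2))  # loadings: each row is a factor, each column a subtest
```

In the included studies, both exploratory and confirmatory models were used for this purpose; the key point is that the loadings should support the intended grouping of outcome scores into cognitive domains.

Construct Validity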
Construct validity refers to the extent to which outcome scores are in line with hypotheses regarding, for example, relationships to scores on other instruments or differences between relevant groups [
]. For clinical outcome assessments in early AD, it is recommended that test scores are validated against relevant clinical or biological measures [ ]. We defined three subcategories for construct validity: (1) clinical association, (2) biological association, and (3) relevant group differences. The tool was assigned a rating of ✓ for clinical associations if correlations were evaluated between the digital outcome measures and traditional measures of global cognition or specific cognitive domains. The tool was given a rating of ✓ for biological associations if correlations with AD biomarkers (eg, β-amyloid and tau protein deposition) or markers of neurodegeneration (eg, measures of cortical thickness), or differences between AD biomarker groups on the digital outcome measure, were reported. The tool was given a rating of ✓ for relevant group differences if differences between groups with different clinical status (eg, cognitively unimpaired [CU], MCI, and AD dementia) were reported.
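The following sketch (hypothetical data, not from any included study) illustrates the two most common construct validity checks encountered in this review: correlating a digital score with a traditional clinical measure and comparing groups with different clinical status:

```python
# Minimal sketch of construct validity checks on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Clinical association: correlation between a digital memory score and a traditional test score.
digital_score = rng.normal(loc=50, scale=10, size=120)
traditional_score = 0.6 * digital_score + rng.normal(scale=8, size=120)
r, p = stats.pearsonr(digital_score, traditional_score)
print(f"clinical association: r = {r:.2f}, p = {p:.3f}")

# Relevant group differences: cognitively unimpaired vs mild cognitive impairment.
cu = rng.normal(loc=52, scale=8, size=60)
mci = rng.normal(loc=45, scale=8, size=60)
t, p_group = stats.ttest_ind(cu, mci)
print(f"group difference: t = {t:.2f}, p = {p_group:.3f}")
```

Reliability and Responsiveness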
Test-retest reliability refers to the extent to which cognitive performance measured using a digital tool is consistent for the same patient over time [
, ]. Assuming that the patient has not changed, no differences in scores are expected for the same patient under the same conditions within a short time frame. The tool was assigned a rating of ✓ if test-retest reliability was reported.

Responsiveness refers to the sensitivity of a digital tool to detect change in cognition over time [
, ]. As such, a requirement for digital tools that aim to measure cognitive change is that they are sensitive to it. Hence, responsiveness is an essential property of digital tools, and a validation study should be conducted on the ability to capture cognitive change in the target population [ ]. Tools were assigned a rating of ✓ if responsiveness of the digital tool to cognitive change over time was reported.
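As a minimal sketch of how these two properties can be quantified (simulated data and arbitrary parameter values, not taken from any included study), test-retest agreement can be estimated from two assessments within a short interval, and responsiveness from the slope of a linear mixed-effects model over repeated assessments:

```python
# Minimal sketch: test-retest reliability and responsiveness on simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(2)
n = 80
ability = rng.normal(size=n)  # latent cognitive ability per participant

# Test-retest reliability: two assessments under comparable conditions.
test = ability + rng.normal(scale=0.4, size=n)
retest = ability + rng.normal(scale=0.4, size=n)
r, _ = stats.pearsonr(test, retest)
print(f"test-retest correlation: r = {r:.2f}")

# Responsiveness: change in the digital score over 3 assessments (0, 6, and 12 months).
long = pd.DataFrame({
    "subject": np.repeat(np.arange(n), 3),
    "months": np.tile([0.0, 6.0, 12.0], n),
})
long["score"] = np.repeat(ability, 3) - 0.05 * long["months"] + rng.normal(scale=0.3, size=len(long))
fit = smf.mixedlm("score ~ months", data=long, groups=long["subject"]).fit()
print(f"estimated change per month: {fit.params['months']:.3f}")
```

Interpretability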
Clinical meaningfulness is not considered a psychometric property but, rather, an important tool characteristic for interpretability of a (change in) score [
]. Clinical meaningfulness refers to a score (or its change) that can be interpreted as clinically relevant such that a qualitative meaning is assigned to the quantitative score [ , ]. For outcome measures in early AD, it is recommended that the scores that patients and caregivers perceive as clinically meaningful be examined [ ]. We assigned the tool a rating of ✓ if the included studies specified what (change in) score was considered clinically relevant. The tool was assigned a rating of “not available” if no such score was specified or if the term “clinical meaningfulness” was used but no defined meaningful score was provided.
Results
Search and Screening Results
The systematic literature search generated a total of 23,404 references: 5726 (24.47%) in PubMed, 8069 (34.48%; n=11,322, 48.38% with conference abstracts) in Embase, 7675 (32.79%) in Web of Science Core Collection, and 1934 (8.26%) in APA PsycINFO. After removing duplicates, of the 23,404 references, 11,300 (48.28%) remained. These references were screened based on title and abstract in ASReview. Of the 11,300 abstracts, the first reviewer (RLvdB) manually included 233 (2.06%) relevant ones and excluded 1465 (12.96%) irrelevant ones, and the second reviewer (SMvdL) manually included 234 (2.07%) relevant ones and excluded 1062 (9.4%) irrelevant ones. Within these 2 sets, there was agreement for 33.4% (156/467) of the included abstracts and 38.3% (969/2527) of the excluded abstracts, whereas 1.27% (144/11,300) of the abstracts were labeled differently (ie, included by one reviewer and excluded by the other) such that, within the set of studies that were manually labeled by both reviewers, there was 88% agreement. After reaching consensus, 233 relevant studies were identified and included for further full-text screening. After full-text screening, of the 233 studies, 50 (21.5%) were included for data extraction. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart [
] of the search and selection process is presented in .
Tool and Study Characteristics
Practical Tool Characteristics
In total, 50 studies were included that comprised 37 tools. Although a binary distinction between one-to-one digitalized versions of paper-and-pencil tests and tools that are only partly based on existing traditional paper-and-pencil tests is difficult to make, the digital tools clearly used a variety of approaches. One-to-one digitalized versions of widely used traditional paper-and-pencil tests included, for example, a digital trail making task (digital Trail Making Test–Black and White) [
], digitalized word list recall (ReVeRe word list recall) [ ], a computerized associative learning task (paired associate learning [PAL]) [ ], and an automated story recall task (ASRT) [ , ]. The digitalization of these existing paper-and-pencil tests allowed for new variables, such as movement and reaction time measures. Other tools comprised batteries of multiple digital tests that were based on existing neuropsychological tasks (eg, the Boston Remote Assessment for Neurocognitive Health [BRANCH] [ ], cCOG [ ], and Electronic Cognitive Screen [ ]). Other approaches involved, for example, tapping as rapidly as possible or tapping following a presented sound rhythm (JustTouch [ ]), identifying nonidentical stimuli between 2 presented blocks (the tablet-based cancellation test e-CT [ ]), placing virtual objects and recalling their location (Altoida [ - ]), or remembering names and occupations associated with faces (FACEmemory [ , ]). One tool [ , ] took an “off-the-shelf” approach, evaluating performance on the existing digitalized game Klondike Solitaire. Other methods involved tasks simulating everyday life activities, such as dosing pills in a virtual pill box (Simulation-Based Assessment of Cognition [SIMBAC] [ , ]) or grocery shopping in a virtual supermarket (Virtual Supermarket Test [VST] [ - ]). Moreover, approaches included the measurement of everyday communication skills, such as typing (nQ Medical [ ] and Type of Mood [ ]) and speech production (ki:e speech biomarker for cognition [SB-C] [ ], Winterlight Assessment [WLA] [ ], and a “tablet-based automatic assessment using speech data” [ ]).

Information about practical tool characteristics is shown in
. Most tools (26/37, 70%) were reported to assess multiple cognitive domains. The cognitive domains assessed varied across tools and included, for example, attention, episodic memory, executive functioning, inhibition, orientation, language, implicit learning, processing speed, or working memory [ , - ]. A total of 30% (11/37) of the tools assessed 1 specific domain. Of those 11 tools, 6 (55%) assessed memory (BRANCH [ ], PAL [ ], Episodix [ ], FACEmemory [ , ], Learning Over Repeated Exposures [LORE] [ ], and ReVeRe word list recall [ ]). In total, 8% (3/37) of the tools assessed language (speech; WLA [ ], ki:e SB-C [ ], and the “tablet-based automatic assessment using speech data” [ ]). A total of 3% (1/37) of the tools measured executive functioning (e-CT [ ]), and 3% (1/37) measured motor function (JustTouch [ ]).

All but 1 digital tool (36/37, 97%) were active cognitive tasks. The passive tool (Type of Mood [
, ]) measured everyday typing performance. However, it should be noted that the active-passive dichotomy is better viewed as a continuum where, for instance, tasks that are not commonly performed in everyday life, such as the e-CT [ ] (canceling nonmatching symbols) or JustTouch [ ] (rapidly tapping), might be considered more on the active side. On the other hand, tools analyzing actively elicited speech (eg, WLA [ ], ki:e SB-C [ ], and the “tablet-based automatic assessment using speech data” [ ]) or typing a copied text or simulated text conversation (nQ Medical app [ ]) might be considered more passive testing because such activities may be argued to interfere with everyday activities to a lesser extent.
| Tool | Cognitive domain | Passive or active | Administration time | Testing design | Language examined in the included studies |
| --- | --- | --- | --- | --- | --- |
| Altoida [ - ] | Multiple | Active | 8 min | Varied across the studies (single or repeated) | English, French, Spanish, Greek, German, and Italian |
| ARCa [ ] | Multiple | Active | —b | Repeated | English |
| ASRTc [ , ] | Multiple | Active | 3-5 min | Repeated | English |
| BRANCHd [ ] | Memory (episodic, associative, and visual) | Active | 22 min | Single | English |
| cCOG [ ] | Multiple | Active | 20 min | Single | English, Finnish, Danish, Dutch, and Italian |
| CompCog [ ] | Multiple | Active | — | — | Portuguese |
| CAMCIe [ ] | Multiple | Active | 20 min | Single | English |
| C3f [ , ] | Multiple | Active | 25-30 min | Single | English |
| dTMT-B&Wg [ ] | Multiple | Active | — | Single | Korean |
| EC-Screenh [ ] | Multiple | Active | 5 min | Single | Chinese |
| Episodix [ ] | Memory (episodic) | Active | — | — | Spanish |
| Face-name associative memory exam (FACEmemory) [ , ] | Memory (episodic) | Active | 30 min | Single | Spanish |
| GSCTi [ ] | Multiple | Active | 16-48 min | Single | Swedish |
| Inbrain CSTj [ ] | Multiple | Active | 30 min | Single | Korean |
| ICAk [ ] | Multiple | Active | 5 min | Single | Farsi and English |
| JustTouch [ ] | Motor function | Active | 1 min | Single | Japanese |
| ki:e SB-Cl [ ] | Language (speech) | Active | — | — | Dutch and English |
| Klondike Solitaire [ , ] | Multiple | Active | — | — | Dutch (Belgian) |
| LOREm [ ] | Memory (associative) | Active | — | Repeated | English |
| Miro Health mobile assessment platform [ ] | Multiple | Active | — | — | English |
| M2C2n [ ] | Multiple | Active | 4-5 min | Repeated | English |
| mSTS-MCIo [ ] | Multiple | Active | 15 min | Single | Korean |
| NNCTp [ ] | Multiple | Active | 10 min | Single | — |
| NeuroUX [ ] | Multiple | Active | 11 min | Repeated | English |
| NIHTB-CBq [ ] | Multiple | Active | — (“brief”) | — | English |
| nQ Medical mobile phone app [ ] | Multiple (via keystroke dynamics) | Active | — | — | English |
| PALr task [ ] | Memory (episodic) | Active | — (“brief”) | Single | English |
| ReVeRe WLRs [ ] | Memory (episodic) | Active | — | — | English |
| SIMBACt [ ] | Multiple | Active | 10 min | Single | English |
| SMARTu [ ] | Multiple | Active | 5 min | Single | English |
| Tablet-based cancellation test (e-CT) [ ] | Executive functioning | Active | 2 min | Single | French |
| Type of Mood mobile app [ ] | Multiple | Passive | — | Repeated | Greek |
| UCSF-BHAv (different versions) [ - ] | Multiple | Active | 10 min | Single | English and Spanish |
| VSTw (different versions) [ - ] | Multiple | Active | 25-30 min | Single | Greek and Turkish |
| WLAx [ ] | Language (speech) | Active | 5-10 min | Single | English |
| “A tablet-based cognitive assessment” [ ] | Multiple | Active | 10 min | Single | Taiwanese |
| “Tablet-based automatic assessment using speech data” [ ] | Language (speech) | Active | — | — | Japanese |
aARC: Ambulatory Research in Cognition.
bNot available (ie, not described in the study).
cASRT: automated story recall task.
dBRANCH: Boston Remote Assessment for Neurocognitive Health.
eCAMCI: Computer Assessment of Mild Cognitive Impairment.
fC3: Computerized Cognitive Composite.
gdTMT-B&W: digital Trail Making Test–Black and White.
hEC-Screen: Electronic Cognitive Screen.
iGSCT: Geras Solutions cognitive test.
jCST: Cognitive Screening Test.
kICA: Integrated Cognitive Assessment.
lSB-C: speech biomarker for cognition.
mLORE: Learning Over Repeated Exposures.
nM2C2: Mobile Monitoring of Cognitive Change.
omSTS-MCI: mobile screening test system for mild cognitive impairment.
pNNCT: Natural and Artificial Intelligence Health Assistant Neuro Cognitive Test.
qNIHTB-CB: National Institutes of Health Toolbox Cognition Battery.
rPAL: paired associate learning.
sWLR: word list recall.
tSIMBAC: Simulation-Based Assessment of Cognition.
uSMART: Survey for Memory, Attention, and Reaction Time.
vUCSF-BHA: University of California, San Francisco, Brain Health Assessment.
wVST: Virtual Supermarket Test.
xWLA: Winterlight Assessment.
Administration time was, on average, 13.8 (SD 10.1; range 1-32) minutes. The shortest tasks involved finger tapping; symbol cancellation; story recall; and a short battery of a symbol-matching, color-shape-memory, and location-memory task (JustTouch [
], e-CT [ ], ASRT [ , ], and Mobile Monitoring of Cognitive Change [ ]). The tools with the longest administration time covered multiple tasks assessing different cognitive domains (Geras Solutions cognitive test [ ] and Inbrain Cognitive Screening Test [CST] [ ]) or learning face-name-occupation associations and memorizing these after a 20-minute delay (FACEmemory [ , ]). For 38% (14/37) of the tools, the administration time was unspecified, and for 24% (9/37) of the tools, the testing design (ie, single or repeated testing) was not reported. Most tools (21/37, 57%) had a single-testing design, whereas 16% (6/37) were incorporated into a multiday protocol, and for 3% (1/37) of the tools (Altoida [ - ]), the testing design differed across studies. In the repeated-testing designs, (1) participants completed short cognitive tasks over several days (Altoida [ , ], Ambulatory Research in Cognition [ARC] [ ], ASRT [ , ], NeuroUX [ ], and Mobile Monitoring of Cognitive Change [ ]), (2) tasks were assessed monthly for 1 year (LORE [ ]), or (3) passive monitoring comprised a longer period (ie, 6 months; Type of Mood [ ]).

Most tools (31/37, 84%) were tested in a single language, which was primarily English. A total of 11% (4/37) of the tools were tested in 2 languages (ie, VST [
- ] in Greek and Turkish; University of California, San Francisco, Brain Health Assessment [UCSF-BHA] [ - ] in English and Spanish; ki:e SB-C [ ] in English and Dutch; and Integrated Cognitive Assessment [ ] in English and Farsi). In total, 3% (1/37) of the tools were assessed in 5 languages (ie, cCOG [ ] in English, Finnish, Danish, Dutch, and Italian), and 3% (1/37) were assessed in 6 languages (ie, Altoida [ - ] in English, French, Spanish, Greek, German, and Italian). For 3% (1/37) of the tools (Natural and Artificial Intelligence Health Assistant Neuro Cognitive Test [ ]), the language was not specified.
Study Characteristics
provides a summary of the characteristics of the included studies that examined a smartphone- or tablet-based cognitive tool in individuals with MCI or preclinical AD. Most tools (29/37, 78%) were assessed in a single study. A total of 14% (5/37) of the tools were each examined in 2 (4%) of the 50 studies (ASRT [ , ], Computerized Cognitive Composite [C3] [ , ], FACEmemory [ , ], Klondike Solitaire [ , ], and SIMBAC [ , ]). One tool (Altoida [ - ]) was assessed in 3 (6%) of the 50 included studies, and 5% (2/37) of the tools (VST [ - ] and UCSF-BHA [ - ]) were each examined in 4 (8%) of the 50 studies. Most of the included studies (36/50, 72%) had a cross-sectional design, whereas 28% (14/50) had a longitudinal design. These longitudinal studies assessed 32% (12/37) of the digital tools.
The mean total sample size in the included studies was 295.4 (SD 621.6), with a range of 12 to 4486 individuals. Most studies (41/50, 82%) examined the digital tool in individuals with MCI but not in individuals with preclinical AD, 10% (5/50) of the studies included both individuals with MCI and individuals with preclinical AD, and 8% (4/50) of the studies included individuals with preclinical AD but not individuals with MCI. Summarizing these studies at an individual tool level, most tools (29/37, 78%) were examined in individuals with MCI but not in individuals with preclinical AD. In total, 8% (3/37) of the digital tools (ARC [
], C3 [ , ], and LORE [ ]) were investigated in individuals with preclinical AD but not in individuals with MCI. A total of 14% (5/37) of the tools (ASRT [ , ], BRANCH [ ], FACEmemory [ , ], PAL [ ], and UCSF-BHA [ - ]) were examined in both individuals with preclinical AD and individuals with MCI. Thus, most tools were investigated in individuals with MCI (34/37, 92%), and fewer tools were examined in individuals with preclinical AD (8/37, 22%). A total of 54% (27/50) of the studies, covering 59% (22/37) of the tools, additionally included populations other than individuals with preclinical AD or MCI, such as individuals with dementia.

In 48% (24/50) of the studies, the digital tools were examined on-site. In 22% (11/50) of the studies, the psychometric properties of the tool were investigated in a remote setting, and in 16% (8/50) of the studies, these properties were examined in both an on-site and remote setting. In 14% (7/50) of the studies, the investigative setting was not described.
| Study | Tool | Study design | Examined populations | Total sample size | Investigative setting |
| --- | --- | --- | --- | --- | --- |
| Rai et al [ ], 2020 | Altoida | Longitudinal | CUb and MCIc | 496 | On-site and remote |
| Meier et al [ ], 2021 | Altoida | Longitudinal | CU, MCI, and dementia | 525 | Remote |
| Seixas et al [ ], 2022 | Altoida | Longitudinal | CU, MCI, and dementia | 576 | On-site and remote |
| Nicosia et al [ ], 2023 | ARCd | Longitudinal | CU, preclinical ADe, and dementia | 290 | Remote |
| Skirrow et al [ ], 2022 | ASRTf | Longitudinal | CU, MCI, and dementia | 151 | On-site and remote |
| Fristed et al [ ], 2022 | ASRT | Cross-sectional | CU, preclinical AD, MCI, and dementia | 115 | Remote |
| Papp et al [ ], 2021 | BRANCHg | Cross-sectional | CU, preclinical AD, and MCI | 234 | On-site and remote |
| Rhodius-Meester et al [ ], 2020 | cCOG | Longitudinal | CU, MCI, and dementia | 495 | On-site and remote |
| Hartle et al [ ], 2022 | CompCog | Cross-sectional | CU and MCI | 52 | On-site |
| Saxton et al [ ], 2009 | CAMCIh | Cross-sectional | CU and MCI | 524 | On-site |
| Papp et al [ ], 2021 | C3i | Cross-sectional | CU and preclinical AD | 4486 | On-site |
| Jutten et al [ ], 2021 | C3 | Longitudinal | CU and preclinical AD | 114 | On-site and remote |
| Simfukwe et al [ ], 2022 | dTMT-B&Wj | Cross-sectional | CU and MCI | 44 | —k |
| Chan et al [ ], 2020 | EC-Screenl | Cross-sectional | CU, MCI, and dementia | 243 | On-site |
| Valladares-Rodriguez et al [ ], 2018 | Episodix | Cross-sectional | CU, MCI, and dementia | 64 | On-site |
| Alegret et al [ ], 2020 | FACEmemory | Cross-sectional | CU, preclinical AD, and MCI | 276 | On-site |
| Alegret et al [ ], 2022 | FACEmemory | Cross-sectional | MCI | 94 | On-site |
| Bloniecki et al [ ], 2021 | GSCTm | Cross-sectional | SCDn, MCI, and dementia | 98 | On-site |
| Chin et al [ ], 2020 | Inbrain CSTo | Cross-sectional | SCD, MCI, and dementia | 577 | On-site |
| Kalafatis et al [ ], 2021 | ICAp | Cross-sectional | CU, MCI, and dementia | 230 | On-site or remote with presence of researcher |
| Suzumura et al [ ], 2018 | JustTouch | Cross-sectional | CU, MCI, and dementia | 94 | — |
| Tröger et al [ ], 2022 | ki:e SB-Cq | Longitudinal | CU, SCD, and MCI | 159 | On-site and remote |
| Gielis et al [ ], 2021 | Klondike Solitaire | Cross-sectional | CU and MCI | 46 | Remote with presence of researcher |
| Gielis et al [ ], 2021 | Klondike Solitaire | Cross-sectional | CU and MCI | 46 | Remote with presence of researcher |
| Samaroo et al [ ], 2020 | LOREr | Longitudinal | CU and preclinical AD | 94 | Remote |
| Sloane et al [ ], 2022 | Miro Health platform | Cross-sectional | CU, MCI, and other | 174 | — |
| Cerino et al [ ], 2021 | M2C2s | Cross-sectional | CU and MCI | 311 | Remote |
| Park et al [ ], 2018 | mSTS-MCIt | Cross-sectional | CU and MCI | 177 | On-site |
| Oliva and Losa [ ], 2022 | NNCTu | Cross-sectional | CU, MCI, and dementia | 147 | — |
| Moore et al [ ], 2022 | NeuroUX | Longitudinal | CU and MCI | 94 | Remote |
| Ma et al [ ], 2021 | NIHTB-CBv | Cross-sectional | CU, MCI, dementia, and other | 411 | On-site |
| Holmes et al [ ], 2022 | nQ Medical app | Cross-sectional | CU and MCI | 77 | On-site |
| Pettigrew et al [ ], 2022 | PALw | Cross-sectional | CU, preclinical AD, and MCI | 73 | — |
| Morrison et al [ ], 2018 | WLRx | Cross-sectional | CU and MCI | 121 | On-site |
| Ip et al [ ], 2017 | SIMBACy | Cross-sectional | CU, MCI, dementia, and other | 173 | On-site |
| Rapp et al [ ], 2018 | SIMBAC | Cross-sectional | CU, MCI, and dementia | 161 | On-site |
| Dorociak et al [ ], 2021 | SMARTz | Cross-sectional | CU and MCI | 69 | Remote |
| Wu et al [ ], 2017 | e-CT | Cross-sectional | CU, MCI, and dementia | 325 | On-site |
| Ntracha et al [ ], 2020 | Type of Mood | Longitudinal | SCD and MCI | 23 | Remote |
| Possin et al [ ], 2018 | UCSF-BHAaa | Cross-sectional | CU, SCD, MCI, and dementia | 347 | On-site |
| Tsoy et al [ ], 2020 | UCSF-BHA | Longitudinal | Preclinical AD, CU, MCI, and dementia | 850 | On-site |
| Tsoy et al [ ], 2021 | UCSF-BHA | Cross-sectional | MCI and dementia | 140 | On-site |
| Rodríguez-Salgado et al [ ], 2021 | UCSF-BHA | Cross-sectional | CU, MCI, and dementia | 146 | On-site |
| Zygouris et al [ ], 2015 | VSTab | Cross-sectional | CU and MCI | 55 | On-site |
| Zygouris et al [ ], 2017 | VST | Longitudinal | CU and MCI | 12 | Remote |
| Zygouris et al [ ], 2020 | VST | Cross-sectional | SCD and MCI | 95 | On-site |
| Eraslan Boz et al [ ], 2020 | VST | Cross-sectional | CU and MCI | 89 | — |
| Robin et al [ ], 2021 | WLAac | Longitudinal | CU, MCI, and dementia | 50 | On-site |
| Huang et al [ ], 2019 | “A tablet-based cognitive assessment” | Cross-sectional | CU, MCI, and dementia | 120 | — |
| Yamada et al [ ], 2021 | “Tablet-based automatic assessment using speech data” | Cross-sectional | CU and MCI | 76 | On-site |
aTerminology for individuals who were cognitively unimpaired or individuals with subjective cognitive decline varied across the included studies (ie, cognitively unimpaired comprised cognitively unimpaired, cognitively healthy, cognitively normal, cognitively intact, [healthy] controls, healthy older adults, no impairment, and no diagnosis of dementia, and subjective cognitive decline comprised subjective memory complaints, subjective cognitive impairment, and normal with concerns). For individuals with mild cognitive impairment, no distinction was made between amnestic and nonamnestic mild cognitive impairment or between mild cognitive impairment due to Alzheimer disease or to other causes. For individuals with dementia, no distinction was made among mild, moderate, or severe dementia or between Alzheimer disease dementia and non–Alzheimer disease dementia. Sample sizes include the total number of participants included for analyses in the studies.
bCU: cognitively unimpaired.
cMCI: mild cognitive impairment.
dARC: Ambulatory Research in Cognition.
eAD: Alzheimer disease.
fASRT: automated story recall task.
gBRANCH: Boston Remote Assessment for Neurocognitive Health.
hCAMCI: Computer Assessment of Mild Cognitive Impairment.
iC3: Computerized Cognitive Composite.
jdTMT-B&W: digital Trail Making Test–Black and White.
kNot available (ie, not described in the study).
lEC-Screen: Electronic Cognitive Screen.
mGSCT: Geras Solutions cognitive test.
nSCD: subjective cognitive decline.
oCST: Cognitive Screening Test.
pICA: Integrated Cognitive Assessment.
qSB-C: speech biomarker for cognition.
rLORE: Learning Over Repeated Exposures.
sM2C2: Mobile Monitoring of Cognitive Change.
tmSTS-MCI: mobile screening test system for MCI.
uNNCT: Natural and Artificial Intelligence Health Assistant Neuro Cognitive Test.
vNIHTB-CB: National Institutes of Health Toolbox Cognition Battery.
wPAL: paired associate learning.
xWLR: word list recall.
ySIMBAC: Simulation-Based Assessment of Cognition.
zSMART: Survey for Memory, Attention, and Reaction Time.
aaUCSF-BHA: University of California, San Francisco, Brain Health Assessment.
abVST: Virtual Supermarket Test.
acWLA: Winterlight Assessment.
Psychometric Properties
Overview
provides a summary at the tool level of the reporting on psychometric properties in the included studies.
| Tool | Structural validity: factor analysis | Construct validity: clinical associations | Construct validity: biological associations | Construct validity: relevant group differences | Test-retest reliability | Responsiveness | Interpretability: clinical meaningfulness |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Altoida [ - ] | —b | — | ✓c | ✓ | — | ✓ | — |
| ARCd [ ] | — | ✓ | ✓ | ✓ | ✓ | — | — |
| ASRTe [ , ] | — | ✓ | ✓ | ✓ | — | — | — |
| BRANCHf [ ] | — | ✓ | ✓ | ✓ | ✓ | — | — |
| cCOG [ ] | — | ✓ | — | ✓ | ✓ | — | — |
| CompCog [ ] | — | — | — | ✓ | — | — | — |
| CAMCIg [ ] | — | — | — | ✓ | ✓ | — | — |
| C3h [ , ] | — | ✓ | ✓ | ✓ | ✓ | ✓ | — |
| dTMT-B&Wi [ ] | — | ✓ | — | ✓ | — | — | — |
| EC-Screenj [ ] | — | — | — | ✓ | — | — | — |
| Episodix [ ] | — | — | — | ✓ | — | — | — |
| FACEmemory [ , ] | — | ✓ | ✓ | ✓ | — | — | — |
| GSCTk [ ] | — | ✓ | — | ✓ | — | — | — |
| Inbrain CSTl [ ] | ✓ | ✓ | ✓ | ✓ | ✓ | — | — |
| ICAm [ ] | — | ✓ | — | ✓ | — | — | — |
| JustTouch [ ] | — | ✓ | — | ✓ | — | — | — |
| ki:e SB-Cn [ ] | — | ✓ | — | ✓ | ✓ | ✓ | — |
| Klondike Solitaire [ , ] | — | — | — | ✓ | — | — | — |
| LOREo [ ] | — | — | ✓ | ✓ | — | ✓ | — |
| Miro Health platform [ ] | — | ✓ | — | ✓ | ✓ | — | — |
| M2C2p [ ] | — | — | — | ✓ | — | — | — |
| mSTS-MCIq [ ] | — | ✓ | — | ✓ | ✓ | — | — |
| NNCTr [ ] | — | ✓ | — | ✓ | — | — | — |
| NeuroUX [ ] | — | ✓ | — | ✓ | — | — | — |
| NIHTB-CBs [ ] | ✓ | — | — | — | — | — | — |
| nQ Medical app [ ] | — | ✓ | — | — | — | — | — |
| PALt [ ] | — | — | ✓ | ✓ | — | — | — |
| WLRu [ ] | — | ✓ | — | ✓ | — | — | — |
| SIMBACv [ , ] | — | ✓ | — | ✓ | ✓ | — | — |
| SMARTw [ ] | — | ✓ | — | ✓ | ✓ | — | — |
| e-CT [ ] | — | — | — | ✓ | — | — | — |
| Type of Mood [ ] | — | ✓ | — | ✓ | — | — | — |
| UCSF-BHAx (different versions) [ - ] | — | ✓ | ✓ | ✓ | — | ✓ | — |
| VSTy (different versions) [ - ] | — | ✓ | — | ✓ | — | — | — |
| WLAz [ ] | — | ✓ | — | ✓ | ✓ | ✓ | — |
| “A tablet-based cognitive assessment” [ ] | ✓ | ✓ | — | ✓ | — | — | — |
| “Tablet-based automatic assessment using speech data” [ ] | — | — | — | ✓ | — | — | — |
aThe available information varied across psychometric properties—factor analysis information was provided for 8% (3/37) of the tools, clinical association information was provided for 68% (25/37) of the tools, biological association information was provided for 27% (10/37) of the tools, relevant group difference information was provided for 95% (35/37) of the tools, test-retest reliability information was provided for 32% (12/37) of the tools, responsiveness information was provided for 16% (6/37) of the tools, and clinical meaningfulness information was provided for 0% of the tools.
bNot available (ie, the psychometric property was not examined in the study).
cIndicates that the psychometric property was examined in the study.
dARC: Ambulatory Research in Cognition.
eASRT: automated story recall task.
fBRANCH: Boston Remote Assessment for Neurocognitive Health.
gCAMCI: Computer Assessment of Mild Cognitive Impairment.
hC3: Computerized Cognitive Composite.
idTMT-B&W: digital Trail Making Test–Black and White.
jEC-Screen: Electronic Cognitive Screen.
kGSCT: Geras Solutions cognitive test.
lCST: Cognitive Screening Test.
mICA: Integrated Cognitive Assessment.
nSB-C: speech biomarker for cognition.
oLORE: Learning Over Repeated Exposures.
pM2C2: Mobile Monitoring of Cognitive Change.
qmSTS-MCI: mobile screening test system for mild cognitive impairment.
rNNCT: Natural and Artificial Intelligence Health Assistant Neuro Cognitive Test.
sNIHTB-CB: National Institutes of Health Toolbox Cognition Battery.
tPAL: paired associate learning.
uWLR: word list recall.
vSIMBAC: Simulation-Based Assessment of Cognition.
wSMART: Survey for Memory, Attention, and Reaction Time.
xUCSF-BHA: University of California, San Francisco, Brain Health Assessment.
yVST: Virtual Supermarket Test.
zWLA: Winterlight Assessment.
Structural Validity
For a minority of digital tools (3/37, 8%; Inbrain CST [
], National Institutes of Health Toolbox Cognition Battery [NIHTB-CB] [ ], and the tablet-based cognitive test battery [ ]), a factor analysis on outcome variables was conducted to explore the underlying structure of the construct. The factor structure was examined using a multiple-factor analysis, both an exploratory and a confirmatory factor analysis, or a confirmatory factor analysis, respectively. For 5% (2/37) of the tools, structural validity was formally assessed (NIHTB-CB [ ] and the tablet-based cognitive test battery [ ]), whereas for the remaining tool (Inbrain CST [ ]), limited details were provided with regard to methodology and model fit.
Construct Validity
For most tools (36/37, 97%), construct validity was assessed with regard to at least one aspect (ie, clinical or biological associations or relevant group differences). For the tool that was not assessed regarding construct validity (NIHTB-CB [
]), the study instead focused on its structural validity. Specifically, a clinical association was examined for 68% (25/37) of the tools. Existing clinical tests that digital tools were correlated with included, for example, the Mini-Mental State Examination, Montreal Cognitive Assessment, Rey Auditory Verbal Learning Test, and Preclinical Alzheimer’s Cognitive Composite (PACC) [ , , , , , , ]. Correlation coefficients differed largely between tools as well as between scores within tools. To illustrate, depending on the subscale and clinical test, correlations ranged from 0.174 to 0.735 for the Natural and Artificial Intelligence Health Assistant Neuro Cognitive Test [ ], from −0.02 to −0.57 for ARC [ ], and from −0.382 to 0.617 for BRANCH [ ].

Biological validation was evaluated for a minority of the tools (10/37, 27%). Specifically, 11% (4/37) of the tools (ASRT [
], BRANCH [ ], C3 [ , ], and LORE [ ]) were validated against AD biomarkers (ie, amyloid and tau protein). In total, 3% (1/37) of the tools were validated against a neuroanatomical correlate of cortical thickness (Inbrain CST [ ]), and 14% (5/37) of the tools (Altoida [ ], ARC [ ], FACEmemory [ , ], PAL [ ], and UCSF-BHA [ - ]) were validated against both AD biomarkers and neuroanatomical correlates (eg, cortical, subcortical, cerebral, and hippocampal volumes). All of the tools that were assessed in a population of individuals with preclinical AD (8/37, 22%) were validated against biological measures. The methodology used to assess biological associations was (1) calculating correlations with continuous biological measures, (2) using regression analyses to assess differences between biomarker groups, or (3) conducting receiver operating characteristic analyses to predict biomarker status. Reported correlation coefficients varied between and within tools depending on the outcome measure and biological correlate, as illustrated, for example, by correlations ranging from −0.23 to 0.29 for the ARC [ ] composite score or correlations ranging from −0.306 to 0.219 for BRANCH [ ] subtasks. Similarly, in receiver operating characteristic analyses, the reported area under the curve (AUC) values for predicting amyloid positivity varied between and within tools, as illustrated by an AUC value of 0.752 for UCSF-BHA [ ] (combination of outcome scores) and AUC values ranging from 0.43 to 0.92 for ASRT [ ] depending on the subtask, the sample (full sample, CU, or MCI), and whether data transcription was manual or automatic.

Relevant clinical group differences in digital outcome scores were examined for most tools (35/37, 95%). The ability to distinguish between clinical groups was reported for various group contrasts. For example, group differences were assessed for (1) large-contrast groups (CU vs dementia, eg, cCOG [
], e-CT [ ], UCSF-BHA [ ], and JustTouch [ ]) and (2) smaller-contrast groups (eg, CompCog [ ], digital Trail Making Test–Black and White [ ], cCOG [ ], e-CT [ ], UCSF-BHA [ ], and JustTouch [ ] for CU vs MCI and JustTouch [ ] for MCI vs dementia).
Reliability and Responsiveness
Test-retest reliability was examined for a minority (12/37, 32%) of the digital tools. The methods used to assess test-retest reliability were (1) intraclass correlation coefficients (ICCs; ARC [
]; Inbrain CST [ ]; Miro Health platform [ ]; the mobile screening test system for MCI [ ]; SIMBAC [ ]; and Survey for Memory, Attention, and Reaction Time [ ]), (2) correlations (ie, Pearson correlations: BRANCH [ ], cCOG [ ], Computer Assessment of Mild Cognitive Impairment [ ], SIMBAC [ ], and WLA [ ]; Spearman rank partial correlations: ki:e SB-C [ ]), or (3) linear models to assess performance differences between test and retest assessments (C3 [ ]). The settings in which test-retest reliability was assessed differed across studies: on-site (C3 [ ], Computer Assessment of Mild Cognitive Impairment [ ], and Inbrain CST [ ]), remote (WLA [ ]), or both on-site and remote (cCOG [ ]). The test-retest reliability differed across tools and was largely dependent on the outcome score. Digital tools inherently generate large numbers of outcome scores, and reliability could be reported for each of them. This results in a range from low to high reliability coefficients within a tool, as reflected in ICCs that ranged from 0.49 to 0.91 for Inbrain CST [ ] and Pearson correlations that ranged from 0.24 to 0.82 for cCOG [ ] or from 0.38 to 0.83 for WLA [ ]. For other tools, high reliability was consistently found for multiple scores, such as for the mobile screening test system for MCI [ ], where ICCs ranged from 0.97 to 0.98.

For 5% (2/37) of the tools, both test-retest reliability and responsiveness were assessed (ki:e SB-C [
] and WLA [ ]). Responsiveness was reported for 16% (6/37) of the tools (Altoida [ ], C3 [ ], ki:e SB-C [ ], LORE [ ], UCSF-BHA [ ], and WLA [ ]). The methodologies used to assess changes in digital scores over time were linear mixed-effects models (C3 [ ], LORE [ ], UCSF-BHA [ ], and WLA [ ]), a computation of a change score (ki:e SB-C [ ]), or a longitudinal intraindividual variability metric that was compared between diagnostic groups (Altoida [ ]). If a digital tool generated multiple outcome scores, the responsiveness could vary depending on the specific score. For example, for the WLA [ ], rates of decline in 4 aggregate scores were assessed, where 1 of those scores was found to decline more rapidly for individuals with MCI or early AD than for individuals with Montreal Cognitive Assessment scores above the threshold for cognitive impairment (≥26). For another tool (C3 [ ]), 8 subtask scores and 1 composite score were generated, with most of these scores showing responsiveness to change over time, and change in 4 of these scores was reported to be associated with amyloid or tau protein burden. For 2 tools (C3 [ ] and LORE [ ]), improvement over time was demonstrated, where individuals without cognitive impairment with AD pathology showed less steep learning curves than controls, indicating that diminished learning curves may also be a promising indicator of cognitive decline.
Interpretability
Clinically meaningful (changes in) scores were not defined for any of the digital tools (0%). However, for 11% (4/37) of the tools, the included studies argued that clinical meaningfulness or related aspects had been demonstrated, although no defined meaningful scores were provided; therefore, clinical meaningfulness was not assessed according to its definition as provided in the Methods section. C3 [
] was demonstrated to be associated with AD biomarkers and the PACC such that the authors concluded that this tool captures meaningful cognitive decrements. The ki:e SB-C [ ] tool was associated with scores on the Clinical Dementia Rating scale and differed both between diagnostic groups and between those who cognitively declined versus those who did not, which the authors interpreted as support for the tool’s clinical relevance. For LORE [ ], the authors argued that diminished learning curves on the digital measurement were meaningful, whereas further research is required to quantify its clinical meaningfulness. Hence, although the term “clinically meaningful” was used in these studies, clinical meaningfulness was not examined according to its definition, which presupposes a defined meaningful score [ , ].
Discussion
Principal Findings
In this scoping review, we identified 50 studies covering 37 different smartphone- and tablet-based tools to assess cognition in individuals with preclinical AD or MCI. These numbers indicate that multiple smartphone- and tablet-based cognitive tools are currently being developed, whereas only a limited number of validation studies in this target population have been conducted for individual tools. Specifically, except for construct validity, most psychometric properties (ie, structural validity, reliability, responsiveness, and clinical meaningfulness) have not been extensively assessed to date. Thus, considering the foundations of measurement, as reflected in the psychometric properties, these tools are still in the early stages of validation. Moreover, it was observed that practical characteristics such as administration time or the investigative setting were not consistently reported. Although the initial reporting on the psychometric properties of smartphone- and tablet-based tools to assess cognition in early AD supports their potential, further and careful validation of their psychometric properties, as well as clear information on practical characteristics, is essential to facilitate implementation.
Practical tool characteristics are important when considering a tool’s use in clinical practice, as supported by previous observations that, for health care professionals, time efficiency would facilitate the use of digital tools [
]. We highlight that, for a substantial number of tools, information was not reported regarding administration time (14/37, 38%), the testing design (single or repeated; 9/37, 24%), or the investigative setting (on-site or remote; 7/37, 19%). This is concerning because information on the setting is crucial for interpreting whether the psychometric properties assessed apply to on-site or remote use of a tool and to a single (eg, annual) or repeated (eg, daily or monthly) measurement. Especially given that the advantages of digital tools include their time-efficient administration and potential for remote and highly frequent administration, we recommend that such characteristics be clearly reported.

The generation of high-dimensional data is inherent to digital tools that assess cognition. While this may be regarded as an advantage, we observed that it also poses challenges to the overall evaluation of the quality of a tool and complicates one-to-one comparisons between tools. For instance, it remains challenging to decide which scores should be evaluated and compared, which raises questions such as whether the psychometric properties of single scores or aggregated scores should be considered when appraising the quality of the digital tool and whether the tool’s quality is determined by the outcome score that holds the lowest or highest psychometric quality. These challenges of high-dimensional datasets were specifically observed for construct validity and test-retest reliability, which ranged widely across the many scores generated by a single tool. We recommend that the promising large amounts of generated data be reduced carefully by selecting those outcome scores that hold high psychometric quality, which we consider a crucial step to develop digital cognitive tools that provide clinically useful outcome scores. Another observation was that different methodologies were used between tools to determine their psychometric properties. This highlights the need for more uniformity regarding psychometric evaluation, which would facilitate the comparison of digital tools regarding their psychometric quality and enable the end user to select the most appropriate tool for a given context of use.
To standardize the process of clinical evaluation of digital tools, we propose more widespread use of standardized procedures, such as the FDA guidance on software as a medical device [
]. This guideline recognizes the importance of psychometric property evaluation and recommends 3 steps. The first step concerns clinical association, where the digital outcome should be demonstrated to be associated with the targeted clinical condition (ie, clinical decline as a result of AD). The second evaluation step regards analytical validation of the software, which is out of scope for this review. The third step involves clinical validation, where the digital outcome should be demonstrated to be reliable and clinically meaningful with regard to its intended purpose. The importance of providing information on psychometric properties is further emphasized by the previously reported notion that clear information on psychometric properties would facilitate the use of digital tools in clinical practice [ ]. To this end, we discuss the reporting on individual psychometric properties in the following paragraphs.
A first psychometric property, which could be argued to belong to the first step of the clinical evaluation process, concerns the conduct of factor analyses to confirm that outcome scores adequately reflect the dimensionality being measured (ie, the underlying structure of one or multiple cognitive domains). In this review, we observed that a factor analysis on outcome scores was conducted for only a limited number of digital tools (3/37, 8%) to demonstrate the underlying structure of the measured construct. This is concerning, and it is advisable for future validation studies to increasingly focus on conducting factor analyses to ensure the digital tool’s structural validity.
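As an illustration of this step, the sketch below runs an exploratory factor analysis on simulated digital subtest scores to inspect whether they group into the intended cognitive domains. All subtest names and data are hypothetical, and, as noted in the limitations below, confirmatory factor analysis on real data would be the preferred approach for formally establishing structural validity.

```python
# Minimal sketch: an exploratory factor analysis of digital subtest scores to
# inspect whether they reflect the intended cognitive domains (structural
# validity). Subtest names and data are simulated for illustration only.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 200
memory = rng.normal(size=n)       # latent memory factor
speed = rng.normal(size=n)        # latent processing-speed factor

# Six hypothetical subtest scores, each loading mainly on one latent factor
subtests = {
    "word_list_recall": 0.8 * memory + rng.normal(scale=0.5, size=n),
    "face_name_recall": 0.7 * memory + rng.normal(scale=0.5, size=n),
    "story_recall":     0.8 * memory + rng.normal(scale=0.5, size=n),
    "symbol_matching":  0.8 * speed + rng.normal(scale=0.5, size=n),
    "tapping_speed":    0.7 * speed + rng.normal(scale=0.5, size=n),
    "trail_making":     0.8 * speed + rng.normal(scale=0.5, size=n),
}
X = np.column_stack(list(subtests.values()))

fa = FactorAnalysis(n_components=2, rotation="varimax")
loadings = fa.fit(X).components_.T   # rows: subtests, columns: factors

for name, row in zip(subtests, loadings):
    print(f"{name:>16}: factor loadings = {np.round(row, 2)}")
```

A clean two-factor pattern of loadings, as in this simulated example, would support the claim that the subtests measure the two intended domains; real data would typically show a less clear-cut structure.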
Another psychometric property related to demonstrating clinical associations is construct validity, which was reported for most digital tools (36/37, 97%). Correlations between generated digital scores and scores on gold-standard measures of cognition varied considerably both within and between tools, in line with previous findings regarding self-administered computerized cognitive assessments [
]. It has been highlighted that, as currently used cognitive tests may not be suitable for early AD stages, validation against these "gold standard" tests may be particularly relevant in more advanced stages, whereas validation against biological measures of AD may be more relevant in preclinical stages [ , , ]. Recently, proposed revisions to the diagnostic criteria for AD have been published, with the common denominator that preclinical AD concerns a stage characterized by the absence of symptoms [ , ]. Hence, to demonstrate an association with the disease in this stage, biological validation is needed rather than clinical validation, for which no gold-standard test exists at this stage. The importance of biological validation in the preclinical AD stage was reflected in our observation that, although validation against biological measures was reported for fewer tools overall (10/37, 27%), all tools that were assessed in the preclinical stage were validated against biological measures. Still, most tools (36/37, 97%) were validated against clinical measures, including tools that were examined in preclinical AD. In the absence of a true gold standard for preclinical AD, it remains concerning that novel sensitive tests for preclinical populations are nonetheless validated against existing tests with limited sensitivity. As suggested previously, predictive validity might be considered as an alternative "copper standard" [ ]. As we did not evaluate the reporting on predictive validity, further research should examine the ability of smartphone- and tablet-based tools for use in early AD stages to predict future cognitive decline or progression to MCI or dementia.
The third clinical evaluation step, clinical validation, involves demonstrating reliability. Clinical trials examining potential intervention effects inherently require repeated assessments over time, and to accurately interpret change in cognition over time, the test-retest reliability of an outcome measure should be ensured in the target population. Test-retest reliability was examined for a limited number of the digital tools included in this review (12/37, 32%). As we selected studies that focused on the MCI and preclinical stages, test-retest reliability may have been assessed in studies focusing on other populations, including, for example, individuals with dementia. However, as test-retest reliability may differ across populations, it is important that this psychometric property is assessed in the target population. The importance of ensuring a tool’s reliability may be emphasized even more in the context of self-administered digital tools, which may be more susceptible to noise than supervised paper-and-pencil tests. For the tools with available information on test-retest reliability, the reliability levels varied substantially, largely depending on the outcome score of a tool (eg, total score or single scores), in line with previous observations [
]. This makes it highly challenging to determine the overall reliability of a specific tool and stresses the importance of careful selection of outcome scores. We recommend that test-retest reliability be thoroughly considered before selecting outcome scores for further validation steps. In addition, as the widespread absence of information on this psychometric property may hinder the implementation of digital tools, validation studies should increasingly focus on evaluating test-retest reliability. Alternatively, although we note that independent validation studies are preferable, we recommend that the evaluation of test-retest reliability also be incorporated into clinical trial designs where, for example, the natural course of a digital outcome should first be established before investigating its change after the intervention.
To determine whether a digital cognitive tool is sufficiently sensitive to measure potential treatment effects or cognitive changes over time for monitoring purposes, information on the tool’s responsiveness is essential. Responsiveness was reported for a small number of tools (6/37, 16%) using different methodologies, and findings varied within the outcome scores of a single tool. Interestingly, of these 6 tools, 2 (33%) showed that less steep improvement was associated with preclinical AD [
, ]. Thus, although it may be expected that declining performance would be indicative of cognitive change, these new digital repeated testing approaches showed that diminished learning curves may also serve as an indicator of cognitive change in individuals with preclinical AD. Our finding that responsiveness was evaluated to a lesser extent may be explained by the fact that it requires a more effortful, longitudinal study design. In addition, the low number of tools for which responsiveness was reported may reflect the relative youth of the research field involving digital tools to assess cognition in early AD stages, which may indicate that these tools are not yet ready for clinical implementation and warrant further validation. Accordingly, it is recommended that future validation studies focus on examining the responsiveness of digital outcome scores to change over time in the target population as an essential step toward clinical implementation.
Another aspect of the third clinical validation step regards clinical meaningfulness. To enable clinical interpretation of digital scores and clinically relevant change, the clinical meaningfulness of scores and their change should be determined. However, no consensus has been established yet on what score should be considered clinically meaningful [
]. The importance of clinically meaningful end points has also been stated in the US Food and Drug Administration draft guidance from March 2024 [ ], where it has been recognized that, in early AD stages, it may be challenging to demonstrate clinical meaningfulness due to subtle cognitive changes and the absence of overt cognitive impairment. With the lack of a clear definition of clinical meaningfulness in early AD stages, this quality characteristic may easily be overlooked in the validation of digital cognitive tools, as reflected in our observation that clinically meaningful scores were not defined for any of the identified digital tools. However, it should be noted that, for some digital outcome scores, it was argued that their meaningfulness was supported by associations with AD biomarkers and clinical measures (PACC or Clinical Dementia Rating), as well as by scores or learning curves that differed between diagnostic groups [ , , ]. Such definitions of clinical meaningfulness related to clinical, biological, and discriminative validity seem to refer to the relevance of the measure rather than to the magnitude of the score, or change in score, that would demonstrate clinical meaningfulness [ ]. Hence, there is a pressing need for a clear definition of clinical meaningfulness, and it has been suggested that this definition should be developed by incorporating the perspectives of various stakeholders (eg, patients, clinicians, and regulatory bodies) [ ]. In addition, clinical meaningfulness may differ across patients, for which digital tools may offer an opportunity to determine personalized end points. Moreover, norm scores or cutoff scores may be considered toward establishing clinical meaningfulness. Thus, it is recommended that, within the field of cognitive assessment in early-stage AD in general, attention be paid to reaching consensus on the definition of clinical meaningfulness of digital outcome scores to enable clinically useful interpretation of digital outcomes in future validation studies.
The major strength of this review was that we focused on the preclinical AD and MCI stages. These predementia stages offer the greatest potential benefit from intervention strategies and represent both the greatest need for and the greatest feasibility of novel digital remote assessments. Another strength was that we used a systematic search to identify validation studies of digital tools to assess cognition, which allowed for a large selection of studies. In addition, our scope covered a range of psychometric properties selected based on previous recommendations, as well as practical tool characteristics, which enabled the identification of an "information gap" in both areas and, thus, provided a clear overview of what information is missing and what should be focused on to support the implementation of digital tools.
This review has a number of limitations. First, although we used a broad and systematic search strategy to minimize the chance of missing studies, the search relied on specific search terms, such that potentially relevant studies might have been excluded. For example, studies were not identified within the search if keywords related to the digital aspect of tools, such as "digital," "computerized," "smartphone," or "tablet," were not used in the title or abstract. Second, we did not conduct quality assessments of the included studies, which may have biased our overall conclusions. Third, because this is a rapidly developing field, new digital cognitive assessments will be continuously emerging, highlighting the need for ongoing evaluation. Therefore, our aim was not to provide an elaborate overview of currently available tools but rather to identify possible information gaps on reported psychometric properties in the landscape of digital cognitive tools. In addition, our scope was limited to tools that had the potential for remote assessment. The evaluation of this characteristic was not based on a strict criterion but rather on contextual information indicating suitability for self-administration, which implies the potential for remote administration; this judgment may, thus, be somewhat subjective. We decided not to formulate a stricter eligibility criterion for this remote design as, for most tools, it was not explicitly stated whether the tool was intended for remote assessment. As this is in fact important information, it is recommended that future validation studies provide clear information on the intended context of use of the digital tool. Moreover, as we focused on validation studies, we did not include development studies, nor did we assess additional information from the tools’ websites on tool characteristics, such as administration time, intended use, or available languages. Therefore, we emphasize that the provided overview of such characteristics may not reflect all practical information or languages available to date but rather summarizes the information reported in validation studies targeting a population of individuals with preclinical AD or MCI. However, it is crucial to include such information in scientific publications for end users to decide which tools are most appropriate for their context of use and to evaluate the quality of digital cognitive tools in future systematic reviews [
]. Furthermore, we did not evaluate the quality of the methods used to assess the psychometric properties, and it remains open to discussion whether the methods used were appropriate. For instance, to demonstrate structural validity, confirmatory factor analyses are argued to be the preferred method over exploratory factor analyses [ - ]. Similarly, to evaluate the quality of a tool’s construct validity, it is relevant whether the clinical or biological correlations or group differences assessed are in the hypothesized direction, and for evaluating responsiveness, the time frame (eg, days, weeks, or months) should be considered. We provided a first step by identifying the reporting on psychometric properties regardless of their quality, but future research should focus on the quality of the methodology used to determine psychometric properties.
Conclusions
Our results indicate that the landscape of smartphone- and tablet-based cognitive tools for early AD could currently be placed at the first step in the validation process, as described in the Food and Drug Administration guidance on software as a medical device [
]. Following this guidance, it is recommended that, after establishing a valid clinical association, digital tools be further validated with regard to their context of use. For the development of novel tools, we recommend clearly stating the digital tool’s intended use to determine the prioritized validation steps with regard to the most relevant psychometric properties. For instance, if a tool is to be used for monitoring purposes, demonstrating responsiveness is crucial. For digital screening tools, sensitivity should be demonstrated, whereas for digital diagnostic tools, the specificity for a clinical condition should be validated. We identified multiple gaps in the available information on crucial psychometric properties, which hampers the clinical validation of these tools and stresses the need for greater attention to psychometric quality, where the selection of outcome scores should be considered carefully and reported consistently. As a next step, future studies should determine the psychometric quality of currently developed digital tools for use in early AD stages. Ensuring psychometrically high-quality outcome scores must be a central theme in the development of smartphone- and tablet-based tools that assess cognition in individuals with preclinical AD and MCI. Further clinical validation tailored to the context of use will pave the way for the implementation of digital cognitive tools.
Acknowledgments
This project was funded by the public-private partnership (PPP) allowance made available by Health~Holland, Top Sector Life Sciences and Health, to stimulate PPPs (LSHM20084-SGF). The authors acknowledge Dr Wieneke Mokkink for her consultation on psychometric properties and the neuropsychology of aging research group of the Alzheimer Center Amsterdam for the discussion on the data extraction process. WMvdF and SAMS are recipients of Innovative Health Initiative AD-RIDDLE grant 101132933, a project supported by the Innovative Health Initiative Joint Undertaking. The Joint Undertaking receives support from the European Union’s Horizon Europe research and innovation program and from COCIR, the European Federation of Pharmaceutical Industries and Associations, EuropaBio, MedTech Europe, and Vaccines Europe, with Davos Alzheimer’s Collaborative, Combinostics Oy, Cambridge Cognition Ltd, C2N Diagnostics LLC, and neotiv GmbH. The chair of WMvdF is supported by the Pasman stichting. SAMS is a recipient of funds from Health~Holland, Top Sector Life Sciences and Health (PPP allowance: “Deep and Frequent clinical Evaluation in Alzheimer’s Disease” DEFEAT-AD project [LSHM20084] and “Remote Monitoring of Dementia's Early warning signs” REMONIT-AD [LSHM22026]); Alzheimer Nederland (“Sustainable and Personalised Advances in Dementia care” SPREAD+); the Ministry of Health, Welfare, and Sport (90001586); the Netherlands Organisation for Health Research and Development in the context of the Dementia Research Program “Onderzoeksprogramma Dementie,” part of the Dutch National Dementia Strategy (Timely, Accurate, and Personalized diagnosis of dementia; 10510032120003); the Netherlands Organisation for Health Research and Development (Dissemination and Implementation Impulse “Verspreidings- en Implementatie Impuls” [VIMP]; 7330502051 and 73305095008); and the Dutch Research Council (YOD-MOLECULAR; KICH1.GZ02.20.004) as part of the Dutch Research Council Knowledge and Innovation Covenant research program 2020 to 2023 Living with Dementia. Young-Onset Dementia: Mechanisms of Selective Vulnerability And their contribution to disease presentation (YOD-MOLECULAR) receives cofinancing from Winterlight Labs, Alleo Labs, and the Dutch Brain Foundation. Team Alzheimer also contributes to YOD-MOLECULAR. The Alzheimer Center Amsterdam is part of the neurodegeneration research program of Amsterdam Neuroscience. The Research of Alzheimer Center Amsterdam is supported by Alzheimer Netherlands Foundation and Alzheimer Center Amsterdam Support Foundation.
Authors' Contributions
SAMS, CdB, RJJ, and JEH initially designed the project. RLvdB, MJK, CdB, and SAMS developed the search strategy in collaboration with the medical information specialist, KAZ. RLvdB, SMvdL, MJK, and AMvG were involved in the screening procedures. RLvdB, SMvdL, MJK, MvD, RJJ, CdB, WMvdF, and SAMS contributed to the interpretation of the results. RLvdB took the lead in writing the manuscript, and all authors contributed to the final version of the manuscript. All coauthors approved the final version.
Conflicts of Interest
MJK was an employee of Neurocast BV. JEH is an employee of and shareholder in Scottish Brain Sciences and a paid consultant to Cambridge Cognition. SAMS is a scientific advisory board member of Prothena Biosciences and Cogstate; provides consultancy services to AriBio Co. Ltd and Biogen; receives license fees from Brain Research Center, Green Valley, vTv Therapeutics, Alzheon, Vivoryon Therapeutics, and Roche; and is the developer of the Amsterdam Instrumental Activities of Daily Living Questionnaire. All license fees are for the organization. WMvdF’s research programs have been funded by the Netherlands Organisation for Health Research and Development (ZonMW), the Dutch Research Council, the European Union Joint Programme – Neurodegenerative Disease Research, the European Union Innovative Health Initiative, Alzheimer Nederland, Dutch Brain Foundation – Cardiovascular Research Netherlands, Health~Holland, Top Sector Life Sciences and Health, Dioraphte Foundation, Gieskes-Strijbis Fund, Equilibrio Foundation, Edwin Bouw Fund, Pasman Foundation, Alzheimer & Neuropsychiatry Foundation, Philips, Biogen MA Inc, Novartis Nederland, Life Molecular Imaging, Avid Radiopharmaceuticals, Roche Nederland BV, Fujifilm, Eisai, and Combinostics. WMvdF holds the Pasman chair. WMvdF is the recipient of A Personalized Medicine Approach for Alzheimer’s Disease, which is a public-private partnership receiving funding from ZonMW (73305095007) and Health~Holland, Top Sector Life Sciences and Health (public-private partnership allowance; LSHM20106). WMvdF is the recipient of Timely, Accurate, and Personalized diagnosis of dementia which receives funding from ZonMw (10510032120003). Timely, Accurate, and Personalized diagnosis of dementia receives cofinancing from Avid Radiopharmaceuticals and Amprion. All funding is paid to her institution. WMvdF has been an invited speaker at Biogen MA Inc, Danone, Eisai, WebMD Neurology (Medscape), Novo Nordisk, Springer Healthcare, and the European Brain Council. WMvdF is a consultant for Oxford Health Policy Forum CIC, Roche, Biogen MA Inc, and Eisai. WMvdF has participated in the advisory boards of Biogen MA Inc, Roche, and Eli Lilly and Company. WMvdF is a member of the steering committee of Evoke and Evoke+ (Novo Nordisk). All funding is paid to her institution. WMvdF is a member of the steering committee of Project Alzheimer’s Value Europe and Think Brain Health Global. WMvdF was an associate editor of Alzheimer’s Research & Therapy in 2020 and 2021. WMvdF is an associate editor at Brain. All other authors declare no other conflicts of interest.
PRISMA-ScR checklist.
DOCX File, 49 KB
Search strategies.
DOCX File, 47 KB
References
- Jack CRJ, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. NIA-AA research framework: toward a biological definition of Alzheimer's disease. Alzheimers Dement. Apr 2018;14(4):535-562. [FREE Full text] [CrossRef] [Medline]
- Scheltens P, De Strooper B, Kivipelto M, Holstege H, Chételat G, Teunissen CE, et al. Alzheimer's disease. Lancet. Apr 24, 2021;397(10284):1577-1590. [FREE Full text] [CrossRef] [Medline]
- Sperling RA, Aisen PS, Beckett LA, Bennett DA, Craft S, Fagan AM, et al. Toward defining the preclinical stages of Alzheimer's disease: recommendations from the National Institute on Aging-Alzheimer's Association workgroups on diagnostic guidelines for Alzheimer's disease. Alzheimers Dement. May 2011;7(3):280-292. [FREE Full text] [CrossRef] [Medline]
- Petersen RC, Wiste HJ, Weigand SD, Rocca WA, Roberts RO, Mielke MM, et al. Association of elevated amyloid levels with cognition and biomarkers in cognitively normal people from the community. JAMA Neurol. Jan 2016;73(1):85-92. [FREE Full text] [CrossRef] [Medline]
- Mueller KD, Van Hulle CA, Koscik RL, Jonaitis E, Peters CC, Betthauser TJ, et al. Amyloid beta associations with connected speech in cognitively unimpaired adults. Alzheimers Dement (Amst). May 27, 2021;13(1):e12203. [FREE Full text] [CrossRef] [Medline]
- Ebenau JL, Visser D, Kroeze LA, van Leeuwenstijn MS, van Harten AC, Windhorst AD, et al. Longitudinal change in ATN biomarkers in cognitively normal individuals. Alzheimers Res Ther. Sep 03, 2022;14(1):124. [FREE Full text] [CrossRef] [Medline]
- Dubbelman MA, Hendriksen HM, Harrison JE, Vijverberg EG, Prins ND, Kroeze LA, et al. Cognitive and functional change over time in cognitively healthy individuals according to Alzheimer disease biomarker-defined subgroups. Neurology. Jan 23, 2024;102(2):e207978. [FREE Full text] [CrossRef] [Medline]
- Livingston G, Sommerlad A, Orgeta V, Costafreda SG, Huntley J, Ames D, et al. Dementia prevention, intervention, and care. Lancet. Dec 16, 2017;390(10113):2673-2734. [CrossRef] [Medline]
- Prince M, Wimo A, Guerchet M, Ali GC, Wu YT, Prina M. World Alzheimer Report 2015: the global impact of dementia: an analysis of prevalence, incidence, cost and trends. Alzheimer’s Disease International. 2015. URL: https://www.alzint.org/u/WorldAlzheimerReport2015.pdf [accessed 2025-05-20]
- van Dyck CH, Swanson CJ, Aisen P, Bateman RJ, Chen C, Gee M, et al. Lecanemab in early Alzheimer's disease. N Engl J Med. Jan 05, 2023;388(1):9-21. [CrossRef] [Medline]
- Ngandu T, Lehtisalo J, Solomon A, Levälahti E, Ahtiluoto S, Antikainen R, et al. A 2 year multidomain intervention of diet, exercise, cognitive training, and vascular risk monitoring versus control to prevent cognitive decline in at-risk elderly people (FINGER): a randomised controlled trial. Lancet. Jun 06, 2015;385(9984):2255-2263. [CrossRef] [Medline]
- Cummings J, Lee G, Nahed P, Kambar ME, Zhong K, Fonseca J, et al. Alzheimer's disease drug development pipeline: 2022. Alzheimers Dement (N Y). May 04, 2022;8(1):e12295. [FREE Full text] [CrossRef] [Medline]
- Snyder PJ, Kahle-Wrobleski K, Brannan S, Miller DS, Schindler RJ, DeSanti S, et al. Assessing cognition and function in Alzheimer's disease clinical trials: do we have the right tools? Alzheimers Dement. Nov 2014;10(6):853-860. [CrossRef] [Medline]
- Cummings J, Feldman HH, Scheltens P. The "rights" of precision drug development for Alzheimer's disease. Alzheimers Res Ther. Aug 31, 2019;11(1):76. [FREE Full text] [CrossRef] [Medline]
- Weintraub S, Carrillo MC, Farias ST, Goldberg TE, Hendrix JA, Jaeger J, et al. Measuring cognition and function in the preclinical stage of Alzheimer's disease. Alzheimers Dement (N Y). Feb 13, 2018;4:64-75. [FREE Full text] [CrossRef] [Medline]
- Jutten RJ, Papp KV, Hendrix S, Ellison N, Langbaum JB, Donohue MC, et al. Why a clinical trial is as good as its outcome measure: a framework for the selection and use of cognitive outcome measures for clinical trials of Alzheimer's disease. Alzheimers Dement. Feb 2023;19(2):708-720. [FREE Full text] [CrossRef] [Medline]
- Jutten RJ, Sikkes SA, Amariglio RE, Buckley RF, Properzi MJ, Marshall GA, et al. Identifying sensitive measures of cognitive decline at different clinical stages of Alzheimer's disease. J Int Neuropsychol Soc. May 2021;27(5):426-438. [FREE Full text] [CrossRef] [Medline]
- Carlson S, Kim H, Devanand DP, Goldberg TE. Novel approaches to measuring neurocognitive functions in Alzheimer's disease clinical trials. Curr Opin Neurol. Apr 01, 2022;35(2):240-248. [FREE Full text] [CrossRef] [Medline]
- Öhman F, Hassenstab J, Berron D, Schöll M, Papp KV. Current advances in digital cognitive assessment for preclinical Alzheimer's disease. Alzheimers Dement (Amst). Jul 20, 2021;13(1):e12217. [FREE Full text] [CrossRef] [Medline]
- Gold M, Amatniek J, Carrillo MC, Cedarbaum JM, Hendrix JA, Miller BB, et al. Digital technologies as biomarkers, clinical outcomes assessment, and recruitment tools in Alzheimer's disease clinical trials. Alzheimers Dement (N Y). May 24, 2018;4:234-242. [FREE Full text] [CrossRef] [Medline]
- Sliwinski MJ, Mogle JA, Hyun J, Munoz E, Smyth JM, Lipton RB. Reliability and validity of ambulatory cognitive assessments. Assessment. Jan 2018;25(1):14-30. [FREE Full text] [CrossRef] [Medline]
- Nicosia J, Aschenbrenner AJ, Balota DA, Sliwinski MJ, Tahan M, Adams S, et al. Unsupervised high-frequency smartphone-based cognitive assessments are reliable, valid, and feasible in older adults at risk for Alzheimer's disease. J Int Neuropsychol Soc. Jun 2023;29(5):459-471. [FREE Full text] [CrossRef] [Medline]
- Canini M, Battista P, Della Rosa PA, Catricalà E, Salvatore C, Gilardi MC, et al. Computerized neuropsychological assessment in aging: testing efficacy and clinical ecology of different interfaces. Comput Math Methods Med. 2014;2014:804723. [FREE Full text] [CrossRef] [Medline]
- Tsoy E, Possin KL, Thompson N, Patel K, Garrigues SK, Maravilla I, et al. Self-administered cognitive testing by older adults at-risk for cognitive decline. J Prev Alzheimers Dis. 2020;7(4):283-287. [FREE Full text] [CrossRef] [Medline]
- Library of digital endpoints. Digital Medicine Society. URL: https://dimesociety.org/library-of-digital-endpoints/ [accessed 2024-06-17]
- van Gils AM, Visser LN, Hendriksen HM, Georges J, Muller M, Bouwman FH, et al. Assessing the views of professionals, patients, and care partners concerning the use of computer tools in memory clinics: international survey study. JMIR Form Res. Dec 03, 2021;5(12):e31053. [FREE Full text] [CrossRef] [Medline]
- Mc Carthy M, Schueler P. Editorial: can digital technology advance the development of treatments for Alzheimer's disease? J Prev Alzheimers Dis. 2019;6(4):217-220. [CrossRef] [Medline]
- Kaye J, Aisen P, Amariglio R, Au R, Ballard C, Carrillo M, et al. Using digital tools to advance Alzheimer's drug trials during a pandemic: the EU/US CTAD task force. J Prev Alzheimers Dis. 2021;8(4):513-519. [FREE Full text] [CrossRef] [Medline]
- Mokkink LB, Terwee CB, Patrick DL, Alonso J, Stratford PW, Knol DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. Jul 2010;63(7):737-745. [CrossRef] [Medline]
- Kristensen LQ, Muren MA, Petersen AK, van Tulder MW, Gregersen Oestergaard L. Measurement properties of performance-based instruments to assess mental function during activity and participation in traumatic brain injury: a systematic review. Scand J Occup Ther. Apr 2020;27(3):168-183. [FREE Full text] [CrossRef] [Medline]
- Tsoy E, Zygouris S, Possin KL. Current state of self-administered brief computerized cognitive assessments for detection of cognitive disorders in older adults: a systematic review. J Prev Alzheimers Dis. 2021;8(3):267-276. [FREE Full text] [CrossRef] [Medline]
- Arksey H, O'Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19-32. [CrossRef]
- Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. Oct 02, 2018;169(7):467-473. [FREE Full text] [CrossRef] [Medline]
- Hair K, Bahor Z, Macleod M, Liao J, Sena ES. The Automated Systematic Search Deduplicator (ASySD): a rapid, open-source, interoperable tool to remove duplicate citations in biomedical systematic reviews. BMC Biol. Sep 07, 2023;21(1):189. [FREE Full text] [CrossRef] [Medline]
- van de Schoot R, de Bruin J, Schram R, Zahedi P, de Boer J, Weijdema F, et al. An open source machine learning framework for efficient and transparent systematic reviews. Nat Mach Intell. Feb 01, 2021;3:125-133. [CrossRef]
- Ros R, Bjarnason E, Runeson P. A machine learning approach for semi-automated search and selection in literature studies. In: Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering. 2017. Presented at: EASE '17; June 15-16, 2017; Karlskrona, Sweden. [CrossRef]
- Bernardes RC, Botina LL, Araújo RD, Guedes RN, Martins GF, Lima MA. Artificial intelligence-aided meta-analysis of toxicological assessment of agrochemicals in bees. Front Ecol Evol. May 19, 2022;10:1-12. [CrossRef]
- Bourke M, Haddara A, Loh A, Carson V, Breau B, Tucker P. Adherence to the World Health Organization's physical activity recommendation in preschool-aged children: a systematic review and meta-analysis of accelerometer studies. Int J Behav Nutr Phys Act. Apr 26, 2023;20(1):52. [FREE Full text] [CrossRef] [Medline]
- Guan X, Feng X, Islam AY. The dilemma and countermeasures of educational data ethics in the age of intelligence. Humanit Soc Sci Commun. Apr 01, 2023;10:138. [CrossRef]
- van Dijk SH, Brusse-Keizer MG, Bucsán CC, van der Palen J, Doggen CJ, Lenferink A. Artificial intelligence in systematic reviews: promising when appropriately used. BMJ Open. Jul 07, 2023;13(7):e072254. [FREE Full text] [CrossRef] [Medline]
- König L, Zitzmann S, Fütterer T, Campos DG, Scherer R, Hecht M. An evaluation of the performance of stopping rules in AI-aided screening for psychological meta-analytical research. Res Synth Methods. Nov 2024;15(6):1120-1146. [CrossRef] [Medline]
- Boetje J, van de Schoot R. The SAFE procedure: a practical stopping heuristic for active learning-based screening in systematic reviews and meta-analyses. Syst Rev. Mar 01, 2024;13(1):81. [FREE Full text] [CrossRef] [Medline]
- Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan-a web and mobile app for systematic reviews. Syst Rev. Dec 05, 2016;5(1):210. [FREE Full text] [CrossRef] [Medline]
- Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. Jun 2016;15(2):155-163. [FREE Full text] [CrossRef] [Medline]
- Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Syst Rev. Mar 29, 2021;10(1):89. [FREE Full text] [CrossRef] [Medline]
- Simfukwe C, Youn YC, Kim SY, An SS. Digital trail making test-black and white: normal vs MCI. Appl Neuropsychol Adult. 2022;29(6):1296-1303. [CrossRef] [Medline]
- Morrison RL, Pei H, Novak G, Kaufer DI, Welsh-Bohmer KA, Ruhmel S, et al. A computerized, self-administered test of verbal episodic memory in elderly patients with mild cognitive impairment and healthy participants: a randomized, crossover, validation study. Alzheimers Dement (Amst). Sep 27, 2018;10:647-656. [FREE Full text] [CrossRef] [Medline]
- Pettigrew C, Soldan A, Brichko R, Zhu Y, Wang MC, Kutten K, et al. Computerized paired associate learning performance and imaging biomarkers in older adults without dementia. Brain Imaging Behav. Apr 2022;16(2):921-929. [FREE Full text] [CrossRef] [Medline]
- Fristed E, Skirrow C, Meszaros M, Lenain R, Meepegama U, Cappa S, et al. A remote speech-based AI system to screen for early Alzheimer's disease via smartphones. Alzheimers Dement (Amst). Nov 03, 2022;14(1):e12366. [FREE Full text] [CrossRef] [Medline]
- Skirrow C, Meszaros M, Meepegama U, Lenain R, Papp KV, Weston J, et al. Validation of a remote and fully automated story recall task to assess for early cognitive impairment in older adults: longitudinal case-control observational study. JMIR Aging. Sep 30, 2022;5(3):e37090. [FREE Full text] [CrossRef] [Medline]
- Papp KV, Samaroo A, Chou HC, Buckley R, Schneider OR, Hsieh S, et al. Unsupervised mobile cognitive testing for use in preclinical Alzheimer's disease. Alzheimers Dement (Amst). Sep 30, 2021;13(1):e12243. [FREE Full text] [CrossRef] [Medline]
- Rhodius-Meester HF, Paajanen T, Koikkalainen J, Mahdiani S, Bruun M, Baroni M, et al. cCOG: a web-based cognitive test tool for detecting neurodegenerative disorders. Alzheimers Dement (Amst). Aug 25, 2020;12(1):e12083. [FREE Full text] [CrossRef] [Medline]
- Chan JY, Wong A, Yiu B, Mok H, Lam P, Kwan P, et al. Electronic cognitive screen technology for screening older adults with dementia and mild cognitive impairment in a community setting: development and validation study. J Med Internet Res. Dec 18, 2020;22(12):e17332. [FREE Full text] [CrossRef] [Medline]
- Suzumura S, Osawa A, Maeda N, Sano Y, Kandori A, Mizuguchi T, et al. Differences among patients with Alzheimer's disease, older adults with mild cognitive impairment and healthy older adults in finger dexterity. Geriatr Gerontol Int. Jun 2018;18(6):907-914. [CrossRef] [Medline]
- Wu YH, Vidal JS, de Rotrou J, Sikkes SA, Rigaud AS, Plichart M. Can a tablet-based cancellation test identify cognitive impairment in older adults? PLoS One. Jul 24, 2017;12(7):e0181809. [FREE Full text] [CrossRef] [Medline]
- Meier IB, Buegler M, Harms R, Seixas A, Çöltekin A, Tarnanas I. Using a digital neuro signature to measure longitudinal individual-level change in Alzheimer's disease: the Altoida large cohort study. NPJ Digit Med. Jun 24, 2021;4(1):101. [FREE Full text] [CrossRef] [Medline]
- Rai L, Boyle R, Brosnan L, Rice H, Farina F, Tarnanas I, et al. Digital biomarkers based individualized prognosis for people at risk of dementia: the AltoidaML multi-site external validation study. Adv Exp Med Biol. 2020;1194:157-171. [CrossRef] [Medline]
- Seixas AA, Rajabli F, Pericak-Vance MA, Jean-Louis G, Harms RL, Tarnanas I. Associations of digital neuro-signatures with molecular and neuroimaging measures of brain resilience: the Altoida large cohort study. Front Psychiatry. Aug 09, 2022;13:899080. [FREE Full text] [CrossRef] [Medline]
- Alegret M, Muñoz N, Roberto N, Rentz DM, Valero S, Gil S, et al. A computerized version of the short form of the face-name associative memory exam (FACEmemory®) for the early detection of Alzheimer's disease. Alzheimers Res Ther. Mar 16, 2020;12(1):25. [FREE Full text] [CrossRef] [Medline]
- Alegret M, Sotolongo-Grau O, de Antonio EE, Pérez-Cordón A, Orellana A, Espinosa A, et al. Automatized FACEmemory® scoring is related to Alzheimer's disease phenotype and biomarkers in early-onset mild cognitive impairment: the BIOFACE cohort. Alzheimers Res Ther. Mar 18, 2022;14(1):43. [CrossRef] [Medline]
- Gielis K, Vanden Abeele ME, Verbert K, Tournoy J, De Vos M, Vanden Abeele V. Detecting mild cognitive impairment via digital biomarkers of cognitive performance found in Klondike solitaire: a machine-learning study. Digit Biomark. Feb 19, 2021;5(1):44-52. [FREE Full text] [CrossRef] [Medline]
- Gielis K, Vanden Abeele ME, De Croon R, Dierick P, Ferreira-Brito F, Van Assche L, et al. Dissecting digital card games to yield digital biomarkers for the assessment of mild cognitive impairment: methodological approach and exploratory study. JMIR Serious Games. Nov 04, 2021;9(4):e18359. [FREE Full text] [CrossRef] [Medline]
- Rapp SR, Barnard RT, Sink KM, Chamberlain DG, Wilson V, Lu L, et al. Computer simulations for assessing cognitively intensive instrumental activities of daily living in older adults. Alzheimers Dement (Amst). Feb 23, 2018;10:237-244. [FREE Full text] [CrossRef] [Medline]
- Ip EH, Barnard R, Marshall SA, Lu L, Sink K, Wilson V, et al. Development of a video-simulation instrument for assessing cognition in older adults. BMC Med Inform Decis Mak. Dec 06, 2017;17(1):161. [FREE Full text] [CrossRef] [Medline]
- Eraslan Boz H, Limoncu H, Zygouris S, Tsolaki M, Giakoumis D, Votis K, et al. A new tool to assess amnestic mild cognitive impairment in Turkish older adults: virtual supermarket (VSM). Neuropsychol Dev Cogn B Aging Neuropsychol Cogn. Sep 2020;27(5):639-653. [CrossRef] [Medline]
- Zygouris S, Giakoumis D, Votis K, Doumpoulakis S, Ntovas K, Segkouli S, et al. Can a virtual reality cognitive training application fulfill a dual role? Using the virtual supermarket cognitive training application as a screening tool for mild cognitive impairment. J Alzheimers Dis. 2015;44(4):1333-1347. [CrossRef] [Medline]
- Zygouris S, Iliadou P, Lazarou E, Giakoumis D, Votis K, Alexiadis A, et al. Detection of mild cognitive impairment in an at-risk group of older adults: can a novel self-administered serious game-based screening test improve diagnostic accuracy? J Alzheimers Dis. 2020;78(1):405-412. [FREE Full text] [CrossRef] [Medline]
- Zygouris S, Ntovas K, Giakoumis D, Votis K, Doumpoulakis S, Segkouli S, et al. A preliminary study on the feasibility of using a virtual reality cognitive training application for remote detection of mild cognitive impairment. J Alzheimers Dis. 2017;56(2):619-627. [CrossRef] [Medline]
- Holmes AA, Tripathi S, Katz E, Mondesire-Crump I, Mahajan R, Ritter A, et al. A novel framework to estimate cognitive impairment via finger interaction with digital devices. Brain Commun. Jul 28, 2022;4(4):fcac194. [FREE Full text] [CrossRef] [Medline]
- Ntracha A, Iakovakis D, Hadjidimitriou S, Charisis VS, Tsolaki M, Hadjileontiadis LJ. Detection of mild cognitive impairment through natural language and touchscreen typing processing. Front Digit Health. Oct 8, 2020;2:567158. [FREE Full text] [CrossRef] [Medline]
- Tröger J, Baykara E, Zhao J, Ter Huurne D, Possemis N, Mallick E, et al. Validation of the remote automated ki:e speech biomarker for cognition in mild cognitive impairment: verification and validation following DiME V3 framework. Digit Biomark. Sep 30, 2022;6(3):107-116. [FREE Full text] [CrossRef] [Medline]
- Robin J, Xu M, Kaufman LD, Simpson W. Using digital speech assessments to detect early signs of cognitive impairment. Front Digit Health. Oct 27, 2021;3:749758. [FREE Full text] [CrossRef] [Medline]
- Yamada Y, Shinkawa K, Kobayashi M, Nishimura M, Nemoto M, Tsukada E, et al. Tablet-based automatic assessment for early detection of Alzheimer's disease using speech responses to daily life questions. Front Digit Health. Mar 17, 2021;3:653904. [FREE Full text] [CrossRef] [Medline]
- Hartle L, Martorelli M, Balboni G, Souza R, Charchat-Fichman H. Diagnostic accuracy of CompCog: reaction time as a screening measure for mild cognitive impairment. Arq Neuropsiquiatr. Jun 2022;80(6):570-579. [FREE Full text] [CrossRef] [Medline]
- Ma Y, Carlsson CM, Wahoske ML, Blazel HM, Chappell RJ, Johnson SC, et al. Latent factor structure and measurement invariance of the NIH Toolbox Cognition Battery in an Alzheimer's disease research sample. J Int Neuropsychol Soc. May 2021;27(5):412-425. [FREE Full text] [CrossRef] [Medline]
- Oliva I, Losa J. Validation of the computerized cognitive assessment test: NNCT. Int J Environ Res Public Health. Aug 23, 2022;19(17):10495. [FREE Full text] [CrossRef] [Medline]
- Valladares-Rodriguez S, Fernández-Iglesias MJ, Anido-Rifón L, Facal D, Pérez-Rodríguez R. Episodix: a serious game to detect cognitive impairment in senior adults. A psychometric study. PeerJ. Sep 05, 2018;6:e5478. [FREE Full text] [CrossRef] [Medline]
- Samaroo A, Amariglio RE, Burnham S, Sparks P, Properzi M, Schultz AP, et al. Diminished learning over repeated exposures (LORE) in preclinical Alzheimer's disease. Alzheimers Dement (Amst). Jan 05, 2021;12(1):e12132. [FREE Full text] [CrossRef] [Medline]
- Saxton J, Morrow L, Eschman A, Archer G, Luther J, Zuccolotto A. Computer assessment of mild cognitive impairment. Postgrad Med. Mar 2009;121(2):177-185. [FREE Full text] [CrossRef] [Medline]
- Papp KV, Rentz DM, Maruff P, Sun CK, Raman R, Donohue MC, et al. The computerized cognitive composite (C3) in an Alzheimer's disease secondary prevention trial. J Prev Alzheimers Dis. 2021;8(1):59-67. [FREE Full text] [CrossRef] [Medline]
- Jutten RJ, Rentz DM, Fu JF, Mayblyum DV, Amariglio RE, Buckley RF, et al. Monthly at-home computerized cognitive testing to detect diminished practice effects in preclinical Alzheimer's disease. Front Aging Neurosci. Jan 13, 2022;13:800126. [FREE Full text] [CrossRef] [Medline]
- Bloniecki V, Hagman G, Ryden M, Kivipelto M. Digital screening for cognitive impairment - a proof of concept study. J Prev Alzheimers Dis. 2021;8(2):127-134. [CrossRef] [Medline]
- Chin J, Kim DE, Lee H, Yun J, Lee BH, Park J, et al. A validation study of the inbrain CST: a tablet computer-based cognitive screening test for elderly people with cognitive impairment. J Korean Med Sci. Aug 31, 2020;35(34):e292. [FREE Full text] [CrossRef] [Medline]
- Kalafatis C, Modarres MH, Apostolou P, Marefat H, Khanbagi M, Karimi H, et al. Validity and cultural generalisability of a 5-minute AI-based, computerised cognitive assessment in mild cognitive impairment and Alzheimer's dementia. Front Psychiatry. Jul 22, 2021;12:706695. [FREE Full text] [CrossRef] [Medline]
- Sloane KL, Mefford JA, Zhao Z, Xu M, Zhou G, Fabian R, et al. Validation of a mobile, sensor-based neurobehavioral assessment with digital signal processing and machine-learning analytics. Cogn Behav Neurol. Sep 01, 2022;35(3):169-178. [CrossRef] [Medline]
- Cerino ES, Katz MJ, Wang C, Qin J, Gao Q, Hyun J, et al. Variability in cognitive performance on mobile devices is sensitive to mild cognitive impairment: results from the Einstein Aging Study. Front Digit Health. Dec 03, 2021;3:758031. [FREE Full text] [CrossRef] [Medline]
- Park JH, Jung M, Kim J, Park HY, Kim JR, Park JH. Validity of a novel computerized screening test system for mild cognitive impairment. Int Psychogeriatr. Oct 2018;30(10):1455-1463. [CrossRef]
- Moore RC, Ackerman RA, Russell MT, Campbell LM, Depp CA, Harvey PD, et al. Feasibility and validity of ecological momentary cognitive testing among older adults with mild cognitive impairment. Front Digit Health. Aug 05, 2022;4:946685. [FREE Full text] [CrossRef] [Medline]
- Dorociak KE, Mattek N, Lee J, Leese MI, Bouranis N, Imtiaz D, et al. The survey for memory, attention, and reaction time (SMART): development and validation of a brief web-based measure of cognition for older adults. Gerontology. 2021;67(6):740-752. [FREE Full text] [CrossRef] [Medline]
- Rodríguez-Salgado AM, Llibre-Guerra JJ, Tsoy E, Peñalver-Guia AI, Bringas G, Erlhoff SJ, et al. A brief digital cognitive assessment for detection of cognitive impairment in Cuban older adults. J Alzheimers Dis. 2021;79(1):85-94. [FREE Full text] [CrossRef] [Medline]
- Tsoy E, Erlhoff SJ, Goode CA, Dorsman KA, Kanjanapong S, Lindbergh CA, et al. BHA-CS: a novel cognitive composite for Alzheimer's disease and related disorders. Alzheimers Dement (Amst). Jun 21, 2020;12(1):e12042. [FREE Full text] [CrossRef] [Medline]
- Tsoy E, Strom A, Iaccarino L, Erlhoff SJ, Goode CA, Rodriguez AM, et al. Detecting Alzheimer's disease biomarkers with a brief tablet-based cognitive battery: sensitivity to Aβ and tau PET. Alzheimers Res Ther. Feb 08, 2021;13(1):36. [FREE Full text] [CrossRef] [Medline]
- Possin KL, Moskowitz T, Erlhoff SJ, Rogers KM, Johnson ET, Steele NZ, et al. The brain health assessment for detecting and diagnosing neurocognitive disorders. J Am Geriatr Soc. Jan 2018;66(1):150-156. [FREE Full text] [CrossRef] [Medline]
- Huang YP, Singh A, Chen S, Sun FJ, Huang CR, Liu SI. Validity of a novel touch screen tablet-based assessment for mild cognitive impairment and probable AD in older adults. Assessment. Dec 2019;26(8):1540-1553. [CrossRef] [Medline]
- Software as a Medical Device (SAMD): clinical evaluation. U.S. Food & Drug Administration. Dec 2017. URL: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/software-medical-device-samd-clinical-evaluation [accessed 2025-05-20]
- Jack CRJ, Andrews JS, Beach TG, Buracchio T, Dunn B, Graf A, et al. Revised criteria for diagnosis and staging of Alzheimer's disease: Alzheimer's association workgroup. Alzheimers Dement. Aug 2024;20(8):5143-5169. [CrossRef] [Medline]
- Dubois B, Villain N, Schneider L, Fox N, Campbell N, Galasko D, et al. Alzheimer disease as a clinical-biological construct-an international working group recommendation. JAMA Neurol. Dec 01, 2024;81(12):1304-1311. [FREE Full text] [CrossRef] [Medline]
- Rentz DM, Wessels AM, Annapragada AV, Berger AK, Edgar CJ, Gold M, et al. Building clinically relevant outcomes across the Alzheimer's disease spectrum. Alzheimers Dement (N Y). Jun 26, 2021;7(1):e12181. [FREE Full text] [CrossRef] [Medline]
- Early Alzheimer’s disease: developing drugs for treatment. U.S. Food & Drug Administration. Mar 2024. URL: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/early-alzheimers-disease-developing-drugs-treatment [accessed 2025-05-20]
- Edgar CJ, Vradenburg G, Hassenstab J. The 2018 revised FDA guidance for early Alzheimer's disease: establishing the meaningfulness of treatment effects. J Prev Alzheimers Dis. 2019;6(4):223-227. [CrossRef] [Medline]
- Elsman EB, Mokkink LB, Terwee CB, Beaton D, Gagnier JJ, Tricco AC, et al. Guideline for reporting systematic reviews of outcome measurement instruments (OMIs): PRISMA-COSMIN for OMIs 2024. Qual Life Res. Aug 2024;33(8):2029-2046. [CrossRef] [Medline]
- Prinsen CA, Mokkink LB, Bouter LM, Alonso J, Patrick DL, de Vet HC, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res. May 2018;27(5):1147-1157. [FREE Full text] [CrossRef] [Medline]
- Mokkink LB, de Vet HC, Prinsen CA, Patrick DL, Alonso J, Bouter LM, et al. COSMIN risk of bias checklist for systematic reviews of patient-reported outcome measures. Qual Life Res. May 2018;27(5):1171-1179. [FREE Full text] [CrossRef] [Medline]
- Terwee CB, Prinsen CA, Chiarotto A, Westerman MJ, Patrick DL, Alonso J, et al. COSMIN methodology for evaluating the content validity of patient-reported outcome measures: a Delphi study. Qual Life Res. May 2018;27(5):1159-1170. [FREE Full text] [CrossRef] [Medline]
Abbreviations
AD: Alzheimer disease |
ARC: Ambulatory Research in Cognition |
ASRT: automated story recall task |
AUC: area under the curve |
BRANCH: Boston Remote Assessment for Neurocognitive Health |
C3: Computerized Cognitive Composite |
CST: Cognitive Screening Test |
CU: cognitively unimpaired |
ICC: intraclass correlation coefficient |
LORE: Learning Over Repeated Exposures |
MCI: mild cognitive impairment |
NIHTB-CB: National Institutes of Health Toolbox Cognition Battery |
PACC: Preclinical Alzheimer’s Cognitive Composite |
PAL: paired associate learning |
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses |
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews |
SB-C: speech biomarker for cognition |
SIMBAC: Simulation-Based Assessment of Cognition |
UCSF-BHA: University of California, San Francisco, Brain Health Assessment |
VST: Virtual Supermarket Test |
WLA: Winterlight Assessment |
Edited by T de Azevedo Cardoso; submitted 12.08.24; peer-reviewed by J Hassenstab, G Butera, S Stout, J-H Park; comments to author 17.01.25; revised version received 24.01.25; accepted 24.02.25; published 27.05.25.
Copyright©Rosanne L van den Berg, Sophie M van der Landen, Matthijs J Keijzer, Aniek M van Gils, Maureen van Dam, Kirsten A Ziesemer, Roos J Jutten, John E Harrison, Casper de Boer, Wiesje M van der Flier, Sietske AM Sikkes. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 27.05.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.