Review
Abstract
Background: Most colorectal polyps are diminutive and benign, especially those in the rectosigmoid colon, and the resection of these polyps is not cost-effective. Advancements in image-enhanced endoscopy have improved the optical prediction of colorectal polyp histology. However, subjective interpretability and inter- and intraobserver variability prohibits widespread implementation. The number of studies on computer-aided diagnosis (CAD) is increasing; however, their small sample sizes limit statistical significance.
Objective: This review aims to evaluate the diagnostic test accuracy of CAD models in predicting the histology of diminutive colorectal polyps by using endoscopic images.
Methods: Core databases were searched for studies that were based on endoscopic imaging, used CAD models for the histologic diagnosis of diminutive colorectal polyps, and presented data on diagnostic performance. A systematic review and diagnostic test accuracy meta-analysis were performed.
Results: Overall, 13 studies were included. The pooled area under the curve, sensitivity, specificity, and diagnostic odds ratio of CAD models for the diagnosis of diminutive colorectal polyps (adenomatous or neoplastic vs nonadenomatous or nonneoplastic) were 0.96 (95% CI 0.93-0.97), 0.93 (95% CI 0.91-0.95), 0.87 (95% CI 0.76-0.93), and 87 (95% CI 38-201), respectively. The meta-regression analysis showed no heterogeneity, and no publication bias was detected. Subgroup analyses showed robust results. The negative predictive value of CAD models for the diagnosis of adenomatous polyps in the rectosigmoid colon was 0.96 (95% CI 0.95-0.97), and this value exceeded the threshold of the diagnosis and leave strategy.
Conclusions: CAD models show potential for the optical histological diagnosis of diminutive colorectal polyps via the use of endoscopic images.
Trial Registration: PROSPERO CRD42021232189; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=232189
doi:10.2196/29682
Keywords
Introduction
Colorectal cancer (CRC) is the third most common cancer based on incidence statistics and the second leading cause of cancer-related deaths worldwide [
]. Most CRCs arise from benign neoplastic polyps, such as adenomas [ ]. Colonoscopy with the identification and removal of these neoplastic polyps is a standard screening method for CRC, which has been proven to reduce cancer-related mortality [ , ]. This is because polyp removal through colonoscopy prevents the development of CRCs by interrupting the adenoma-carcinoma sequence, which is the most reliable stepwise pathogenesis of CRC development [ ].With regard to the size of colorectal polyps, 90%-95% of the detected polyps are <1 cm, and about half of them are nonneoplastic [
- ]. In the context of diminutive colorectal polyps (DCPs; ≤5 mm), only 0.5%-1.7% of cases had advanced histology, indicating a lower probability of developing CRCs [ , - ]. However, current practice points out the removal of all detected polyps and sending them for histologic evaluation [ ]. This may help determine the surveillance interval for CRC screening with per-patient risk stratification [ ]. However, unnecessary polypectomy carries the risk of procedure-related adverse events and is not cost-effective [ , ].With the advancement of image-enhanced endoscopy, optical diagnosis has been attempted to predict the histology of the detected polyps during colonoscopy by characterizing the surface morphology. This can reduce the need for histologic evaluation after the removal of neoplastic lesions with a small risk of having an invasive component [
]. The unnecessary removal of benign polyps can be avoided with the adaptation of this technique. Therefore, optical diagnosis using electronic or dye-based methods has been recommended for histological classification in clinical practice [ ]. In accordance with this technique, Preservation and Incorporation of Valuable Endoscopic Innovations (PIVI) performance thresholds for in situ endoscopic histology prediction (optical biopsy) required for resect and discard and diagnose and leave strategies have been suggested for the management of DCPs [ ]. For the management of polyps suspected as neoplasm with diminutive size based on the optical biopsy, a resect and discard strategy should satisfy >90% agreement in postpolypectomy surveillance intervals compared with histologic assessment [ ]. For nonneoplastic polyps <5 mm in the rectosigmoid colon, the negative predictive value (NPV) should be >90% to adopt the diagnose and leave strategy based on the optical biopsy in PIVI performance thresholds [ ]. However, only studies by experienced endoscopists with a high level of confidence showed benefits in optical biopsy [ ]. Subjective interpretability, inter- or intraobserver variability, and the learning curve prohibits the widespread implementation of this technique.Studies on computer-aided diagnosis (CAD) using deep learning or machine learning methods to define the accuracy of CAD models are increasing [
, ]. The performance of the CAD model was not influenced by the endoscopists’ level of confidence, and the CAD model consistently provided robust answers. However, studies with small sample sizes have inadequate statistical strength. Thus, this study aims to evaluate the diagnostic test accuracy (DTA) of CAD models used for the histologic diagnosis of DCPs using endoscopic images.Methods
Adherence to the Statement of Systematic Review and Protocol Administration
This study was conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Review and Meta-Analyses) of DTA Studies [
]. The study protocol was registered at the International Prospective Register of Systematic Reviews database before the initiation of the systematic review (CRD42021232189). Approval from the institutional review board of the Chuncheon Sacred Heart Hospital was waived.Literature Search
Two authors (CSB and JJL) independently performed a core database search of MEDLINE, PubMed, Embase, and Cochrane Library using common search formulas, from inception to January 2020. Duplicate articles were excluded from the analyses. The titles and abstracts of all identified articles were reviewed, and irrelevant articles were excluded. Full-text reviews were subsequently performed to determine whether the pre-established inclusion criteria were satisfied in the identified literature. References were also reviewed to identify any additional relevant articles. Any disagreements in the results of the search process between the 2 authors were resolved by discussion or consultation with a third author (GHB). The search formulas used to identify the relevant articles are presented in
.Literature search strategy for the core databases. tiab: searching code for title and abstract; Mesh: Medical Subject Headings; ab,ti,kw: searching code for abstract, title, and keywords; Lang: searching code for language; lim: searching code by limiting certain conditions.
MEDLINE (Through PubMed)
- artificial intelligence[tiab] OR AI[tiab] OR deep learning[tiab] OR machine learning[tiab] OR computer[tiab] OR neural network[tiab] OR CNN[tiab] OR automatic[tiab] OR automated[tiab]: 502318
- diminutive[tiab] OR small[tiab]: 1417047
- polyp[tiab] OR polyps[Mesh]: 40395
- 1 AND 2 AND 3: 128
- 5-4 AND English[Lang]: 125
Embase
- artificial intelligence:ab,ti,kw OR AI:ab,ti,kw OR deep learning:ab,ti,kw OR machine learning:ab,ti,kw OR computer:ab,ti,kw OR neural network:ab,ti,kw OR CNN:ab,ti,kw OR automatic:ab,ti,kw OR automated: 638513
- diminutive:ab,ti,kw OR small: ab,ti,kw: 2056711
- polyp:ab,ti,kw: 29836
- 3-1 AND 2 AND 3: 198
- 4-3 AND ([article]/lim OR [article in press]/lim OR [review]/lim) AND [English]/lim: 104
Cochrane Library
- artificial intelligence:ab,ti,kw or AI:ab,ti,kw or deep learning:ab,ti,kw or machine learning:ab,ti,kw or computer:ab,ti,kw or neural network:ab,ti,kw or CNN:ab,ti,kw or automatic:ab,ti,kw or automated:ab,ti,kw: 56749
- Mesh descriptor: [polyps] explode all trees: 1087
- polyp:ab,ti,kw: 2855
- 2 or 3: 3397
- diminutive:ab,ti,kw or small:ab,ti,kw: 83388
- 1 and 4 and 5: 48 trials (2021-1-28)
Literature Selection Criteria
The literature included in this systematic review should meet the following inclusion criteria: designed to evaluate the diagnostic performance of CAD models in the prediction of histology of DCPs based on endoscopic images; presentation of the diagnostic performance of CAD models, including sensitivity, specificity, likelihood ratios, predictive values, or accuracy, which enabled the estimation of true-positive (TP), false-positive (FP), false-negative (FN), and true-negative (TN) values for the histologic diagnosis of DCPs based on endoscopic images; and studies written in English. The exclusion criteria were as follows: narrative review articles; studies with incomplete data; systematic reviews or meta-analyses; and comments, proceedings, or study protocols. Articles meeting at least one of the exclusion criteria were excluded from this systematic review.
Methodological Quality Evaluation
Two authors (CSB and JJL) assessed the methodological quality of the final included articles using the second version of the Quality Assessment of Diagnostic Accuracy Studies. This tool comprises four domains: patient selection, index test, reference standard, and flow and timing, and the first three domains have an applicability assessment. The 2 authors assessed each part as having a high, low, or unclear risk of bias [
].Data Extraction, Primary Outcomes, and Additional Analyses
Two authors (CSB and JJL) independently extracted the data from each included study and cross-checked the extracted data. If the data were unclear, the corresponding author of the study was contacted by email to obtain insight into the original data set. A descriptive synthesis was performed using a systematic review process, and DTA meta-analysis was conducted if the included studies were sufficiently homogenous.
The primary outcomes were the TP, FP, FN, and TN values in each study. For the CAD of the histology of DCPs using endoscopic images, the primary outcomes were defined as follows: TP referred to the number of subjects with a positive finding by a CAD model and have adenomas or neoplasms as evidenced by endoscopic images; FP referred to the number of subjects with a positive finding by a CAD model and do not have adenomas or neoplasms based on endoscopic images; FN referred to the number of subjects with a negative finding by a CAD model and have adenomas or neoplasms as evidenced by endoscopic images; and TN referred to the number of subjects with a negative finding on a CAD model and do not have adenomas or neoplasms based on the endoscopic images. With these definitions, the TP, FP, FN, and TN values were calculated for each included study. If the included studies presented comparative diagnostic performance of endoscopists versus CAD models, the TP, FP, FN, and TN values of endoscopists in each study were also extracted.
For additional analyses, such as subgroup analysis or meta-regression, the authors extracted the following variables from each included study: publication year, geographic origin of the data (ie, Western vs Asian), type of endoscopic images, type of CAD models, location of the DCPs (ie, any colon vs rectosigmoid colon), number of total images included, and type of test data sets (internal test vs external test).
Statistics
The bivariate method [
] and hierarchical summary receiver operating characteristic (HSROC) method [ ] were applied for the DTA meta-analysis. A forest plot of the sensitivity and specificity and a summary receiver operating characteristic (SROC) curve were generated using the bivariate method [ ] and HSROC [ ] method, respectively. The level of heterogeneity across the included articles was determined by the correlation coefficient between logit-transformed sensitivity and specificity by the bivariate method and the asymmetry parameter β, where β=0 corresponds to a symmetric receiver operating characteristic curve, in which the diagnostic odds ratio (DOR) does not vary along the curve according to the HSROC method. A positive correlation coefficient and a β with a significant probability (P<.05) indicated heterogeneity between the studies [ , ]. A visual examination of the SROC curve was also performed to identify heterogeneity. Subgroup analysis by univariate meta-regression using the modifiers identified during the systematic review was also performed to identify the reasons for heterogeneity. The pooled NPV by integrating conditional prevalence with respect to a previous distribution (considering the heterogeneity in prevalence) was calculated using a probability-modifying plot.STATA software version 15.1, including the METANDI and MIDAS packages, was used for the DTA meta-analysis. The METANDI and MIDAS packages require the inclusion of a minimum of four studies for DTA meta-analysis. Therefore, if less than four studies were included in the subgroup analysis, the Moses-Shapiro-Littenberg method [
], as implemented in Meta-DiSc 1.4 (XI Cochrane Colloquium), was used. Publication bias was evaluated using the Deek funnel plot asymmetry test.Results
Study Selection
A total of 277 articles were identified following a literature search of the three core databases. Two additional studies were identified by manual screening of the bibliographies. After excluding 111 duplicate studies, 57 additional articles were excluded after reviewing the titles and abstracts. Full-text versions of the remaining 111 articles were obtained and thoroughly reviewed based on the aforementioned inclusion and exclusion criteria. Among these, 98 articles were excluded from the final enrollment for the following reasons: 73 (74%) for incomplete data, 11 (11%) for narrative review, 8 (8%) for study protocol, and 6 (6%) for systematic review or meta-analysis. Finally, 13 studies [
- ] were included in the systematic review. A flowchart of the selection process is presented in .Clinical Features in Included Studies
The identified studies established and explored the diagnostic performance of CAD models for the classification of adenomatous or neoplastic versus nonadenomatous or nonneoplastic polyps. Among the 13 studies for the CAD of DCPs, 6564 images were identified (3207 cases vs 3357 controls).
Seven studies [
, , , - ] used endoscopic images from Asian populations, and six studies [ - , , , ] used endoscopic images from Western populations. All included studies adopted the definition of DCPs as size <5 mm. However, Shahidi et al [ ] adopted a stricter definition for DCPs with a size <3 mm. With regard to the type of CAD model, a deep neural network or convolutional neural network was used in six studies [ - , ], a support vector machine in six studies [ - , - ], and a software-based automatic color intensity analysis in one study [ ]. White-light imaging is currently the standard method for inspecting endoscopic lesions. However, one study [ ] used white-light imaging to establish a CAD model, and most of the included studies used image-enhanced endoscopic images, such as narrow-band imaging [ - , , , ] or autofluorescence imaging [ ] with or without magnification for the detailed characterization of the morphology of DCPs. Three studies [ , , ] have used images of endocytoscopy, which is a specialized endoscopy that allows the analysis of mucosal structures at the cellular level [ ]. With regard to the location of DCPs, most studies [ , - , , , - ] did not consider the location of DCPs. However, three studies [ , , ] separately collected DCPs from the rectosigmoid colon and evaluated the diagnostic performance of the CAD model for these polyps. Most of the included studies [ , - , - ], except for two studies [ , ], have evaluated diagnostic performance using an internal test data set. A study by Zachariah et al [ ] presented both external and internal test performance, and a study by Kudo et al [ ] presented an external test format of the CAD model previously established by Mori et al in 2016 [ ] and 2018 [ , ]. Five studies [ , , , , ] have presented the comparative diagnostic performance of endoscopists versus CAD models for the prediction of histology in DCPs. Detailed clinical features of the included studies are presented in .Study (year) | Nationality (data) | Definition of diminutive polyp | Type of CADa models | Type of endoscopic images | Type of case and controls | Location of polyps | Type of test data sets | Number of cases in test data set (adenoma) | Number of controls in test data set | TPb | FPc | FNd | TNe | Performance of endoscopists (TP/FP/FN/TN) |
Eladio Rodriguez-Diaz et al (2020) [ | ]United States | ≤5 mm | CNNf | NBIg with near focus magnification | Neoplastic versus nonneoplastic polyp | All | Internal test | 93 | 75 | 88 | 9 | 5 | 66 | N/Ah |
Zachariah et al (2020) [ | ]United States | ≤5 mm | CNN | WLIi or NBI | Adenomatous versus nonadenomatous | RSj colon | Internal test | 119 | 472 | 107 | 38 | 12 | 434 | N/A |
Zachariah et al (2020) [ | ]United States | ≤5 mm | CNN | WLI or NBI | Adenomatous versus nonadenomatous | RS colon | External test | 183 | 503 | 167 | 60 | 16 | 443 | N/A |
Shahidi et al (2020) [ | ]Canada | ≤3 mm | CNN | NBI with or without near focus magnification | Adenomatous versus nonadenomatous | All | Internal test | 458 | 186 | 409 | 168 | 49 | 18 | N/A |
Jin et al (2020) [ | ]South Korea | ≤5 mm | CNN | NBI with or without near focus magnification | Adenomatous versus hyperplastic polyp | All | Internal test | 180 | 120 | 150 | 10 | 30 | 110 | N/A |
Byrne et al (2019) [ | ]Canada | ≤5 mm | CNN | NBI with or without near focus magnification | Adenomatous versus hyperplastic polyp | All | Internal test | 66 | 40 | 65 | 7 | 1 | 33 | 43/15/9/35 |
Horiuchi et al (2019) [ | ]Japan | ≤5 mm | Software-based automatic color intensity analysis | AFIk | Neoplastic versus nonneoplastic polyp | All | Internal test | 212 | 217 | 164 | 15 | 48 | 202 | N/A |
Horiuchi et al (2019) [ | ]Japan | ≤5 mm | Software-based automatic color intensity analysis | TMEl (WLI, NBI with magnification, and AFI) | Neoplastic versus nonneoplastic polyp | All | Internal test | 212 | 217 | 191 | 18 | 21 | 199 | N/A |
Horiuchi et al (2019) [ | ]Japan | ≤5 mm | Software-based automatic color intensity analysis | AFI | Neoplastic versus nonneoplastic polyp | RS colon | Internal test | 65 | 193 | 52 | 9 | 13 | 184 | N/A |
Horiuchi et al (2019) [ | ]Japan | ≤5 mm | Software-based automatic color intensity analysis | TME (WLI, NBI with magnification, and AFI) | Neoplastic versus nonneoplastic polyp | RS colon | Internal test | 65 | 193 | 55 | 8 | 10 | 185 | N/A |
Kudo et al (2019) [ | ]Japan | ≤5 mm | SVMm | Endocytoscope with NBI | Neoplastic versus nonneoplastic polyp | All | External test | 1000 | 680 | 960 | 40 | 40 | 640 | 459/12/41/328 (expert); 578/97/422/583 (trainee) |
Kudo et al (2019) [ | ]Japan | ≤5 mm | SVM | Endocytoscope with CEn (methylene blue) | Neoplastic versus nonneoplastic polyp | All | External test | 1000 | 680 | 960 | 0 | 40 | 680 | 453/20/47/320 (expert); 690/236/310/444 (trainee) |
Cristina Sánchez-Montes et al (2019) [ | ]Spain | ≤5 mm | SVM | WLI | Neoplastic versus nonneoplastic polyp | All | Internal test | 50 | 50 | 43 | 6 | 7 | 44 | N/A |
Mori et al (2018) [ | ]Japan | ≤5 mm | SVM | Endocytoscope with NBI | Neoplastic versus nonneoplastic polyp | All | Internal test | 287 | 185 | 268 | 16 | 19 | 159 | N/A |
Mori et al (2018) [ | ]Japan | ≤5 mm | SVM | Endocytoscope with CE (methylene blue) | Neoplastic versus nonneoplastic polyp | All | Internal test | 287 | 185 | 263 | 17 | 24 | 158 | N/A |
Mori et al (2018) [ | ]Japan | ≤5 mm | SVM | Endocytoscope with NBI | Neoplastic versus nonneoplastic polyp | RS colon | Internal test | 104 | 144 | 98 | 6 | 6 | 138 | N/A |
Mori et al (2018) [ | ]Japan | ≤5 mm | SVM | Endocytoscope with CE (methylene blue) | Neoplastic versus nonneoplastic polyp | RS colon | Internal test | 104 | 144 | 96 | 11 | 8 | 133 | N/A |
Chen et al (2018) [ | ]Taiwan | ≤5 mm | Deep neural network | NBI with magnification | Neoplastic versus hyperplastic polyp | All | Internal test | 188 | 96 | 181 | 21 | 7 | 75 | 367/55/9/137 (expert); 671/95/81/289 (novice) |
Mori et al (2016) [ | ]Japan | ≤5 mm | SVM | Endocytoscope with WLI | Neoplastic versus nonneoplastic polyp | All | Internal test | 19 | 36 | 18 | 2 | 1 | 34 | 248/16/25/128 (expert); 646/106/264/374 (nonexpert) |
Kominami et al (2016) [ | ]Japan | ≤5 mm | SVM | NBI with magnification | Neoplastic versus nonneoplastic polyp | All | Internal test | 43 | 45 | 40 | 3 | 3 | 42 | N/A |
Gross et al (2011) [ | ]Germany | ≤5 mm | SVM | NBI with magnification | Adenomatous versus nonneoplastic polyp | All | Internal test | 140 | 135 | 133 | 11 | 7 | 124 | 217/17/23/253 (expert); 188/26/52/244 (nonexpert) |
aCAD: computer-aided diagnosis.
bTP: true-positive.
cFP: false-positive.
dFN: false-negative.
eTN: true-negative.
fCNN: convolutional neural network.
gNBI: narrow-band imaging.
hN/A: not applicable.
iWLI: white-light imaging.
jRS: rectosigmoid.
kAFI: autofluorescence imaging.
lTME: trimodal imaging endoscopy.
mSVM: support vector machine.
nCE: chromoendoscopy.
Quality Assessment of Study Methodology
The quality of the baseline image data is important because the CAD model is established using the learning features of the baseline training data. Theoretically, the images included in each study should reflect real-world conditions, as the CAD model was established for use in clinical practice. However, as some lesions are rare or abnormal, data imbalance is the main barrier to the learning of CAD models. Most of the included studies in the systematic review attempted to mitigate this pitfall by adopting specific inclusion and exclusion criteria for the enrollment of endoscopic images. However, four studies [
, , , ] did not include a detailed description of the image enrollment standard. Therefore, these studies were rated as unclear risk in the patient selection domain ( and ). This binary classification of low risk and unclear risk in the patient selection domain was adopted as a modifier in the subgroup or meta-regression analysis.DTA Meta-analysis of CAD Models
Among the 13 studies [
- ] for the meta-analysis of CAD of DCPs, the area under the curve (AUC), sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and DOR of CAD models for the diagnosis of DCPs were 0.96 (95% CI 0.93-0.97), 0.93 (95% CI 0.91-0.95), 0.87 (95% CI 0.76-0.93), 7.1 (95% CI 3.8-13.3), 0.08 (95% CI 0.06-0.11), and 87 (95% CI 38-201), respectively ( ; ). The SROC curve is shown in . To investigate the clinical utility of the CAD models, Fagan nomogram was generated. Positive findings indicated that adenomas or neoplasms were detected by the CAD models. Negative findings indicated that nonadenomas or nonneoplasms were detected by the CAD models. After assuming a 49% prevalence of adenomas or neoplasms (this value was calculated from the values in ; ie, the total number of cases/controls, ie, 3207/3357, 95.53%), the Fagan nomogram shows that the posterior probability of adenomas or neoplasms was 87% if the finding of the CAD model was positive, and the posterior probability of adenoma was 7% if the finding of the CAD model was negative ( ).Subgroup | Included studies (n=13), n (%) | AUCa, mean (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | PLRb, mean (95% CI) | NLRc, mean (95% CI) | DORd (95% CI) | |
Value of meta-analysis in all the included studies | 13 (100) | 0.96 (0.93-0.97) | 0.93 (0.91-0.95) | 0.87 (0.76-0.93) | 7.1 (3.8-13.3) | 0.08 (0.06-0.11) | 87 (38-201) | |
Comparative performance of CADe models and endoscopists | ||||||||
Value of CAD models in the comparative analysis | 4 (31) | 0.96 (0.94-0.98) | 0.96 (0.95-0.97) | 0.91 (0.84-0.95) | 10.5 (5.7-19.1) | 0.05 (0.03-0.06) | 231 (113-473) | |
Value of expert endoscopists in the comparative analysis | 4 (31) | 0.97 (0.95-0.98) | 0.93 (0.89-0.96) | 0.89 (0.80-0.95) | 8.8 (4.7-16.7) | 0.08 (0.05-0.12) | 116 (80-168) | |
Value of novice endoscopists in the comparative analysis | 4 (31) | 0.85 (0.82-0.88) | 0.78 (0.68-0.86) | 0.78 (0.68-0.86) | 3.6 (2.3-5.8) | 0.28 (0.18-0.43) | 13 (6-30) | |
Methodological quality of included studies | ||||||||
High | 9 (69) | 0.97 (0.95-0.98) | 0.93 (0.90-0.95) | 0.91 (0.88-0.93) | 10.2 (7.7-13.5) | 0.08 (0.05-0.11) | 132 (83-211) | |
Low | 4 (31) | 0.92 (0.89-0.94) | 0.93 (0.88-0.96) | 0.87 (0.84-0.90) | 7.4 (5.9-9.3) | 0.08 (0.04-0.14) | 96 (52-180) | |
Nationality of data | ||||||||
Western | 6 (46) | 0.93 (0.90-0.95) | 0.92 (0.90-0.94) | 0.80 (0.51-0.94) | 4.6 (1.6-13.7) | 0.10 (0.06-0.16) | 47 (11-213) | |
Asian | 7 (54) | 0.97 (0.95-0.98) | 0.93 (0.89-0.96) | 0.91 (0.87-0.94) | 10.7 (7.4-15.3) | 0.08 (0.05-0.12) | 141 (80-248) | |
Type of test data sets | ||||||||
Internal test | 12 (92) | 0.95 (0.93-0.96) | 0.92 (0.89-0.94) | 0.86 (0.75-0.93) | 6.8 (3.5-13.3) | 0.09 (0.06-0.13) | 76 (32-179) | |
External test | 2 (15) | Null | 0.95 (0.94-0.96) | 0.92 (0.90-0.93) | 11.1 (4.7-26.4) | 0.06 (0.03-0.15) | 174 (36-841) | |
Location of polyps | ||||||||
All | 12 (92) | 0.96 (0.93-0.97) | 0.93 (0.90-0.95) | 0.87 (0.75-0.93) | 7.0 (3.6-13.9) | 0.08 (0.06-0.11) | 89 (36-220) | |
Rectosigmoid colon | 3 (23) | 0.97 (0.94-0.99) | 0.91 (0.87-0.94) | 0.91 (0.89-0.93) | 6.8 (3.5-13.3) | 0.09 (0.06-0.13) | 76 (32-179) | |
Total number of included images | ||||||||
≥200 | 8 (61) | 0.95 (0.92-0.96) | 0.92 (0.90-0.95) | 0.84 (0.65-0.94) | 5.9 (2.4-14.7) | 0.09 (0.06-0.13) | 66 (20-222) | |
<200 | 5 (38) | 0.96 (0.94-0.98) | 0.94 (0.90-0.96) | 0.89 (0.84-0.93) | 7.9 (5.5-11.2) | 0.08 (0.04-0.15) | 114 (57-230) | |
≥300 | 6 (46) | 0.94 (0.91-0.95) | 0.91 (0.88-0.94) | 0.84 (0.57-0.96) | 5.9 (1.8-19.6) | 0.10 (0.06-0.17) | 57 (12-275) | |
<300 | 7 (54) | 0.97 (0.95-0.98) | 0.94 (0.92-0.96) | 0.88 (0.83-0.92) | 8.1 (5.7-11.7) | 0.06 (0.04-0.09) | 127 (80-203) | |
Type of CAD models | ||||||||
Neural network | 6 (46) | 0.94 (0.92-0.96) | 0.93 (0.88-0.96) | 0.76 (0.48-0.92) | 4.0 (1.5-10.6) | 0.09 (0.05-0.18) | 42 (10-184) | |
SVMf | 6 (46) | 0.97 (0.96-0.98) | 0.94 (0.91-0.96) | 0.92 (0.90-0.94) | 12.1 (9.0-16.5) | 0.07 (0.04-0.10) | 186 (101-344) | |
Type of endoscopic image | ||||||||
Endocytoscope | 3 (23) | 0.98 (0.94-0.99) | 0.95 (0.94-0.97) | 0.94 (0.92-0.95) | 13.8 (9.8-19.5) | 0.05 (0.04-0.08) | 248 (109-566) | |
Endoscopy | 10 (77) | 0.95 (0.92-0.96) | 0.92 (0.89-0.94) | 0.85 (0.70-0.93) | 6.0 (2.8-12.7) | 0.09 (0.06-0.14) | 64 (24-169) |
aAUC: area under the curve.
bPLR: positive likelihood ratio.
cNLR: negative likelihood ratio.
dDOR: diagnostic odds ratio.
eCAD: computer-aided diagnosis.
fSVM: support vector machine.
Five studies [
, , , , ] compared the performance of CAD models and endoscopists. Among these, four studies [ , , , ] have presented comparative performance between CAD models and endoscopists according to the expertise of the endoscopists (expert endoscopists vs CAD models or novice endoscopists vs CAD models). The pooled AUC, sensitivity, specificity, and DOR of CAD models for the diagnosis of DCPs were 0.96 (95% CI 0.94-0.98), 0.96 (95% CI 0.95-0.97), 0.91 (95% CI 0.84-0.95), and 231 (95% CI 113-473), respectively. For the expert endoscopists, the pooled AUC, sensitivity, specificity, and DOR were 0.97 (95% CI 0.95-0.98), 0.93 (95% CI 0.89-0.96), 0.89 (95% CI 0.80-0.95), and 116 (95% CI 80-168), respectively. For the novice endoscopists, the pooled AUC, sensitivity, specificity, and DOR were 0.85 (95% CI 0.82-0.88), 0.78 (95% CI 0.68-0.86), 0.78 (95% CI 0.68-0.86), and 13 (95% CI 6-30), respectively. The forest plot of AUCs is illustrated in , and no significant difference was found between CAD models and expert endoscopists; however, novice endoscopists showed lower pooled AUC for the histologic diagnosis of DCPs than those for CAD models or expert endoscopists.With regard to the NPV of CAD models for the diagnosis of adenomatous polyps in the rectosigmoid colon, the pretest prevalence of adenomatous polyp in the rectosigmoid colon was 13.2% (95% CI 10.2%-16.5%) in a recent meta-analysis [
]. For the assumption of this prevalence, the NPV of CAD models was 0.99 (95% CI 0.87-0.99; ). In this meta-analysis, the prevalence of adenomatous polyp in the rectosigmoid colon was 30% (95% CI 27%-32%) based on 29.53% (352/1192) of polyps in the rectosigmoid colon. For the assumption of this prevalence, the NPV of CAD models was 0.97 (95% CI 0.87-0.99; ). If we adopt a simple follow-up equation for the NPV of CAD models for the diagnosis of adenomatous polyps in the rectosigmoid colon using pooled sensitivity and specificity, the NPV of CAD models was 0.96 (95% CI 0.95-0.97). The follow-up equation is as follows:NPV = (specificity × [1−prevalence])/(specificity × [1−prevalence] + prevalence × [1−sensitivity])
Assessment of Heterogeneity With Meta-Regression and Subgroup Analysis
First, the authors observed a positive correlation coefficient between the logit-transformed sensitivity and specificity (r=0.22) and an asymmetric β parameter, with a significant P value (P=.004), implying that heterogeneity exists among the studies. Second, a coupled forest plot of sensitivity and specificity was obtained (
). Compared with the enrolled studies, the study by Shahidi et al [ ] showed lower specificity. This study was found to have an unclear risk of bias in methodology quality assessment. Therefore, subgroup analysis was carried out according to the methodological quality, and a negative correlation coefficient was found between logit-transformed sensitivity and specificity (r=−0.13) and an asymmetric β parameter, with a nonsignificant P value (P=.63) in high-quality studies, indicating an absence of heterogeneity among the studies. Third, the shape of the SROC curve for CAD of DCPs using endoscopic images was symmetric ( ). Fourth, meta-regression using modifiers identified in the systematic review was conducted, and no source of heterogeneity could be identified (published year, P=.34; nationality of the data sets, P=.29; type of CAD models, P=.38; type of endoscopic image, P=.23; location of the DCPs, P=.90; type of test data sets, P=.66; total number of images, P=.66; and methodological quality, P=.10; ). Finally, a subgroup analysis based on the potential modifiers was performed, and the pooled AUC of studies with high methodological quality was higher than that of studies with lower methodological quality. Except for this variable (methodological quality), no significant changes in diagnostic performance were found according to the modifiers ( ).Evaluation of Publication Bias
Deek funnel plot of studies for the CAD of DCPs exhibited a symmetrical shape with respect to the regression line (
), and the asymmetry test showed no evidence of publication bias (P=.65).Discussion
Principal Findings
This study presented evidence that CAD models showed high performance values for the histologic diagnosis of DCPs and practical values in the Fagan nomogram, indicating the potential to use these models in clinical practice. This performance was comparable with that of expert endoscopists and higher than that of novice endoscopists. Although the main analysis found heterogeneity among the included studies, the subgroup analysis demonstrated that methodological quality was the reason for the heterogeneity. Thorough meta-regression or subgroup analyses did not reveal any additional reasons for heterogeneity.
Most polyps detected during colonoscopy are diminutive, and considering the low potential of malignancy, resecting all DCPs is not cost-effective [
, - ]. However, many DCPs are still being resected and sent for histologic evaluation to determine the surveillance interval for CRC screening [ ]. CAD models without pathologic diagnosis could lead to cost savings by changing surveillance interval recommendations, and the DTA meta-analysis in our study revealed that the NPV of CAD models for the diagnosis of adenomatous polyps in the rectosigmoid colon was over 90% to adopt the diagnose and leave strategy based on the optical biopsy satisfying PIVI performance thresholds for in situ endoscopic histology prediction of DCPs.Despite the technical challenges of CAD models analyzing a smaller surface area, previous meta-analyses [
- ] have demonstrated that CAD models can increase the adenoma or polyp detection rate, especially for those with small size. With regard to the histologic prediction of DCPs, a previous meta-analysis revealed that endoscopists with only high confidence showed an NPV of approximately 90% using CAD models of digital chromoendoscopy, implicating the potential for adopting the diagnosis and leave strategy [ ]. Another meta-analysis showed an NPV of 0.95 (95% CI 0.88-0.98) for the CAD of DCPs in nonmagnifying narrow-band imaging [ ]. However, the location of the DCPs was not considered, and many studies were omitted from the search process.An additional finding of this DTA meta-analysis is the robustness of the diagnostic performance of CAD models. The performance values were consistent regardless of the modifiers, except for the methodological quality (
). This was consistent regardless of the nationality of the patients, location of DCPs, total number of included polyps, and type of CAD models or endoscopic images. However, the diagnostic performance of studies with high methodological quality showed higher AUCs than that of studies with low methodological quality, and no evidence of heterogeneity was detected among studies in the subgroup with high methodological quality. Although pooled AUCs in the subgroup of external test datasets could not be measured because only two studies were included in this subgroup, the remaining performance values were comparable with the subgroup of internal test data sets.Limitations
Despite the robust evidence in the DTA meta-analysis stated earlier, several inevitable limitations were identified. First, only two or three studies were included in the subgroup analyses of external test data sets, rectosigmoid DCPs, and endocytoscopic images. Bivariate and HSROC methods are advanced statistical techniques that have overcome the limitations of the Moses-Shapiro-Littenberg method (which does not consider any heterogeneity between studies) [
, ]. However, the Moses-Shapiro-Littenberg method is only possible for a subgroup with fewer than four studies. With accumulating evidence on this topic, this statistical pitfall could be overcome. Second, the number of studies was insufficient to enable a comparison of the relative diagnostic performance of endoscopists and CAD models. Considering the real clinical adaptation of endoscopists with CAD models rather than endoscopists versus CAD models, it is no longer necessary to compare the diagnostic capabilities of doctors and CAD models [ ]. Owing to the unique characteristics of patients in each institution, CAD models developed from a single institution usually have limitations for widespread implementation, indicating the importance of the external test. However, only two studies conducted external tests to verify CAD model performance. Additional studies focusing on external validation-oriented performance or suggesting a clinical application benefit for future perspectives in established CAD models are expected.In conclusion, CAD models showed potential for the optical histological diagnosis of DCPs using endoscopic images.
Acknowledgments
This work was supported by the Technology Development Program (S2931703) and funded by the Ministry of SMEs and Startups (Korea).
Authors' Contributions
CSB was responsible for conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, supervision. CSB was also responsible for writing the original draft and reviewing and editing the final draft. JJL was responsible for data curation, formal analysis, investigation, and resources. GHB was responsible for data curation, formal analysis, investigation, and resources.
Conflicts of Interest
None declared.
References
- Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021 Feb 04:209-249 [FREE Full text] [CrossRef] [Medline]
- Zauber AG, Winawer SJ, O'Brien MJ, Lansdorp-Vogelaar I, van Ballegooijen M, Hankey BF, et al. Colonoscopic polypectomy and long-term prevention of colorectal-cancer deaths. N Engl J Med 2012 Feb 23;366(8):687-696. [CrossRef]
- Rex DK, Boland CR, Dominitz JA, Giardiello FM, Johnson DA, Kaltenbach T, et al. Colorectal cancer screening: recommendations for physicians and patients from the U.S. multi-society task force on colorectal cancer. Am J Gastroenterol 2017 Jul;112(7):1016-1030. [CrossRef] [Medline]
- Kandel P, Wallace MB. Should we resect and discard low risk diminutive colon polyps. Clin Endosc 2019 May;52(3):239-246 [FREE Full text] [CrossRef] [Medline]
- Lieberman D, Moravec M, Holub J, Michaels L, Eisen G. Polyp size and advanced histology in patients undergoing colonoscopy screening: implications for CT colonography. Gastroenterology 2008 Oct;135(4):1100-1105 [FREE Full text] [CrossRef] [Medline]
- Qumseya BJ, Coe S, Wallace MB. The effect of polyp location and patient gender on the presence of dysplasia in colonic polyps. Clin Transl Gastroenterol 2012 Jul 26;3:e20 [FREE Full text] [CrossRef] [Medline]
- Butterly LF, Chase MP, Pohl H, Fiarman GS. Prevalence of clinically important histology in small adenomas. Clin Gastroenterol Hepatol 2006 Mar;4(3):343-348. [CrossRef] [Medline]
- Cho B, Bang CS. Artificial intelligence for the determination of a management strategy for diminutive colorectal polyps: hype, hope, or help. Am J Gastroenterol 2020 Jan;115(1):70-72. [CrossRef] [Medline]
- Gupta N, Bansal A, Rao D, Early DS, Jonnalagadda S, Wani SB, et al. Prevalence of advanced histological features in diminutive and small colon polyps. Gastrointest Endosc 2012 May;75(5):1022-1030. [CrossRef] [Medline]
- Kaltenbach T, Anderson JC, Burke CA, Dominitz JA, Gupta S, Lieberman D, et al. Endoscopic removal of colorectal lesions-recommendations by the US multi-society task force on colorectal cancer. Gastroenterology 2020 Mar;158(4):1095-1129. [CrossRef] [Medline]
- Tao Pu L, Maicas G, Tian Y, Yamamura T, Nakamura M, Suzuki H, et al. Computer-aided diagnosis for characterization of colorectal lesions: comprehensive software that includes differentiation of serrated lesions. Gastrointest Endosc 2020 Oct;92(4):891-899. [CrossRef] [Medline]
- Rex DK, Kahi C, O'Brien M, Levin T, Pohl H, Rastogi A, et al. The American Society for Gastrointestinal Endoscopy PIVI (Preservation and Incorporation of Valuable Endoscopic Innovations) on real-time endoscopic assessment of the histology of diminutive colorectal polyps. Gastrointest Endosc 2011 Mar;73(3):419-422. [CrossRef] [Medline]
- Bang CS. [Deep learning in upper gastrointestinal disorders: status and future perspectives]. Korean J Gastroenterol 2020 Mar 25;75(3):120-131 [FREE Full text] [CrossRef] [Medline]
- Yang YJ, Bang CS. Application of artificial intelligence in gastroenterology. World J Gastroenterol 2019 Apr 14;25(14):1666-1683 [FREE Full text] [CrossRef] [Medline]
- McInnes MD, Moher D, Thombs BD, McGrath TA, Bossuyt PM, The PRISMA-DTA Group, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: The PRISMA-DTA Statement. J Am Med Assoc 2018 Jan 23;319(4):388-396. [CrossRef] [Medline]
- Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011 Oct 18;155(8):529-536. [CrossRef] [Medline]
- Reitsma JB, Glas AS, Rutjes AW, Scholten RJ, Bossuyt PM, Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 2005 Oct;58(10):982-990. [CrossRef] [Medline]
- Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat Med 2001 Oct 15;20(19):2865-2884. [CrossRef] [Medline]
- Harbord RM, Whiting P. Metandi: Meta-analysis of diagnostic accuracy using hierarchical logistic regression. Stata J 2009 Aug 01;9(2):211-229. [CrossRef]
- Littenberg B, Moses LE. Estimating diagnostic accuracy from multiple conflicting reports: a new meta-analytic method. Med Decis Making 1993;13(4):313-321. [CrossRef] [Medline]
- Rodriguez-Diaz E, Baffy G, Lo W, Mashimo H, Vidyarthi G, Mohapatra SS, et al. Real-time artificial intelligence-based histologic classification of colorectal polyps with augmented visualization. Gastrointest Endosc 2021 Mar;93(3):662-670. [CrossRef] [Medline]
- Zachariah R, Samarasena J, Luba D, Duh E, Dao T, Requa J, et al. Prediction of polyp pathology using convolutional neural networks achieves "Resect and Discard" thresholds. Am J Gastroenterol 2020 Jan;115(1):138-144 [FREE Full text] [CrossRef] [Medline]
- Shahidi N, Rex DK, Kaltenbach T, Rastogi A, Ghalehjegh SH, Byrne MF. Use of endoscopic impression, artificial intelligence, and pathologist interpretation to resolve discrepancies between endoscopy and pathology analyses of diminutive colorectal polyps. Gastroenterology 2020 Feb;158(3):783-785. [CrossRef] [Medline]
- Jin EH, Lee D, Bae JH, Kang HY, Kwak M, Seo JY, et al. Improved accuracy in optical diagnosis of colorectal polyps using convolutional neural networks with visual explanations. Gastroenterology 2020 Jun;158(8):2169-2179. [CrossRef] [Medline]
- Byrne MF, Chapados N, Soudan F, Oertel C, Pérez ML, Kelly R, et al. Real-time differentiation of adenomatous and hyperplastic diminutive colorectal polyps during analysis of unaltered videos of standard colonoscopy using a deep learning model. Gut 2019 Jan 24;68(1):94-100 [FREE Full text] [CrossRef] [Medline]
- Horiuchi H, Tamai N, Kamba S, Inomata H, Ohya TR, Sumiyama K. Real-time computer-aided diagnosis of diminutive rectosigmoid polyps using an auto-fluorescence imaging system and novel color intensity analysis software. Scand J Gastroenterol 2019 Jun;54(6):800-805. [CrossRef] [Medline]
- Kudo S, Misawa M, Mori Y, Hotta K, Ohtsuka K, Ikematsu H, et al. Artificial intelligence-assisted system improves endoscopic identification of colorectal neoplasms. Clin Gastroenterol Hepatol 2020 Jul;18(8):1874-1881. [CrossRef] [Medline]
- Sánchez-Montes C, Sánchez FJ, Bernal J, Córdova H, López-Cerón M, Cuatrecasas M, et al. Computer-aided prediction of polyp histology on white light colonoscopy using surface pattern analysis. Endoscopy 2019 Mar;51(3):261-265. [CrossRef] [Medline]
- Mori Y, Kudo S, Misawa M, Saito Y, Ikematsu H, Hotta K, et al. Real-time use of artificial intelligence in identification of diminutive polyps during colonoscopy: a prospective study. Ann Intern Med 2018 Sep 18;169(6):357-366. [CrossRef] [Medline]
- Chen P, Lin M, Lai M, Lin J, Lu HH, Tseng VS. Accurate classification of diminutive colorectal polyps using computer-aided analysis. Gastroenterology 2018 Feb;154(3):568-575. [CrossRef] [Medline]
- Mori Y, Kudo S, Chiu P, Singh R, Misawa M, Wakamura K, et al. Impact of an automated system for endocytoscopic diagnosis of small colorectal lesions: an international web-based study. Endoscopy 2016 Dec 5;48(12):1110-1118. [CrossRef] [Medline]
- Kominami Y, Yoshida S, Tanaka S, Sanomura Y, Hirakawa T, Raytchev B, et al. Computer-aided diagnosis of colorectal polyp histology by using a real-time image recognition system and narrow-band imaging magnifying colonoscopy. Gastrointest Endosc 2016 Mar;83(3):643-649. [CrossRef] [Medline]
- Gross S, Trautwein C, Behrens A, Winograd R, Palm S, Lutz HH, et al. Computer-based classification of small colorectal polyps by using narrow-band imaging with optical magnification. Gastrointest Endosc 2011 Dec;74(6):1354-1359. [CrossRef] [Medline]
- Abadir AP, Ali MF, Karnes W, Samarasena JB. Artificial intelligence in gastrointestinal endoscopy. Clin Endosc 2020 Mar;53(2):132-141 [FREE Full text] [CrossRef] [Medline]
- Wong MC, Huang J, Huang JL, Pang TW, Choi P, Wang J, et al. Global prevalence of colorectal neoplasia: a systematic review and meta-analysis. Clin Gastroenterol Hepatol 2020 Mar;18(3):553-561 [FREE Full text] [CrossRef] [Medline]
- Barua I, Vinsard DG, Jodal HC, Løberg M, Kalager M, Holme Ø, et al. Artificial intelligence for polyp detection during colonoscopy: a systematic review and meta-analysis. Endoscopy 2021 Mar 17;53(3):277-284. [CrossRef] [Medline]
- Hassan C, Spadaccini M, Iannone A, Maselli R, Jovani M, Chandrasekar VT, et al. Performance of artificial intelligence in colonoscopy for adenoma and polyp detection: a systematic review and meta-analysis. Gastrointest Endosc 2021 Jan;93(1):77-85. [CrossRef] [Medline]
- Zhang Y, Zhang X, Wu Q, Gu C, Wang Z. Artificial intelligence-aided colonoscopy for polyp detection: a systematic review and meta-analysis of randomized clinical trials. J Laparoendosc Adv Surg Tech A 2021 Feb 01:0777. [CrossRef] [Medline]
- Li J, Lu J, Yan J, Tan Y, Liu D. Artificial intelligence can increase the detection rate of colorectal polyps and adenomas: a systematic review and meta-analysis. Eur J Gastroenterol Hepatol 2021 Aug 01;33(8):1041-1048. [CrossRef] [Medline]
- Aziz M, Fatima R, Dong C, Lee-Smith W, Nawras A. The impact of deep convolutional neural network-based artificial intelligence on colonoscopy outcomes: a systematic review with meta-analysis. J Gastroenterol Hepatol 2020 Oct;35(10):1676-1683. [CrossRef] [Medline]
- Mason SE, Poynter L, Takats Z, Darzi A, Kinross JM. Optical technologies for endoscopic real-time histologic assessment of colorectal polyps: a meta-analysis. Am J Gastroenterol 2019 Aug;114(8):1219-1230. [CrossRef] [Medline]
- Lui TK, Guo C, Leung WK. Accuracy of artificial intelligence on histology prediction and detection of colorectal polyps: a systematic review and meta-analysis. Gastrointest Endosc 2020 Jul;92(1):11-22. [CrossRef] [Medline]
- Bang CS, Lee JJ, Baik GH. Computer-aided diagnosis of esophageal cancer and neoplasms in endoscopic images: a systematic review and meta-analysis of diagnostic test accuracy. Gastrointest Endosc 2021 May;93(5):1006-1015 [FREE Full text] [CrossRef] [Medline]
- Bang CS, Lee JJ, Baik GH. Artificial intelligence for the prediction of helicobacter pylori infection in endoscopic images: systematic review and meta-analysis of diagnostic test accuracy. J Med Internet Res 2020 Sep 16;22(9):e21983 [FREE Full text] [CrossRef] [Medline]
Abbreviations
AUC: area under the curve |
CAD: computer-aided diagnosis |
CRC: colorectal cancer |
DCP: diminutive colorectal polyp |
DOR: diagnostic odds ratio |
DTA: diagnostic test accuracy |
FN: false-negative |
FP: false-positive |
HSROC: hierarchical summary receiver operating characteristic |
NPV: negative predictive value |
PIVI: Preservation and Incorporation of Valuable Endoscopic Innovations |
PRISMA: Preferred Reporting Items for Systematic Review and Meta-Analyses |
SROC: summary receiver operating characteristic |
TN: true-negative |
TP: true-positive |
Edited by G Eysenbach; submitted 16.04.21; peer-reviewed by JA Benítez-Andrades, MT García-Ordás; comments to author 02.07.21; revised version received 04.07.21; accepted 27.07.21; published 25.08.21
Copyright©Chang Seok Bang, Jae Jun Lee, Gwang Ho Baik. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 25.08.2021.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.