Review
Abstract
Background: Neuroimaging segmentation is increasingly important for diagnosing and planning treatments for neurological diseases. Manual segmentation is time-consuming and prone to human error and variability. Transformers are a promising deep learning approach for automated medical image segmentation.
Objective: This scoping review synthesizes the current literature and assesses the use of various transformer models for neuroimaging segmentation.
Methods: A systematic search of major databases, including Scopus, IEEE Xplore, PubMed, and ACM Digital Library, was carried out for studies applying transformers to neuroimaging segmentation problems from 2019 through 2023. The inclusion criteria allowed only peer-reviewed journal and conference papers focused on transformer-based segmentation of human brain imaging data. Studies dealing with nonneuroimaging data, raw brain signals, or electroencephalogram data were excluded. Data extraction identified key study details, including image modalities, datasets, neurological conditions, transformer models, and evaluation metrics. Results were synthesized using a narrative approach.
Results: Of the 1246 publications identified, 67 (5.38%) met the inclusion criteria. Just over half of the included studies were published in 2022, and more than two-thirds used transformers for segmenting brain tumors. The most common imaging modality was magnetic resonance imaging (n=59, 88.06%), and the most frequently used dataset was the brain tumor segmentation dataset (n=39, 58.21%). 3D transformer models (n=42, 62.69%) were more prevalent than their 2D counterparts. Hybrid convolutional neural network-transformer architectures were the most commonly developed (n=57, 85.07%), with the vision transformer being the most frequently used transformer type (n=37, 55.22%). The most frequent evaluation metric was the Dice score (n=63, 94.03%). Studies generally reported increased segmentation accuracy and the ability to model both local and global features in brain images.
Conclusions: This review documents the recent increase in the adoption of transformers for neuroimaging segmentation, particularly for brain tumor detection. Currently, hybrid convolutional neural network-transformer architectures achieve state-of-the-art performance on benchmark datasets, outperforming standalone models. Nevertheless, their applicability remains limited by high computational costs and potential overfitting on small datasets. The field's heavy reliance on the brain tumor segmentation dataset highlights the need for more diverse datasets to validate model performance across a variety of neurological diseases. Further research is needed to define the optimal transformer architectures and training methods for clinical applications. Continued development may make transformers the state of the art for fast, accurate, and reliable brain magnetic resonance imaging segmentation, which could lead to improved clinical tools for diagnosing and evaluating neurological disorders.
doi:10.2196/57723
Introduction
Neuroimaging refers to the visualization of the structure and function of the brain. It is one of the most important tools for understanding different neurological disorders. Generally, neuroimages can be obtained using 3 principal imaging modalities, each of which shows the complexities of the brain from a different perspective. Of the 3, magnetic resonance imaging (MRI) is the most frequently used due to its high soft tissue contrast, high spatial resolution, and lack of radiation exposure [
- ]. Viewing different brain regions requires multiple MRI sequences, such as T1, T1ce, T2, and fluid-attenuated inversion recovery, as presented in [ ]. The second neuroimaging modality is computed tomography (CT), which can produce high-resolution images; however, it offers limited soft tissue characterization, and its radiation risk makes it unsuitable for repeated use [ , ]. The third neuroimaging modality is positron emission tomography (PET), which uses nuclear medicine techniques to visualize metabolic activity [ ]. PET has high sensitivity, making it effective for detecting metastases, finding abnormalities, and imaging deep structures; however, it has limited resolution, and repeated use carries radiation risk [ , ]. Finding changes in brain tissue through neuroimaging analysis is critical for detecting and monitoring neurological disorders [ ] and brain tumors [ ]. Segmentation is the process of outlining regions of interest in medical images [ ], which enables the quantitative assessment of atrophy, growths, and anatomical differences that characterize conditions like Alzheimer disease, schizophrenia, and brain tumors, among others [ ]. Because of this, segmentation is applied broadly in medical applications such as diagnosis, tissue classification, radiotherapy treatment, and surgical planning [ , ].
Segmentation techniques can be classified into 3 categories: manual, semiautomated, and fully automated. Manual segmentation is the standard for segmentation because it is believed to be the most accurate [
]. The technique, however, is laborious, time-consuming, and subjective: because it depends on human judgment, different interpretations may produce varying results. Consequently, there has been a great deal of research into automated segmentation techniques that replicate the results of manual segmentation with greater efficiency and consistency [ ]. Two early paradigms were used for this: intensity-based approaches, including thresholding, edge detection, and region-based methods [ ], and traditional machine learning approaches, including support vector machines, k-nearest neighbor clustering, and random forests [ , ]. Each of these methods has been applied in various ways, but their applicability and performance in image segmentation remain limited [ , ]. Since then, deep learning (DL) methods have transformed medical imaging applications and become a strong alternative to classical techniques. DL is a subclass of machine learning that involves artificial neural networks with multiple layers. These networks are designed to progressively learn hierarchical representations and features of data, which both eliminates the need for manual feature engineering [
] and enables the extraction of complicated patterns from large datasets [ ]. Different DL architectures have been used for medical image segmentation, the most widely used being convolutional neural networks (CNNs), which have achieved state-of-the-art performance in various medical imaging tasks, including segmentation [ , ]. U-Net [ ] is another notable model that was specially designed for biomedical image segmentation and has produced very good results in its field [ - ]. Other notable models include SegNet [ ], ResNet [ ], DenseNet [ ], 3D-ConvNet [ ], and DeepLab [ ]. These models have served as a solid foundation for the imaging field and have resulted in a plethora of variants, each developed for specific imaging modalities, anatomical structures, and segmentation tasks. Transformers [ ] are a type of neural network architecture that relies mainly on self-attention mechanisms. They were first proposed in 2017 and have since yielded state-of-the-art results in natural language processing [ ]. More recently, transformers have also shown success in a wide array of computer vision tasks, one of which is segmentation [ ]. Although CNNs have achieved impressive performance in image-related tasks, they may not capture global and long-range dependencies well due to their small kernel sizes [ , ]. Transformers have recently gained popularity in imaging because their self-attention mechanism can model these long-range dependencies, which is especially useful in brain segmentation [
]. The great success of transformers motivated the construction of vision transformers (ViTs) [ ], which forego convolutional layers and rely instead on a multihead self-attention mechanism [ , ]. This architecture divides an image into fixed-size patches, linearly embeds them, and processes them through a transformer network, thereby allowing it to model long-range dependencies with reduced inductive bias [ , ]. Recently, ViT architectures specifically designed for medical image segmentation have been explored, resulting in models like TransUnet [ ] and Swin-UNet [ ] for general-purpose use and models like TransBTS [ ] and Swin-UNETR [ ] tailored for brain tumor segmentation [ , ]. Transformer use in neuroimaging deserves special study because the structures of the brain are complicated. Transformer-based neural networks can model the long-range dependencies and spatial relationships in brain images [ ], which is very important in brain segmentation. Although transformers have shown very promising results in many medical imaging tasks, their use in neuroimaging segmentation remains an evolving field that has not been systematically reviewed. Existing literature reviews have either examined the use of transformers for general medical image segmentation without focusing specifically on brain segmentation [
, , ] or have reviewed brain segmentation techniques using various DL methods without emphasizing the role of transformers [ , ]. Another difference between this review and others is its focus on applying transformers to neuroimage segmentation, a central task in neurological disorder diagnosis and treatment. For example, compared with more general surveys such as those by Shamshad et al [ ] and Xiao et al [ ], which address a wide range of tasks or organ systems, our work specifically focuses on the unique challenges and developments within brain image segmentation. Thus, this scoping review seeks to fill the gap by focusing solely on transformer applications in neuroimage segmentation, an area of paramount importance for the diagnosis and treatment of neurological disorders. A scoping review on this topic is appropriate because the application of transformers in this area is relatively new and fast-developing; hence, it allows for comprehensively mapping the current research landscape and identifying knowledge gaps. The main purpose of this scoping review is to synthesize and critically evaluate the existing literature on the use of different transformer models for neuroimaging segmentation. This review aims to summarize the types of transformer models applied, their performance, their applications across various neurological conditions and imaging modalities, and the limitations and gaps in the current literature.
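To make the ViT mechanism described above concrete, the following is a minimal numpy sketch of the patch-embedding and single-head self-attention steps. All shapes, names, and the random projections are illustrative assumptions for exposition, not taken from any reviewed model.

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened non-overlapping patches."""
    h, w, c = image.shape
    p = patch_size
    patches = image.reshape(h // p, p, w // p, p, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, p * p * c)
    return patches  # shape: (num_patches, patch_dim)

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a patch sequence."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])           # pairwise patch affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over all patches
    return weights @ v                                # each patch attends to every other

rng = np.random.default_rng(0)
image = rng.standard_normal((224, 224, 1))            # e.g., one single-channel MRI slice
tokens = patchify(image)                              # 196 patches of dimension 256
d = tokens.shape[1]
out = self_attention(tokens, *(rng.standard_normal((d, d)) for _ in range(3)))
print(tokens.shape, out.shape)                        # (196, 256) (196, 256)
```

Because every patch attends to every other patch, the output at one location can depend on distant image regions, which is the long-range-dependency property the reviewed studies exploit.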
Methods
Study Design
The approach of this scoping review follows the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines (
) [ ]. Our primary research question was “What are the current applications, performance, and limitations of transformer models in neuroimaging segmentation?” The goal was to extract key themes from recent literature on transformer use in neuroimaging segmentation that will guide future research and clinical applications. Only literature from 2019 onward was considered, since the rapid evolution of transformer models for medical imaging is a key recent development in this field.
We defined transformer models as DL architectures relying on self-attention mechanisms, capable of processing sequential data and capturing long-range dependencies. From the neuroimaging perspective, we considered those studies where these models were applied to different modalities of brain imaging, focusing on MRI due to its prevalence in neurological diagnostics.
Our review process followed a systematic search in 4 major databases: Scopus, IEEE Xplore, PubMed, and ACM Digital Library. We present a comprehensive review of methodologies, results, strengths, and limitations of the included studies to derive useful insights that bridge technical developments with their implications in the clinical domain of neuroimage segmentation.
Search Strategy
Studies were retrieved on May 22, 2023, through searching the following databases: IEEE Xplore, ACM Digital Library, Scopus, and PubMed. The search was limited to 5 years, from 2019 to 2023, to prioritize recent research and consisted of search queries related to transformers, such as “transformer,” “deep learning,” and “self-attention”; queries related to neuroimaging, such as “brain,” “neuroimaging,” “MRI,” “CT,” and “PET”; and queries related to the medical field, such as “health care,” “medical,” and “health.”
Study Eligibility Criteria
This review only included papers whose primary purpose was the use of transformers for the segmentation of neuroimages. Our search included journal papers, conference papers, and dissertations that focused on applying transformer models to imaging scans (eg, MRI and CT) of the human brain. We excluded all studies that (1) used transformers for the segmentation of nonneuroimaging data, raw brain signals, or electroencephalogram data; (2) were not in English or were review papers, conference abstracts, preprints, or protocols; (3) focused on neuroimaging tasks other than segmentation (eg, classification and prediction); and (4) were published before 2019.
Study Selection
For this review, we used a 3-step study selection process. First, we used EndNote (Clarivate) to remove duplicate studies returned by our initial search. Next, 3 independent reviewers (MI, AA, and MA) screened the titles and abstracts of the remaining papers to exclude irrelevant studies. We then obtained full texts of the studies that passed the initial screening, and the same 3 reviewers (MI, AA, and MA) examined them against our predefined inclusion criteria. Any disagreements between reviewers during the screening processes were resolved through in-person discussions until a consensus was reached.
Data Collection
Data extraction for this review was done in Microsoft Excel by 2 independent groups of 2 reviewers each (MI and AA; MA and OE) to share the extraction workload and resolve conflicts between the groups. Disagreements in data extraction were resolved through consensus during face-to-face discussions. The extracted data fall into 3 broad categories: study characteristics, neuroimaging acquisition, and transformer features.
Synthesis of the included data was done using a narrative approach. Descriptive text, tables, and figures summarize the characteristics of the data, and Microsoft Excel was used to manage and synthesize it. First, we describe the characteristics of each included study: publication year, type of publication, and country of origin. Then, we describe the neuroimaging acquisition of these studies, including the imaging modality, dataset, dataset accessibility, and neurological condition. Finally, we describe the transformer architecture of the included studies: the number of parameters, transformer type, hybrid component, and the training and evaluation methodology used, along with the loss function, optimizer, and metrics.
Ethical Considerations
This scoping review synthesized and analyzed publicly available research studies. No direct human participant research was conducted; therefore, approval from an institutional review board or research ethics committee was unnecessary. There was no collection, use, or dissemination of personal data from human participants. The data extracted and analyzed in this review were sourced from published studies that had already undergone ethical review as part of their original publication. The review followed ethical research practices: the included studies were represented faithfully, and the methodology for study selection and data extraction was transparent. No individual participant data were accessed or reported; hence, privacy and confidentiality were ensured. Since this study did not involve direct contact with human participants, issues regarding informed consent, privacy protection, and participant compensation were not relevant. No images or supplementary material showing identifiable information of any individual were used in this review.
Results
Overview
A total of 1246 publications were retrieved from the initial search of the selected databases. In the first round of screening, 261 duplicates were identified and removed using EndNote X9, leaving 985 publications. In the second round, 761 publications were excluded through analysis of their titles and abstracts against our predefined inclusion and exclusion criteria. The remaining 224 publications continued to the third round of screening, which included a detailed full-text read-through and resulted in the exclusion of 157 publications. Of the 1246 initial publications, 67 studies met our criteria and were thereby included in this review.
depicts the full screening process in more detail.
Characteristics of the Included Studies
depicts the characteristics and metadata of each included study, including the publication year, country, and type. The included studies ranged from 2019 to 2023. Over half were published in 2022, followed by 32.84% (n=22) in 2023. The studies comprised peer-reviewed journal papers (n=48, 71.64%) and conference papers (n=19, 28.36%). They spanned a total of 13 countries, with China being by far the largest contributor, representing 68.66% (n=46) of the total. Following China were the United States (n=5, 7.46%), the United Kingdom (n=4, 5.97%), and India (n=3, 4.48%), with other countries contributing 1 paper apiece.
Features | Studies, n (%) | References | |
Year of publication | |||
2023 | 22 (32.84) | [ | - ]|
2022 | 34 (50.75) | [ | - ]|
2021 | 10 (14.93) | [ | , , - ]|
2019 | 1 (1.49) | [ | ]|
Type of publication | |||
Journal paper | 48 (71.64) | [ | , - , - , , , , , , , ]|
Conference paper | 19 (28.36) | [ | , , - , , , , , - , - , - ]|
Country of publication | |||
China | 46 (68.66) | [ | , - , , - , , - , , , - , - , , , - , , ]|
United States | 5 (7.46) | [ | , , , , ]|
United Kingdom | 4 (5.97) | [ | , , , ]|
India | 3 (4.48) | [ | , , ]|
Other | 9 (13.43) | [ | , , , , , , , , ]
Neuroimaging Acquisition and Neurological Condition
depicts the different imaging modalities, datasets, and neurological conditions across the included studies. The studies spanned 6 different imaging modalities, with MRI being by far the most common at 88.06% (n=59), followed by CT at 10.45% (n=7), and the remaining modalities with 1 study each. Over half of the included studies used only 1 dataset (n=40, 59.70%) for training and evaluation, followed by 23.88% (n=16) using 2 datasets. Of the 44 unique datasets used across the included studies, 70.45% (n=31) are public or open-source datasets, and 29.55% (n=13) are private datasets obtained directly from medical institutions. Among the public datasets, the brain tumor segmentation dataset (BraTS) is by far the most widely used, with 58.21% (n=39) of all studies using one of its variants (including BraTS 2015, 2017, 2018, 2019, 2020, and 2021), followed by the Medical Segmentation Decathlon and low-grade glioma-Kaggle datasets with 5.97% (n=4) each. The main neurological condition addressed by the included studies was brain tumor segmentation, with 71.64% (n=48) of studies conducting research specifically in this area.
Features | Studies, n (%) | References | |
Imaging modality | |||
MRIa | 59 (88.06) | [ | , , - , - , - , , , , , , - ]|
CTb | 7 (10.45) | [ | , , , , , , ]|
PETc | 1 (1.49) | [ | ]|
Interventional ultrasound | 1 (1.49) | [ | ]|
Electron microscopy | 1 (1.49) | [ | ]|
Digital subtraction angiography | 1 (1.49) | [ | ]|
Number of datasets used | |||
1 | 40 (59.7) | [ | , , , , - , , , - , , , , , , , - , - , - ]|
2 | 16 (23.88) | [ | , , , , , , , , , , , , , , , ]|
3+ | 10 (14.93) | [ | , , , , , , , , , ]|
Not mentioned | 1 (1.49) | [ | ]|
Dataset accessibility | |||
Public | 31 (70.45) | —d | |
Private | 13 (29.55) | — | |
Dataset | |||
Public: BraTSe | 39 (58.21) | [ | , , , - , - , , , , , , - , - , , , , , , , , - , , , , , - ]|
MSDf | 4 (5.97) | [ | , , , ]|
iseg-2017 | 3 (4.48) | [ | , , ]|
ADNIg | 2 (2.99) | [ | , ]|
MRBrainSh | 2 (2.99) | [ | , ]|
LGGi-Kaggle | 4 (5.97) | [ | , , , ]|
ISLESj | 3 (4.48) | [ | , , ]|
WMHk | 3 (4.48) | [ | , , ]|
Other | 12 (17.91) | [ | , , , , , , , , , , , ]|
Private | 9 (13.43) | [ | , , , , , , , , ]|
Not stated | 1 (1.49) | [ | ]|
Neurological condition | |||
Brain tumor | 48 (71.64) | [ | , , - , - , - , , , - , , , , - , , , , , , - ]|
Ischemic stroke | 4 (5.97) | [ | , , , ]|
Alzheimer disease | 3 (4.48) | [ | , , ]|
Parkinson disease | 2 (2.99) | [ | , ]|
Intracerebral hemorrhage | 3 (4.48) | [ | , , ]|
Intracranial aneurysms | 1 (1.49) | [ | ]|
Autism | 1 (1.49) | [ | ]|
Brain lesions | 1 (1.49) | [ | ]|
Healthy brain | 6 (8.96) | [ | , , , , , ]
aMRI: magnetic resonance imaging.
bCT: computed tomography.
cPET: positron emission tomography.
dNot available.
eBraTS: brain tumor segmentation dataset.
fMSD: Medical Segmentation Decathlon.
gADNI: Alzheimer’s Disease Neuroimaging Initiative.
hMRBrainS: magnetic resonance brain image segmentation.
iLGG: low-grade glioma.
jISLES: ischemic stroke lesion segmentation.
kWMH: white matter hyperintensities.
Transformer-Based Techniques Types, Training Parameters, and Evaluation
The proposed neuroimage segmentation techniques used various artificial intelligence (AI) methods. In this review, we focused on the deep transformer-based techniques that have recently gained attention. The proposed models include standalone transformer, CNN-transformer, and generative adversarial network-transformer techniques. Methods based on TransBTS, TransUNet, SwinUNet, and U-Net with transformers are the most used models for neuroimage segmentation.
illustrates these models in terms of architecture. depicts the characteristics of the transformer models used within the included studies, from which we find that 58.21% (n=39) of the included studies did not explicitly report the number of parameters of their proposed models. Of the studies that did, the majority proposed transformer models with between 20 and 40 million parameters (n=10, 14.93%), followed by 1 to 19 million (n=8, 11.94%). A majority of studies implemented a 3D segmentation network (n=42, 62.69%), with 37.31% (n=25) being 2D. An overwhelming 85.07% (n=57) of included studies proposed hybrid transformer models, with only 14.93% (n=10) being standalone transformer models. ViT was the most used transformer architecture, with 55.22% (n=37) of studies using it as their main component, followed by the Swin transformer (n=14, 20.89%) and TransUnet (n=4, 5.97%). Of the 57 hybrid transformer models, 55 (96.49%) combined a CNN with their transformer, and of those 55 CNN-transformer models, 56.36% (n=31) were U-Net based and 9.09% (n=5) were ResNet based. Generative adversarial networks (n=2, 3.51%) and autoencoders (n=2, 3.51%) were also combined with transformers.
Features | Studies, n (%) | References | |
Number of parameters (in millions) | |||
1-19 | 8 (11.94) | [ | , , , , , , , ]|
20-39 | 10 (14.93) | [ | , , , , , , , , , ]|
40-59 | 4 (5.97) | [ | , , , ]|
60-99 | 3 (4.48) | [ | , , ]|
100-119 | 3 (4.48) | [ | , , ]|
120+ | 2 (2.99) | [ | , ]|
Not mentioned | 39 (58.21) | [ | , , , , , , - , , , , , , , , - , , , , , , , - , - ]|
Dimensionality | |||
2D | 25 (37.31) | [ | , , , , , , , , , , , , , , , , , , , , ]|
3D | 42 (62.69) | [ | , , - , , , , , , - , - , - , , , - , , - , , , , , , - ]|
Transformer model | |||
Standalone | 10 (14.93) | [ | , , , , , , , , , ]|
Hybrid | 57 (85.07) | [ | , , , , - , - , - , - , , , - , - , - ]|
Type of transformer | |||
ViTa | 37 (55.22) | [ | , , , , , , , - , - , , , , , , - , , , , , , , - ]|
Swin | 14 (20.89) | [ | , , , , , , - , , , , , ]|
SwinUnet | 2 (2.99) | [ | , ]|
TransUnet | 4 (5.97) | [ | , , , ]|
TransBTS | 2 (2.99) | [ | , ]|
Other | 8 (11.94) | [ | , , , , , , , ]|
Type of hybrid component | |||
CNNb | 55 (96.49) | [ | , , , , - , - , - , - , - , , , , - , - ]|
U-Net | 31 (56.36) | [ | , , , , , , , - , , , , , - , , , - , , - , ]|
ResNet | 5 (9.09) | [ | , , , , ]|
GANc | 2 (3.51) | [ | , ]|
Autoencoder | 2 (3.51) | [ | , ]
aViT: vision transformer.
bCNN: convolutional neural network.
cGAN: generative adversarial network.
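The dominant hybrid pattern tallied above, a convolutional stage for local features feeding a transformer-style attention stage for global context, can be caricatured in a few lines of numpy. Everything here (shapes, the mean-filter "CNN," identity attention projections) is an illustrative toy assumption, not an implementation of any reviewed model.

```python
import numpy as np

def conv_local(x, kernel):
    """Tiny 'CNN stage': a 3x3 convolution capturing local context (valid padding)."""
    h, w = x.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(x[i:i+3, j:j+3] * kernel)
    return out

def global_attention(tokens):
    """'Transformer stage': every token attends to every other (identity projections)."""
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)    # softmax over all tokens
    return w @ tokens                     # long-range feature mixing

rng = np.random.default_rng(1)
slice2d = rng.standard_normal((18, 18))               # toy single-channel "image"
feat = conv_local(slice2d, np.ones((3, 3)) / 9.0)     # (16, 16) local feature map
# Flatten the feature map into 16 tokens of dimension 16 (4x4 patches)
tokens = feat.reshape(4, 4, 4, 4).transpose(0, 2, 1, 3).reshape(16, 16)
fused = global_attention(tokens)                      # globally mixed features
print(fused.shape)                                    # (16, 16)
```

The design rationale matches the review's finding: convolution aggregates a small neighborhood per output, while the attention step lets every patch influence every other, combining local detail with global context.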
depicts the loss function, optimizer, and evaluation methods used across the included studies. The loss function was not mentioned in 11.94% (n=8) of the included studies. Of the studies that mentioned it, the most popular loss function was a combination of cross-entropy and Dice loss, used by 40.30% (n=27) of included studies, followed by Dice loss alone with 19.40% (n=13). Adam was the most used optimizer, used by 47.76% (n=32) of included studies, followed by AdamW at 14.93% (n=10); however, the optimizer was not mentioned in 22.39% (n=15) of studies. In terms of evaluation, over half of the included studies used at least 2 evaluation metrics (n=34, 50.75%), while 16.42% (n=11) used 1 metric. Of these evaluation metrics, the Dice score was by far the most used, appearing in 94.03% (n=63) of all studies, followed by HD95 (n=35, 52.24%) and sensitivity (n=19, 28.36%).
Features | Studies, n (%) | References | |
Loss function | |||
Dice loss | 13 (19.4) | [ | , , , , - , , , , ]|
Cross-entropy | 9 (13.43) | [ | , , , , , , , , ]|
Dice cross-entropy | 27 (40.3) | [ | , , , , , , , , , , , , , - , , - , - ]|
Other | 10 (14.93) | [ | , , , , , , , , , ]|
Not mentioned | 8 (11.94) | [ | , , , , , , , ]|
Optimizer | |||
Adam | 32 (47.76) | [ | , , , , , , , , - , - , , , , , , - , - , , - ]|
AdamW | 10 (14.93) | [ | , , , , , , , , , ]|
SGDa | 7 (10.45) | [ | , , , , , , ]|
Ranger | 2 (2.99) | [ | , ]|
RMSpropb | 1 (1.49) | [ | ]|
Apollo | 1 (1.49) | [ | ]|
Not mentioned | 15 (22.39) | [ | , , , , , , , , , , , , , , ]|
Evaluation metrics | |||
Dice score | 63 (94.03) | [ | , , - , - , - , - ]|
HD95c | 35 (52.24) | [ | , , - , - , , , , - , , , , , , , , , , , , , , , , , - ]|
Recall or sensitivity | 19 (28.36) | [ | , , , , - , , , , , , , , , , , , ]|
IoUd | 12 (17.91) | [ | , , - , , , , , , , ]|
Precision | 10 (14.93) | [ | , , , , , , , , , ]|
Accuracy | 6 (8.96) | [ | , , , , , ]|
Specificity | 5 (7.46) | [ | , , , , ]|
AUCe | 5 (7.46) | [ | , , , , ]|
F-measure | 3 (4.48) | [ | , , ]|
Jaccard index | 4 (5.97) | [ | , , , ]|
Other | 5 (7.46) | [ | , , , , ]
aSGD: stochastic gradient descent.
bRMSprop: root mean square propagation.
cHD95: Hausdorff distance at the 95th percentile.
dIoU: intersection over union.
eAUC: area under the curve.
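As a reference for the most common metric and loss tallied above, the following is a minimal numpy sketch of the Dice score and a combined Dice plus cross-entropy loss. Binary masks and a simple smoothing constant are assumed; actual thresholds, smoothing terms, and class weighting vary across the included studies.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def dice_ce_loss(prob, target, eps=1e-7):
    """Combined loss: soft Dice loss plus binary cross-entropy on predicted probabilities."""
    inter = (prob * target).sum()
    dice = (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)
    ce = -np.mean(target * np.log(prob + eps) + (1 - target) * np.log(1 - prob + eps))
    return (1.0 - dice) + ce

pred = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])   # predicted mask
gt   = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])   # ground-truth mask
print(round(dice_score(pred, gt), 3))                # 0.667 (2 overlapping of 3+3 voxels)
```

The Dice score rewards overlap relative to total mask size, which is why it dominates evaluation in class-imbalanced segmentation tasks such as small tumor regions within a large brain volume.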
Strengths and Limitations of Transformer-Based Techniques
Transformers have revolutionized neuroimage segmentation by offering unparalleled capabilities for modeling complex features in medical imaging. Their ability to model both local and global information substantially improves segmentation accuracy, which makes them very useful in various neurological applications. As shown in
, the common strengths of transformer-based techniques include a high mean Dice score, effective fusion of multimodal MRI, and robust performance across diverse and complex datasets. However, these models also have substantial limitations in terms of high computational and memory costs, sensitivity to small tumor areas, and possible overfitting on smaller datasets.
Transformer type | References | Strengths | Limitations
ViTa | [ , , , , , , , - , - , , , , , , - , , , , , , , - ] | |
Swin | [ , , , , , , - , , , , , ] | |
SwinUnet | [ , ] | |
TransUnet | [ , , , ] | |
TransBTS | [ , ] | |
Other transformer types | [ , , , , , , , ] | |
aViT: vision transformer.
bMRI: magnetic resonance imaging.
cSSL: self-supervised learning.
dCNN: convolutional neural network.
eLGG: low-grade glioma.
Discussion
Principal Findings
The main purpose of this scoping review was to conduct a thorough investigation into the use of different transformer models in the field of neuroimaging, specifically segmentation. From the gathered data, it is clear that the use of transformers in neuroimaging experienced a great boost in research from 2021 to 2022, with over half of the included studies published in 2022 compared with only 10 studies in 2021. It is also important to note that for 2023, only studies up to May 22 were included; yet these already constitute 32.84% (n=22) of the included studies, and the full-year share could well be even higher.
From the studies included in this review, it is clear that MRI is by far the most popular image modality for applying transformer models to neuroimaging segmentation. This can be attributed to how common the use of MRI is in the diagnosis of neurological illnesses, especially for brain tumors [
], wherein it is able to provide functional, structural, and metabolic information [ ] through its different sequences (T1, T2, T1ce, and fluid-attenuated inversion recovery). MRI is particularly suitable for neuroimaging segmentation because of its high spatial resolution and soft tissue contrast, both of which are critical for precise segmentation: MRI provides detailed visualization of brain structures and clear distinction between different brain tissues [ , ]. Another reason for the popularity of MRI in the included studies is the availability of brain MRI scans sourced from the widely used BraTS datasets [
]. This annual, open-source dataset contains a wide variety of manually annotated MRI sequences, making it a very important resource for developing and benchmarking segmentation methods based on different transformer models. It is therefore no surprise that it is by far the most used dataset in the included studies. When it comes to neurological conditions, a majority of the included studies focused on the use of transformers in brain tumor segmentation. This can be attributed to multiple factors, including the availability of MRI scans from the BraTS dataset that are specifically intended for brain tumor segmentation. Brain tumors are also highly prevalent among all ages and have a high fatality rate [
], making them a prime area for research into new methods of diagnosis and treatment. In addition, brain tumors are fairly complex and irregular in both location and shape [ ], which makes manual segmentation a very tedious and time-consuming process that would benefit greatly from increased research into more automated segmentation methods. Transformers are particularly useful for brain tumor segmentation due to their self-attention mechanism, which allows them to account for variations in tumor characteristics, such as size and shape, during the segmentation process [ ]. Most included studies proposed and developed models with 3D segmentation networks, designed specifically for 3D imaging data. In neuroimaging, 3D scans are common in part due to the 3D nature of MRI. Since MRI is the most common imaging modality in neuroimaging, it makes sense to develop models for 3D imaging data in order to avoid the loss of information. Even though 3D models are typically more accurate for 3D imaging segmentation, they are computationally expensive [
], which is why some models proposed in the included studies instead extracted 2D slices from 3D imaging data. While this technique reduces computational demands, collapsing a 3D scan into 2D slices can degrade the volumetric and spatial characteristics native to 3D data [ ].

CNN-transformer hybrid models were used far more often than standalone transformer models in the included studies, most commonly in the form of a U-Net and transformer combination. These combinations capitalize on the strengths of both CNNs and transformers while minimizing their weaknesses. CNNs are particularly effective at extracting local features and spatial information from the provided scans; however, they often struggle to capture long-range dependencies because of their small kernel sizes [
, ]. On the other hand, transformers can model these long-range dependencies through their self-attention module, making them very useful for neuroimaging segmentation, especially for brain tumors [ ]. This is why most included studies used CNNs to capture local features and transformers to capture global features, increasing the segmentation performance of their models [ ].

Research and Practical Implications
This scoping review provides an overview of the available research regarding the use of transformers in the context of neuroimaging segmentation. These findings underline important implications for future research and applications in this area.
A notable finding of this review is that many studies apply transformers specifically to brain tumor segmentation, which hints at the potential of transformers to assist diagnosis and treatment planning in this field. As shown here, transformers are well suited to this task; however, further research is needed to assess their real-world clinical usefulness for brain tumor segmentation. While brain tumors are an important challenge, the concentration on this single application likely reflects the current lack of large, high-quality datasets for other major neurological diseases and conditions, such as Alzheimer disease, Parkinson disease, and stroke. Making manually annotated datasets of different neurological conditions publicly available would motivate new research and development on the application of transformers in this field. Moreover, the heavy reliance of studies on the BraTS dataset shows that datasets need to be diversified for different models to be validated correctly. Most of the included studies favored hybrid designs combining CNNs and transformers, which illustrates the complementary strengths of these architectures for neuroimaging segmentation. The success of hybrid techniques suggests that further exploration of novel integrations between transformers, CNNs, and other modules is a promising direction for achieving better performance on more complex medical image analysis problems. Improved accuracy in neuroimaging segmentation, enabled by the ability of transformer models to extract both local and global features, allows more accurate identification of neurological conditions such as brain tumors, supporting earlier diagnosis and treatment. Moreover, automation with these models can save clinicians considerable time otherwise spent on manual segmentation, allowing them to concentrate on patient care and other important tasks.
Treatment planning may also be improved with transformer models, as more accurate and consistent segmentation results directly support this task. Moreover, these models could be integrated into clinical workflows relatively smoothly through the development of user-friendly interfaces and collaboration between AI researchers and clinicians, ensuring these tools are adopted and used effectively in practice.
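To illustrate the self-attention mechanism that gives transformers their global receptive field, as discussed in the preceding sections, the following is a minimal pure-Python sketch of scaled dot-product self-attention over a few toy patch embeddings. It omits the learned query, key, and value projections of a real transformer, and all names and values are illustrative rather than drawn from any reviewed study:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention over a list of feature vectors.

    Each output vector is a weighted average of *all* input vectors,
    which is why every position can draw on global context in a single
    layer. For clarity, queries, keys, and values are the inputs
    themselves (no learned projections).
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    outputs = []
    for q in tokens:
        # Attention scores of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in tokens]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)]
        outputs.append(out)
    return outputs

# Toy example: 3 "patch embeddings" of dimension 2.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(patches)
```

Because each output is a softmax-weighted average over all input positions, even distant patches can influence one another in one layer. This is the property that lets hybrid models delegate global context to the transformer branch while the CNN branch handles local detail.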
Strengths and Limitations
This scoping review has several key strengths in its analysis of transformer applications in neuroimaging segmentation. First, it gives a broad overview of this fast-evolving field by capturing recent work published from 2019 through 2023, ensuring that the review reflects the state of the art in transformer applications for medical imaging. Second, its systematic approach, covering 4 major databases, provides wide and comprehensive coverage of the literature, and the inclusion of both journal and conference papers offers a view of both consolidated and emergent research. Third, the review provides detailed insights into various aspects of transformer use in neuroimaging: imaging modality, dataset, neurological condition, and performance evaluation metric. This level of analysis yields rich information relevant to both researchers and practitioners in the field. Finally, the review’s focus on brain tumor segmentation, while a limitation in some respects, also serves as a strength by providing an in-depth look at transformer applications in a critical area of medical imaging with significant clinical implications.
While this scoping review offers a number of strengths, its limitations must also be acknowledged. First, the review covered transformers in neuroimaging segmentation alone, excluding other medical imaging tasks and organs. This narrow focus allows an in-depth analysis of transformer applications in brain imaging but may not represent the full spectrum of transformer use in medical imaging. This limitation could be reduced by expanding the scope of future reviews to multiple organ systems or imaging tasks, giving a wider view of transformer applications in medical imaging.
Second, the review considered only English-language studies published from 2019 through 2023. This narrowing was reasonable, as transformer use in medical imaging is a novel area in which most relevant work is recent. Even so, these criteria may have excluded important non-English publications or early applications of transformers, limiting the representation of worldwide research trends. Future reviews could include more languages and extend the date range to capture a more diverse set of publications and track the evolution of transformer use in medical imaging over a longer period.
Third, the fact that 58.21% (n=39) of the included works were based on the BraTS dataset introduces a bias toward brain tumor segmentation. Although this is a critical area, the findings may not fully represent transformer performance on other neurological conditions. Future research should emphasize developing and publicly releasing manually annotated datasets for a wider range of neurological conditions to address this limitation. This would encourage more diverse applications of transformers in neuroimaging and provide a broader understanding of their capability across different pathologies.
The review also showed a strong dominance of studies from China, which accounted for 46 (68.66%) of the included studies (see
for detailed analysis). This aligns with broader publication patterns in AI research, where Chinese institutions contribute approximately 40% of global publications. While this distribution reflects documented trends in international research output, future reviews might benefit from more diversified search strategies to ensure comprehensive coverage of global research activity in this field.

Finally, no formal quality or risk-of-bias assessment of the included studies was performed. Although this is a common approach for scoping reviews, it limits the degree to which strong conclusions can be drawn about the relative effectiveness of different transformer approaches. Future systematic reviews or meta-analyses could include quality assessments to provide more robust evidence on the efficacy of transformer models in neuroimaging segmentation.
Future Directions
These findings point to a variety of promising directions for future research on the application of transformers to neuroimaging segmentation. First, future studies should develop novel integrations between transformers, CNNs, and other advanced modules to further improve performance on complex medical image analysis tasks; this might be achieved by investigating hybrid models that leverage the strengths of transformers and more traditional deep learning methods. Second, transformer applications should be extended to neurological conditions beyond brain tumors, allowing a wider grasp of the potential of transformers across different pathologies, from which more clinical applications are likely to follow. Third, developing new transformer-based methods or combining them with emerging techniques such as diffusion models could further improve efficiency and robustness for both 2D and 3D brain segmentation. Fourth, future work should address current limitations in dataset diversity, for example by creating and publishing manually annotated datasets for a wider range of neurological conditions, enabling transformers to be applied to neuroimaging in more diverse ways. Finally, the translation of research findings into clinical practice remains a largely unmet need. This transition will require extensive validation of transformer models on diverse, real-world datasets and close collaboration between AI researchers and clinicians. Such collaboration could yield more clinically relevant models and user-friendly interfaces, expediting the adoption of these technologies in routine clinical practice.
Conclusions
This scoping review has thoroughly investigated the applications of transformers in neuroimaging segmentation, revealing a rapidly evolving field with great potential. The results show that transformer models, especially when combined with CNNs in hybrid architectures, are very promising for brain MRI segmentation. Key advantages of transformers include modeling long-range dependencies in images through self-attention mechanisms while still supporting local feature extraction. This combination uniquely allows more accurate and detailed segmentation of highly complex neurological pathologies, such as brain tumors.
There is a clear trend toward 3D transformer models and hybrid CNN-transformer architectures, with the vision transformer (ViT) as the most frequently used transformer variant. These approaches achieve superior performance on benchmark tasks such as brain tumor segmentation. However, the reliance on the BraTS dataset highlights the need for more diverse data sources so that performance can be validated across multiple neurological conditions.
While these results are promising, important issues remain: the high computational costs associated with transformer models, overfitting on smaller datasets, and the need for validation in larger clinical settings. The geographical concentration of research output is another issue, highlighting the need for greater diversity in the origins of studies worldwide to improve the generalizability of findings.
Continued development of transformer models holds great potential for neuroimaging segmentation. By refining architectures and training methods and integrating these models into clinical workflows, transformers may provide state-of-the-art performance for fast, accurate, and reproducible brain MRI segmentation, thereby advancing clinical diagnosis and evaluation techniques and improving outcomes for patients with neurological disorders.
Although transformers have shown great improvement in neuroimaging segmentation, much of their potential is yet to be realized. Future work should focus on addressing present limitations, extending applications across a wider range of neurological conditions, and narrowing the gap between research and clinical practice to ensure that transformers become a valuable and impactful technology in medical image analysis.
Acknowledgments
This work was supported in part by the United Arab Emirates University (UAEU grant 12T037) and in part by the Big Data Analytics Center (UAEU grant 12R239).
Data Availability
All data generated or analyzed during this study are included in this published paper and
.

Conflicts of Interest
None declared.
PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) checklist.
DOCX File, 84 KB

Significant prevalence of studies published in China.
DOCX File, 104 KB

References
- Abd-Ellah MK, Awad AI, Khalaf AAM, Hamed HFA. A review on brain tumor diagnosis from MRI images: practical implications, key achievements, and lessons learned. Magn Reson Imaging. 2019;61:300-318. [CrossRef] [Medline]
- Akkus Z, Galimzianova A, Hoogi A, Rubin DL, Erickson BJ. Deep learning for brain MRI segmentation: state of the art and future directions. J Digit Imaging. 2017;30(4):449-459. [FREE Full text] [CrossRef] [Medline]
- Shoeibi A, Khodatars M, Jafari M, Ghassemi N, Moridian P, Alizadehsani R, et al. Diagnosis of brain diseases in fusion of neuroimaging modalities using deep learning: a review. Inf Fusion. 2023;93:85-117. [CrossRef]
- Hasan AM, Meziane F, Aspin R, Jalab HA. Segmentation of brain tumors in MRI images using three-dimensional active contour without edge. Symmetry. 2016;8(11):132. [CrossRef]
- Domingues I, Pereira G, Martins P, Duarte H, Santos J, Abreu PH. Using deep learning techniques in medical imaging: a systematic review of applications on CT and PET. Artif Intell Rev. 2019;53(6):4093-4160. [CrossRef]
- Noor MBT, Zenia NZ, Kaiser MS, Mamun SA, Mahmud M. Application of deep learning in detecting neurological disorders from magnetic resonance images: a survey on the detection of Alzheimer's disease, Parkinson's disease and schizophrenia. Brain Inform. 2020;7(1):11. [FREE Full text] [CrossRef] [Medline]
- Ghaffari M, Sowmya A, Oliver R. Automated brain tumor segmentation using multimodal brain scans: a survey based on models submitted to the BraTS 2012-2018 challenges. IEEE Rev Biomed Eng. 2020;13:156-168. [CrossRef] [Medline]
- Fawzi A, Achuthan A, Belaton B. Brain image segmentation in recent years: a narrative review. Brain Sci. 2021;11(8):1055. [FREE Full text] [CrossRef] [Medline]
- Wadhwa A, Bhardwaj A, Singh Verma V. A review on brain tumor segmentation of MRI images. Magn Reson Imaging. 2019;61:247-259. [CrossRef] [Medline]
- Gau K, Schmidt CSM, Urbach H, Zentner J, Schulze-Bonhage A, Kaller CP, et al. Accuracy and practical aspects of semi- and fully automatic segmentation methods for resected brain areas. Neuroradiology. 2020;62(12):1637-1648. [FREE Full text] [CrossRef] [Medline]
- Wang R, Lei T, Cui R, Zhang B, Meng H, Nandi AK. Medical image segmentation using deep learning: a survey. IET Image Process. 2022;16(5):1243-1267. [CrossRef]
- Seo H, Badiei Khuzani M, Vasudevan V, Huang C, Ren H, Xiao R, et al. Machine learning techniques for biomedical image segmentation: an overview of technical aspects and introduction to state-of-art applications. Med Phys. 2020;47(5):e148-e167. [FREE Full text] [CrossRef] [Medline]
- Zhao Z, Chuah JH, Lai KW, Chow CO, Gochoo M, Dhanalakshmi S, et al. Conventional machine learning and deep learning in Alzheimer's disease diagnosis using neuroimaging: a review. Front Comput Neurosci. 2023;17:1038636. [FREE Full text] [CrossRef] [Medline]
- Balwant MK. A review on convolutional neural networks for brain tumor segmentation: methods, datasets, libraries, and future directions. IRBM. 2022;43(6):521-537. [CrossRef]
- Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. 2015. Presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention, Proceedings, Part III; October 5-9, 2015:18; Munich, Germany. [CrossRef]
- Mall PK, Singh PK, Srivastav S, Narayan V, Paprzycki M, Jaworska T, et al. A comprehensive review of deep neural networks for medical image processing: recent developments and future opportunities. Healthc Anal. 2023;4:100216. [CrossRef]
- Gul S, Khan MS, Bibi A, Khandakar A, Ayari MA, Chowdhury MEH. Deep learning techniques for liver and liver tumor segmentation: a review. Comput Biol Med. 2022;147:105620. [CrossRef] [Medline]
- Yousef R, Khan S, Gupta G, Siddiqui T, Albahlal BM, Alajlan SA, et al. U-net-based models towards optimal MR brain image segmentation. Diagnostics (Basel). 2023;13(9):1624. [FREE Full text] [CrossRef] [Medline]
- Badrinarayanan V, Kendall A, Cipolla R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. 2017;39(12):2481-2495. [CrossRef] [Medline]
- He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. 2016. Presented at: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); June 27-30, 2016; Las Vegas, NV, United States. [CrossRef]
- Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. 2017. Presented at: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); July 21-26, 2017; Honolulu, HI, United States. [CrossRef]
- Tran D, Bourdev L, Fergus R, Torresani L, Paluri M. Learning spatiotemporal features with 3D convolutional networks. 2015. Presented at: 2015 IEEE International Conference on Computer Vision (ICCV); December 7-13, 2015; Santiago, Chile. [CrossRef]
- Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell. 2018;40(4):834-848. [CrossRef] [Medline]
- Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. 2017. Presented at: NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems; December 4-9, 2017; Long Beach, CA, United States.
- He K, Gan C, Li Z, Rekik I, Yin Z, Ji W, et al. Transformers in medical image analysis. Intell Med. 2023;3(1):59-78. [CrossRef]
- Xiao H, Li L, Liu Q, Zhu X, Zhang Q. Transformers in medical image segmentation: a review. Biomed Signal Process Control. 2023;84:104791. [CrossRef]
- Akinyelu AA, Zaccagna F, Grist JT, Castelli M, Rundo L. Brain tumor diagnosis using machine learning, convolutional neural networks, capsule neural networks and vision transformers, applied to MRI: a survey. J Imaging. 2022;8(8):205. [CrossRef] [Medline]
- Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An image is worth 16x16 words: transformers for image recognition at scale. ArXiv. Preprint posted online on June 3, 2021. [FREE Full text]
- Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, et al. Transformers in medical imaging: a survey. Med Image Anal. 2023;88:102802. [CrossRef] [Medline]
- Thisanke H, Deshan C, Chamith K, Seneviratne S, Vidanaarachchi R, Herath D. Semantic segmentation using vision transformers: a survey. Eng Appl Artif Intell. 2023;126:106669. [CrossRef]
- Chen J, Lu Y, Yu Q, Luo X, Wang Y, Adeli E, et al. TransUNet: transformers make strong encoders for medical image segmentation. ArXiv. Preprint posted online on February 8, 2021. [FREE Full text]
- Cao H, Wang Y, Chen Y, Jiang D, Zhang X, Tian X, et al. Swin-Unet: Unet-like pure transformer for medical image segmentation. 2022. Presented at: Computer Vision—ECCV 2022 Workshops; October 23-27, 2022; Tel Aviv, Israel. [CrossRef]
- Wang W, Chen C, Ding M, Yu M, Zha S, Li J. TransBTS: multimodal brain tumor segmentation using transformer. 2021. Presented at: Medical Image Computing and Computer Assisted Intervention—MICCAI 2021; September 27-October 1, 2021; Strasbourg, France. [CrossRef]
- Hatamizadeh A, Nath V, Tang Y, Yang D, Roth HR, Xu D. Swin UNETR: swin transformers for semantic segmentation of brain tumors in MRI images. 2021. Presented at: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries; September 27, 2021; Virtual Event. [CrossRef]
- Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467-473. [FREE Full text] [CrossRef] [Medline]
- Song J, Hahm J, Lee J, Lim CY, Chung MJ, Youn J, et al. Comparative validation of AI and non-AI methods in MRI volumetry to diagnose Parkinsonian syndromes. Sci Rep. 2023;13(1):3439. [FREE Full text] [CrossRef] [Medline]
- Cai Y, Long Y, Han Z, Liu M, Zheng Y, Yang W, et al. Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution. BMC Med Inform Decis Mak. 2023;23(1):33. [FREE Full text] [CrossRef] [Medline]
- Wu W, Yan J, Zhao Y, Sun Q, Zhang H, Cheng J, et al. Multi-task learning for concurrent survival prediction and semi-supervised segmentation of gliomas in brain MRI. Displays. 2023;78:102402. [CrossRef]
- Yan Q, Liu S, Xu S, Dong C, Li Z, Shi JQ, et al. 3D medical image segmentation using parallel transformers. Pattern Recognit. 2023;138:109432. [CrossRef]
- Liu Z, Ma C, She W, Xuan W. TransMVU: multi‐view 2D U‐Nets with transformer for brain tumour segmentation. IET Image Process. 2023;17(6):1874-1882. [CrossRef]
- Lu Y, Chang Y, Zheng Z, Sun Y, Zhao M, Yu B, et al. GMetaNet: multi-scale ghost convolutional neural network with auxiliary MetaFormer decoding path for brain tumor segmentation. Biomed Signal Process Control. 2023;83:104694. [CrossRef]
- Wei C, Ren S, Guo K, Hu H, Liang J. High-resolution Swin transformer for automatic medical image segmentation. Sensors (Basel). 2023;23(7):3420. [FREE Full text] [CrossRef] [Medline]
- Gharaibeh N, Abu-Ein AA, Al-hazaimeh OM, Nahar KMO, Abu-Ain WA, Al-Nawashi MM. Swin transformer-based segmentation and multi-scale feature pyramid fusion module for Alzheimer’s disease with machine learning. Int J Onl Eng. 2023;19(04):22-50. [CrossRef]
- Wang J, Li S, Yu L, Qu A, Wang Q, Liu J, et al. SDPN: a slight dual-path network with local-global attention guided for medical image segmentation. IEEE J Biomed Health Inform. 2023;27(6):2956-2967. [CrossRef] [Medline]
- Anaya-Isaza A, Mera-Jiménez L, Fernandez-Quilez A. CrossTransUnet: a new computationally inexpensive tumor segmentation model for brain MRI. IEEE Access. 2023;11:27066-27085. [CrossRef]
- Lin J, Lin J, Lu C, Chen H, Lin H, Zhao B, et al. CKD-TransBTS: clinical knowledge-driven hybrid transformer with modality-correlated cross-attention for brain tumor segmentation. IEEE Trans Med Imaging. 2023;42(8):2451-2461. [CrossRef] [Medline]
- Huang X, Liu Y, Li Y, Qi K, Gao A, Zheng B, et al. Deep learning-based multiclass brain tissue segmentation in fetal MRIs. Sensors (Basel). 2023;23(2):655. [FREE Full text] [CrossRef] [Medline]
- Hu Z, Li L, Sui A, Wu G, Wang Y, Yu J. An efficient R-Transformer network with dual encoders for brain glioma segmentation in MR images. Biomed Signal Process Control. 2023;79:104034. [CrossRef]
- Dhamija T, Gupta A, Gupta S, Anjum, Katarya R, Singh G. Semantic segmentation in medical images through transfused convolution and transformer networks. Appl Intell (Dordr). 2023;53(1):1132-1148. [FREE Full text] [CrossRef] [Medline]
- Gao H, Miao Q, Ma D, Liu R. Deep mutual learning for brain tumor segmentation with the fusion network. Neurocomputing. 2023;521:213-220. [CrossRef]
- Ashtari P, Sima DM, De Lathauwer L, Sappey-Marinier D, Maes F, Van Huffel S. Factorizer: a scalable interpretable approach to context modeling for medical image segmentation. Med Image Anal. 2023;84:102706. [CrossRef] [Medline]
- Marcus A, Bentley P, Rueckert D. Concurrent ischemic lesion age estimation and segmentation of CT brain using a transformer-based network. IEEE Trans Med Imaging. 2023;42(12):3464-3473. [CrossRef] [Medline]
- Piao Z, Gu YH, Jin H, Yoo SJ. Intracerebral hemorrhage CT scan image segmentation with HarDNet based transformer. Sci Rep. 2023;13(1):7208. [FREE Full text] [CrossRef] [Medline]
- Yang H, Zhou T, Zhou Y, Zhang Y, Fu H. Flexible fusion network for multi-modal brain tumor segmentation. IEEE J Biomed Health Inform. 2023;27(7):3349-3359. [CrossRef] [Medline]
- Wang Z, He M, Lv Y, Ge E, Zhang S, Qiang N, et al. Accurate corresponding fiber tract segmentation via FiberGeoMap learner with application to autism. Cereb Cortex. 2023;33(13):8405-8420. [CrossRef] [Medline]
- Liang J, Yang C, Zhong J, Ye X. BTSwin-unet: 3D U-shaped symmetrical swin transformer-based network for brain tumor segmentation with self-supervised pre-training. Neural Process Lett. 2022;55(4):3695-3713. [CrossRef]
- Rui-Qiang L, Xiao-Dong C, Ren-Zhe T, Cai-Zi L, Wei Y, Dou-Dou Z, et al. Automatic localization of target point for subthalamic nucleus-deep brain stimulation via hierarchical attention-UNet based MRI segmentation. Med Phys. 2023;50(1):50-60. [CrossRef] [Medline]
- Zhang S, Ren B, Yu Z, Yang H, Han X, Chen X, et al. TW-Net: transformer weighted network for neonatal brain MRI segmentation. IEEE J Biomed Health Inform. 2023;27(2):1072-1083. [CrossRef] [Medline]
- Khaled A, Han JJ, Ghaleb TA. Learning to detect boundary information for brain image segmentation. BMC Bioinformatics. 2022;23(1):332. [FREE Full text] [CrossRef] [Medline]
- Huang L, Zhu E, Chen L, Wang Z, Chai S, Zhang B. A transformer-based generative adversarial network for brain tumor segmentation. Front Neurosci. 2022;16:1054948. [FREE Full text] [CrossRef] [Medline]
- Chen S, Zhang JX, Zhang T. LETCP: a label-efficient transformer-based contrastive pre-training method for brain tumor segmentation. Appl Sci. 2022;12(21):11016. [CrossRef]
- Liang J, Yang C, Zeng L. 3D PSwinBTS: an efficient transformer-based Unet using 3D parallel shifted windows for brain tumor segmentation. Digital Signal Process. 2022;131:103784. [CrossRef]
- Zhang J, Liu Y, Wu Q, Wang Y, Liu Y, Xu X, et al. SWTRU: star-shaped window transformer reinforced U-net for medical image segmentation. Comput Biol Med. 2022;150:105954. [CrossRef] [Medline]
- Liu X, Cheng X. Segmentation method of magnetoelectric brain image based on the transformer and the CNN. Information. 2022;13(10):445. [CrossRef]
- Xu Y, He X, Xu G, Qi G, Yu K, Yin L, et al. A medical image segmentation method based on multi-dimensional statistical features. Front Neurosci. 2022;16:1009581. [FREE Full text] [CrossRef] [Medline]
- Gai D, Zhang J, Xiao Y, Min W, Zhong Y, Zhong Y. RMTF-Net: residual mix transformer fusion net for 2D brain tumor segmentation. Brain Sci. 2022;12(9):1145. [FREE Full text] [CrossRef] [Medline]
- Wu J, Xu Q, Shen Y, Chen W, Xu K, Qi XR. Swin transformer improves the IDH mutation status prediction of gliomas free of MRI-based tumor segmentation. J Clin Med. 2022;11(15):4625. [FREE Full text] [CrossRef] [Medline]
- Huang J, Fang Y, Wu Y, Wu H, Gao Z, Li Y, et al. Swin transformer for fast MRI. Neurocomputing. 2022;493:281-304. [CrossRef]
- Chen Y, Yin M, Li Y, Cai Q. CSU-Net: a CNN-transformer parallel network for multimodal brain tumour segmentation. Electronics. 2022;11(14):2226. [CrossRef]
- Zeineldin RA, Pollok A, Mangliers T, Karar ME, Mathis-Ullrich F, Burgert O. Deep automatic segmentation of brain tumours in interventional ultrasound data. Curr Dir Biomed Eng. 2022;8(1):133-137. [CrossRef]
- Pinaya WHL, Tudosiu PD, Gray R, Rees G, Nachev P, Ourselin S, et al. Unsupervised brain imaging 3D anomaly detection and segmentation with transformers. Med Image Anal. 2022;79:102475. [FREE Full text] [CrossRef] [Medline]
- Jiang Y, Zhang Y, Lin X, Dong J, Cheng T, Liang J. SwinBTS: a method for 3D multimodal brain tumor segmentation using swin transformer. Brain Sci. 2022;12(6):797. [FREE Full text] [CrossRef] [Medline]
- Kadri R, Bouaziz B, Tmar M, Gargouri F. Multimodal deep learning based on the combination of EfficientNetV2 and ViT for Alzheimer's disease early diagnosis enhanced by SAGAN data augmentation. Int J Comput Inf Syst Ind Manag Appl. 2022;14:313-325. [FREE Full text]
- Liu J, Zheng J, Jiao G. Transition Net: 2D backbone to segment 3D brain tumor. Biomed Signal Process Control. 2022;75:103622. [CrossRef]
- Wang J, Wang S, Liang W. METrans: multi‐encoder transformer for ischemic stroke segmentation. Electron Lett. 2022;58(9):340-342. [CrossRef]
- Liang J, Yang C, Zeng M, Wang X. TransConver: transformer and convolution parallel network for developing automatic brain tumor segmentation in MRI images. Quant Imaging Med Surg. 2022;12(4):2397-2415. [FREE Full text] [CrossRef] [Medline]
- Wang X, Li Y. STC-Net: fusing Swin transformer and convolution neural network for 2D medical image segmentation. 2022. Presented at: 2022 2nd International Conference on Electronic Information Engineering and Computer Technology (EIECT); October 28-30, 2022; Yan'an, China. [CrossRef]
- Viteri J, Piguave B, Pelaez CE, Loayza F. Automatic brain white matter hyperintensities segmentation with Swin U-net. 2022. Presented at: 2022 IEEE ANDESCON; November 16-19, 2022; Barranquilla, Colombia. [CrossRef]
- Wang P, Liu S, Peng J. AST-Net: lightweight hybrid transformer for multi-modal brain tumor segmentation. 2022. Presented at: 2022 26th International Conference on Pattern Recognition (ICPR); August 21-25, 2022; Montreal, QC, Canada. [CrossRef]
- Zhang Y, He N, Yang J, Li Y, Wei D, Huang Y, et al. mmFormer: multi-modal medical transformer for incomplete multimodal learning of brain tumor segmentation. 2022. Presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention; September 18-22, 2022; Singapore. [CrossRef]
- Huang J, Li H, Wan X. Attentive symmetric autoencoder for brain MRI segmentation. 2022. Presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention; September 18-22, 2022; Singapore. [CrossRef]
- Xing Z, Yu L, Wan L, Han T, Zhu L. NestedFormer: nested modality-aware transformer for brain tumor segmentation. 2022. Presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention; September 18-22, 2022; Singapore. [CrossRef]
- Chen Y, Wang J. TSEUnet: A 3D neural network with fused Transformer and SE-Attention for brain tumor segmentation. 2022. Presented at: 2022 IEEE 35th International Symposium on Computer-Based Medical Systems (CBMS); July 21-23, 2022; Shenzhen, China. [CrossRef]
- Wang E, Hu Y, Yang X, Tian X. TransUNet with attention mechanism for brain tumor segmentation on MR images. 2022. Presented at: 2022 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA); June 24-26, 2022; Dalian, China. [CrossRef]
- Hatamizadeh A, Tang Y, Nath V, Yang D, Myronenko A, Landman B. UNETR: transformers for 3D medical image segmentation. 2022. Presented at: 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); January 3-8, 2022; Waikoloa, HI, United States. [CrossRef]
- Nijiati M, Tuersun A, Zhang Y, Yuan Q, Gong P, Abulizi A, et al. A symmetric prior knowledge based deep learning model for intracerebral hemorrhage lesion segmentation. Front Physiol. 2022;13:977427. [FREE Full text] [CrossRef] [Medline]
- Laiton-Bonadiez C, Sanchez-Torres G, Branch-Bedoya J. Deep 3D neural network for brain structures segmentation using self-attention modules in MRI images. Sensors (Basel). 2022;22(7):2559. [FREE Full text] [CrossRef] [Medline]
- Li Y, Cai W, Gao Y, Li C, Hu X. More than encoder: introducing transformer decoder to upsample. 2022. Presented at: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); December 6-8, 2022; Las Vegas, NV, United States. [CrossRef]
- Rasoulian A, Salari S, Xiao Y. Weakly supervised intracranial hemorrhage segmentation using hierarchical combination of attention maps from a Swin transformer. 2022. Presented at: International Workshop on Machine Learning in Clinical Neuroimaging; September 18, 2022:63-72; Singapore. [CrossRef]
- Ayivi W, Zeng L, Yussif SB, Browne JA, Agbesi VK, Sam F, et al. Segmentation of glioblastoma multiforme via attention neural network. 2022. Presented at: 33rd Irish Signals and Systems Conference (ISSC); June 9-10, 2022; Cork, Ireland. [CrossRef]
- Ou C, Qian Y, Chong W, Hou X, Zhang M, Zhang X, et al. A deep learning-based automatic system for intracranial aneurysms diagnosis on three-dimensional digital subtraction angiographic images. Med Phys. 2022;49(11):7038-7053. [CrossRef] [Medline]
- Jia Q, Shu H. BiTr-Unet: a CNN-transformer combined network for MRI brain tumor segmentation. 2021. Presented at: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries; September 27, 2021; Virtual Event. URL: https://europepmc.org/abstract/MED/36005929
- Yang Y, Wei S, Zhang D, Yan Q, Zhao S, Han J. Hierarchical and global modality interaction for brain tumor segmentation. 2022. Presented at: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries; September 27, 2021:441-450; Quebec City, Canada. [CrossRef]
- Luo C, Zhang J, Chen X, Tang Y, Weng X, Xu F. UCATR: based on CNN and transformer encoding and cross-attention decoding for lesion segmentation of acute ischemic stroke in non-contrast computed tomography images. 2021. Presented at: 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC); November 1-5, 2021; Mexico. [CrossRef]
- Sagar A. ViTBIS: vision transformer for biomedical image segmentation. 2021. Presented at: MICCAI Workshop on Distributed and Collaborative Learning; October 1, 2021:34-45; Strasbourg, France. [CrossRef]
- Sun Q, Fang N, Liu Z, Zhao L, Wen Y, Lin H. HybridCTrm: bridging CNN and transformer for multimodal brain image segmentation. J Healthc Eng. 2021;2021:7467261. [FREE Full text] [CrossRef] [Medline]
- Fidon L, Shit S, Ezhov I, Paetzold JC, Ourselin S, Vercauteren T. Generalized wasserstein dice loss, test-time augmentation, and transformers for the BraTS 2021 challenge. 2022. Presented at: International MICCAI Brainlesion Workshop; September 27, 2021:187-196; Singapore. [CrossRef]
- Sagar A. EMSViT: efficient multi scale vision transformer for biomedical image segmentation. 2021. Presented at: International MICCAI Brainlesion Workshop; September 27, 2021:39-51; Quebec City, Canada.
- Yang H, Shen Z, Li Z, Liu J, Xiao J. Combining global information with topological prior for brain tumor segmentation. 2021. Presented at: International MICCAI Brainlesion Workshop; September 27, 2021; Quebec City, Canada. [CrossRef]
- Li J, Chen Y, Cai L, Davidson I, Ji S. Dense transformer networks for brain electron microscopy image segmentation. 2019. Presented at: Proceedings of the 28th International Joint Conference on Artificial Intelligence; August 10-16, 2019; Macao, China. [CrossRef]
- Menze B, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Trans Med Imaging. 2015;34(10):1993-2024. [FREE Full text] [CrossRef] [Medline]
- Wang P, Yang Q, He Z, Yuan Y. Vision transformers in multi-modal brain tumor MRI segmentation: a review. Meta-Radiology. 2023;1(1):100004. [CrossRef]
- Khan RF, Lee B, Lee MS. Transformers in medical image segmentation: a narrative review. Quant Imaging Med Surg. 2023;13(12):8747-8767. [FREE Full text] [CrossRef] [Medline]
Abbreviations
AI: artificial intelligence
BraTS dataset: brain tumor segmentation dataset
CNN: convolutional neural network
CT: computed tomography
DL: deep learning
MRI: magnetic resonance imaging
PET: positron emission tomography
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews
ViT: vision transformer
Edited by A Coristine; submitted 25.02.24; peer-reviewed by J Mistry, Y Shen, S Mary; comments to author 02.07.24; revised version received 23.08.24; accepted 19.11.24; published 29.01.25.
Copyright © Maya Iratni, Amira Abdullah, Mariam Aldhaheri, Omar Elharrouss, Alaa Abd-alrazaq, Zahiriddin Rustamov, Nazar Zaki, Rafat Damseh. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 29.01.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.