Abstract
The hippocampus is a part of the limbic system of the human brain that plays an important role in forming memories and in other intellectual abilities. In most neurological disorders related to dementia, such as Alzheimer’s disease, the hippocampus is one of the earliest affected regions. Because there are no effective drugs for dementia, an ambient assisted living approach may help to prevent or slow its progression. By segmenting the hippocampus and analyzing its size and shape, it may be possible to classify the early stages of dementia. Because of its complex structure, traditional image segmentation techniques cannot segment the hippocampus accurately. Machine learning (ML) is a well-known tool in medical image processing that makes predictions by learning from its previous results. The Convolutional Neural Network (CNN) is one of the most popular ML algorithms. In this work, a U-Net convolutional network based approach is used for hippocampus segmentation from 2D brain images. The original U-Net architecture segments the hippocampus with an average performance rate of 93.6%, which outperforms all the other state-of-the-art methods discussed. Using a filter size of , the original U-Net architecture performs a sequence of convolutions. We modified the architecture to extract more relevant features by replacing all kernels with three alternative kernels of sizes , , and . The modified architecture achieved an average performance rate of 96.5%, which outperforms the original U-Net model.
Keywords: Magnetic Resonance Image, U-Net, Hippocampus, Deep Neural Network (DNN), Alzheimer’s Disease, Machine Learning
Introduction
The hippocampus, one of the most important parts of the human brain, is mainly responsible for memory-related tasks such as forming new memories and storing old ones [1]. Apart from these tasks, the hippocampus also helps the limbic system to regulate motivation, emotion, learning, etc. [1]. The hippocampus is a gray matter structure with a complex shape, located in the temporal lobe of the brain [2]. Based on location, it is divided into two parts: 1) the left hippocampus and 2) the right hippocampus [3]. The right hippocampus is mainly involved in memory for locations within an environment, whereas the left hippocampus is mainly responsible for context-dependent episodic or autobiographical memory [4]. In most neurological disorders related to dementia, such as Alzheimer’s disease (AD) and Mild Cognitive Impairment (MCI), the hippocampus is one of the earliest affected regions of the brain [5]. Significant volumetric atrophy of the hippocampus can be observed in MCI and AD, which leads to memory loss in patients [6, 7]. Therefore, by analyzing the shape and size of the hippocampus, the stage of dementia may be predicted.
Although there is no proper medical treatment for dementia, an ambient assisted living system can help to slow its progression. Alzheimer’s disease (AD) is one of the most serious forms of dementia and predominantly affects elderly people [8]. Before developing AD, a patient passes through an intermediate stage of dementia called Mild Cognitive Impairment (MCI) [9]. Therefore, if the MCI stage of a patient can be predicted and a proper ambient assisted living system (such as proper care and monitoring of behavioral and mood changes) is provided, the progression from MCI to AD may be prevented or delayed.
Magnetic Resonance Imaging (MRI) is one of the most popular tools used to detect various diseases, such as cancers and tumors [10]. In brain MR images, regions of interest (RoIs) such as the hippocampus and the amygdala can be easily visualized. Our data are MR images, and our RoI is the hippocampus. Proper segmentation of the hippocampus may play an important role in research on dementia-related neurological disorders and their stages [11, 12]. Because of its location, complex structure, and complex pixel information, hippocampus segmentation is a challenge for researchers [13–15]. Traditional segmentation methods, such as region growing, K-means, and histogram-based thresholding, cannot separate the hippocampus accurately [16]. Because of its ability to learn from the environment, machine learning is considered one of the most effective tools for medical image processing [17]. Hence, a machine learning technique can be used to separate the hippocampus from the rest of the brain.
In this paper, a U-Net architecture based method is used for hippocampus segmentation from brain MR images. All the data used in this study were acquired from the online ADNI data-set [18]. Input images are pre-processed, and skull stripping is performed to extract only the brain, which may help in training the architecture more accurately. The original U-Net uses a series of convolutional kernels for extracting important features. Different convolutional filter sizes can enhance a model’s learning of various critical features, and combining multiple features can result in better feature representations [19]. Hence, we propose a modified U-Net architecture for hippocampus segmentation in which all the convolutional filters are replaced with a series of three different filters of size , , and . Some related works are discussed, and their performance is compared with that of the proposed segmentation method. From the performance comparisons, it is observed that the proposed approach can segment the hippocampus from brain images more precisely than the original U-Net and the state-of-the-art methods discussed, which may aid neurologists in hippocampal research.
This article is organized as follows. "Introduction" introduces the hippocampus, machine learning tools, and the U-Net architecture. "Related Works" discusses related work on hippocampus segmentation by different researchers. "Material and Methods" presents a detailed discussion of the data and tools used in this article and briefly describes the proposed hippocampus segmentation approach. "Results" presents the experimental results, and "Discussion" discusses the proposed model. "Conclusion and Future Work" concludes the paper and outlines the scope for future work.
Related Works
Machine Learning
Machine Learning (ML) is widely used in situations where decision making is extremely difficult. ML has been applied successfully in various fields, such as knowledge engineering, digital image analysis, medical image analysis, and intrusion detection [20]. For example, Ranjit Panigrahi et al. proposed a novel Intrusion Detection System (IDS) using a decision tree-based machine learning approach and obtained an accuracy of nearly 99.96% [21]. Similarly, Ganjar Alfian et al. proposed a novel health tracking system for diabetic patients using Bluetooth Low Energy based sensors and real-time data processing [22]. In that framework, the authors adopted a Multi-Layer Perceptron (MLP) based ML model to perform accurate classification. Random Forest (RF) is another ML approach widely used in various fields, including medical diagnosis systems. One successful example of RF in medical diagnosis is the work on type-II diabetes and hypertension prediction by Ijaz et al. [23].
Artificial Neural Network
An Artificial Neural Network (ANN) is a machine learning model that builds an artificial network of neurons through an algorithm that allows the machine to learn by incorporating new data [24]. The neurons in the network, known as processing elements, are connected to each other through weights. To train the network, a learning algorithm is used to estimate the weights. After the training phase, unknown test signals can be classified. For classification, the most popular class of network is the multi-layer perceptron [25].
Deep Neural Network
A Deep Neural Network (DNN) is a type of ANN that consists of multiple hidden layers between the input and output layers, which allows a machine to be trained to predict the desired outputs [26]. A DNN can be an effective tool for segmentation, especially where the structure and the information in the images are complex, as in MR images [27]. One application of DNNs is the work of Srinivasu et al. [28], who used MobileNet V2 and LSTM to classify skin disorders. Their deep learning based approach achieves an accuracy of more than 85%.
The typical architecture of a DNN is shown in Fig. 1.
U-Net Convolutional Network
U-Net is a generic deep-learning solution for various tasks, such as segmentation of biomedical image data [29]. The U-Net architecture has two major steps: convolutions and transposed convolutions. A transposed convolution applies an up-sampling kernel in order to increase the spatial resolution. For example, a convolution layer can convert an image from to with a kernel filter. A transposed convolution layer can convert an image from to . U-Net is known for its end-to-end “U-shaped” neural network architecture and is one of the most commonly used and successful methods in medical image processing. In 2012, U-Net achieved championship status in neural-structure segmentation, as declared by the IEEE International Symposium on Biomedical Imaging (ISBI) [29]. In 2014, U-Net achieved the top rank for segmenting Glioblastoma-astrocytoma U373 cells, as declared by the ISBI [30]. In 2015, U-Net won the “Computer Automated Detection of Caries in Bitewing Radiography” challenge conducted by the ISBI [31]. Also in 2015, the ISBI declared that U-Net provided the best result in the “HeLa Cells Tracking Challenge” [32].
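To illustrate how convolutions and transposed convolutions change spatial resolution, the following minimal Python sketch applies the standard output-size formulas. The image sizes, kernel sizes, strides, and padding values used here are hypothetical and are not taken from this work.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def transposed_conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a transposed convolution: (n - 1) * s - 2p + k."""
    return (n - 1) * stride - 2 * padding + kernel


# Hypothetical example: an unpadded 3x3 convolution shrinks a 5x5 input,
# and a 3x3 transposed convolution grows a 3x3 input back to 5x5.
print(conv_output_size(5, kernel=3))
print(transposed_conv_output_size(3, kernel=3))
# A stride-2 transposed convolution with a 2x2 kernel doubles the resolution.
print(transposed_conv_output_size(64, kernel=2, stride=2))
```

With "same" padding, as many U-Net implementations use, the convolutions keep the spatial size and only the pooling and transposed-convolution layers change it.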
Among all the popular segmentation techniques, U-Net is the most commonly used approach in medical image processing [33]. In the U-Net architecture, down-sampling gradually extracts contextual information, whereas up-sampling restores detail by combining the information from each down-sampling layer with the input of the corresponding up-sampling layer. U-Net is widely used with many medical imaging modalities, such as MRI, CT, PET, and OCT. Some applications of U-Net in medical image segmentation across imaging modalities are presented in Table 1.
Table 1.
Authors and articles | Imaging modalities | Organs/Tissues | Data-set | Average performance |
---|---|---|---|---|
Hansch et al. [34] | CT Imaging | Parotid gland | Test data of the Medical Image Computing and Computer Assisted Intervention – MICCAI 2015 | 88% |
Huang et al. [35] | CT Imaging | Pulmonary nodules | 1245 CT images | 73.6% |
Zheng et al. [36] | MR Imaging | Cardiac regions | 3078 images | 93% |
Tao et al. [37] | MR Imaging | Left ventricle | 700 patients | 98% |
Wang et al. [38] | MR Imaging | Rectal tumors | 93 patients | 74% |
Pedoia et al. [39] | MR Imaging | Cartilage and meniscus | 1478 Images | 93% |
Norman et al. [40] | MR Imaging | Cartilage and meniscus | 638 Images | 80.25% |
Zeng and Zheng [41] | MR Imaging | Proximal femur | 20 Images | 98% |
Huang et al. [42] | MR Imaging | Liver vessel | Sliver07 and 3Dircadb data-set | 75% |
Kumar et al. [43] | Ultrasound (US) imaging | Breast masses | 433 US Images | 82% |
Devalla et al. [44] | Optical coherence tomography (OCT) | Optic Nerve Head (ONH) | 40 OCT Images | 91% |
Venhuizen et al. [45] | Optical coherence tomography (OCT) | Retina | 71 patients | 98% |
From Table 1, it can be observed that U-Net is widely used in medical image segmentation frameworks, with convincing performance.
Literature Survey on Hippocampus Segmentation
Researchers are trying to develop methods for accurate segmentation of the hippocampus. Some of the relevant research works are discussed below.
Somasundaram and Genish [46] proposed an atlas-based approach to segment the hippocampus from MRI. The authors collected data for 54 subjects from the University of Pittsburgh Alzheimer’s Disease Research Center. Non-brain tissues were removed from the input images using the Brain Extraction Toolbox. First, the location of the hippocampus is identified in the input MR images by the atlas-based approach, and the Region of Interest (ROI) is derived. Second, conservative smoothing and a top-hat filter are applied to determine the edges of the ROI, which is then binarized using the Ridler-Calvard method. Finally, the hippocampus is segmented by Connected Component Analysis (CCA). The authors report Dice, Precision, and Recall values of 0.82, 0.77, and 0.88, respectively, for the proposed approach.
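Because the Dice coefficient, precision, and recall recur throughout this survey, a minimal sketch of how these overlap metrics can be computed from two binary masks may be helpful. The function name and the toy masks below are hypothetical; the definitions are the standard ones based on true positives (TP), false positives (FP), and false negatives (FN).

```python
import numpy as np


def overlap_metrics(pred, truth):
    """Return (dice, precision, recall) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()    # labelled in both masks
    fp = np.logical_and(pred, ~truth).sum()   # labelled only in the prediction
    fn = np.logical_and(~pred, truth).sum()   # labelled only in the ground truth
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return float(dice), float(precision), float(recall)


# Toy masks (hypothetical data, for illustration only).
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(overlap_metrics(pred, truth))
```

Dice weighs false positives and false negatives equally, which is why it is commonly reported alongside precision and recall rather than instead of them.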
A hippocampus segmentation method using multi-channel large deformation diffeomorphic metric mapping (LDDMM) is presented by Tang et al. [47]. The experiment uses T1-weighted brain images of 23 subjects. A template and a target image are created with vector value of template image and the vector value of target image . The authors have constructed a diffeomorphic map from and . All the input images are considered as templates to construct the multichannel LDDMM map. The hippocampus in the template images was then identified and marked by neuroimaging experts. After constructing the LDDMM map, the authors applied the same transformation to the manually marked hippocampus of the template images, which finally produces the automatic segmentation of the target region. For cognitively normal subjects, the authors report an average Dice Similarity Coefficient of 0.76.
Hao et al. [48] proposed a local label learning (LLL) approach for multi-atlas based hippocampus segmentation. The authors acquired T1-weighted MR images of 87 subjects from ADNI. In the pre-processing steps, they performed GradWarp correction, B1 correction, N3 bias field correction, and geometrical scaling. With the help of the ITK toolbox, the hippocampus was delineated by two neuroimaging experts. Next, a multi-atlas based hippocampus segmentation mechanism is introduced, which works on a template image and a target image. An L1 SVM supervised learning method is adopted to learn the relationship between the Region of Interest (ROI) and the image appearance for each voxel. The K-Nearest Neighbors (KNN) algorithm and a Support Vector Machine (SVM) based classification approach are used to obtain a balanced training data-set. The authors report an average Dice index of 0.88 for the proposed method.
Zhu et al. [49] proposed a voting-based label fusion approach built on an integrated learning component. A total of 100 subjects and their corresponding MR images were acquired from ADNI. The final segmented hippocampus images were validated against the labelled hippocampus images from the European Alzheimer’s Disease Consortium ADNI (EADC-ADNI) data-set. Initially, with the help of the Montreal Neurological Institute (MNI) toolbox, the hippocampus was labelled manually in all input images by medical experts. The authors randomly chose 40 subjects for training and 60 for testing. The approach computes the voting weights by comparing the target image patch with each atlas image patch. The method first adopts a classification technique to learn the relationship between patches and segmentation labels, and then fuses the labels based on the weights obtained. The average Dice coefficient obtained by the proposed method is 0.85.
Zarpalas et al. [50] proposed a hippocampus segmentation approach based on Active Contour Models. All the relevant data were acquired from the Internet Brain Segmentation Repository (IBSR) data-set. First, the authors construct a static local weighting map on the sagittal slices of the hippocampal region from the Gradient Distribution on Boundary (GDB) of the hippocampus, and then use the map to balance the image information against the prior information. The main motivation is that the static map alone may sometimes be unable to handle the variability of hippocampal shapes. All the input images are distributed into two terms, one edge-based and the other region-based. The authors propose integrating an Adaptive Gradient Distribution on the Boundary (AGDB) map into the Active Contour Models framework. The authors report a Dice coefficient of nearly 0.84 for the algorithm.
Manojon and Coupe [51] proposed a hippocampus segmentation method based on a boosted ensemble of autocontext neural networks. The authors acquired T1- and T2-weighted MR images of 25 subjects from the online Neuro Imaging Tools and Resources Collaboratory (NITRC) data-set. Using the Montreal Neurological Institute (MNI) toolbox, intensity correction and standardization, bias correction, and linear registration were performed on all images. The manually drawn hippocampus images were also acquired from NITRC. The algorithm is trained on the labelled images using a random forest based approach and then marks the ROIs (i.e., the hippocampus) automatically in all input images. The method uses a feedforward multilayer perceptron with 2 hidden layers, and the authors used a boosting strategy to obtain accurate classification. The authors report that the proposed segmentation technique achieves a Dice coefficient of 0.86.
Coupe et al. [52] proposed a patch-based approach for hippocampus segmentation that uses manual segmentations by experts as priors. The authors acquired T1-weighted brain images of 80 subjects from the International Consortium for Brain Mapping (ICBM) data-set. In the pre-processing steps, they performed denoising, inhomogeneity correction, intensity normalization, linear registration, etc. For all input images, the non-brain regions were removed and the brain-only images were transformed into stereotaxic space by medical experts. The medical experts manually labelled the hippocampus in all input images. The labelled hippocampus images were used as masks to train the algorithm to segment the hippocampus from input images. The authors report an average kappa index of 0.88 for hippocampus segmentation with the proposed approach.
A novel atlas registration based hippocampus segmentation approach using intensity features is proposed in [53]. To design the classification model, neurological experts label the ROI in the brain images, and the labelled atlases are then registered to the target images. Next, the resulting transformations are applied to warp the labels into the coordinate frame of the target images. The proposed mechanism is mainly composed of an energy functional with two terms: the first is based on a voxel classifier that estimates the approximate intensity distribution of the hippocampus region, and the second models the spatial distribution of both the hippocampus and the background tissues. To train the model, the authors used a set of 20 labelled images. The accuracy of the proposed model is approximately 85%.
Goubran et al. proposed a 3D CNN based hippocampus segmentation method [54]. Motivated by hippocampal atrophy, the authors proposed a CNN model named HippMapp3r for segmentation of the hippocampus. The model is trained on a large dataset comprising a substantial number of thoroughly hand-labelled segmentations from different subject groups. Other types of brain damage are also represented in the database, such as GM atrophy, WM hyperintensities (WMH), and ventricular enlargement (VE). The authors’ 3D CNN is based on a U-Net model with residual units and a Dice coefficient based loss function, where each residual unit comprises a convolution layer with normalization and nonlinearities. To avoid overfitting, a dropout layer is also added to the network. The segmentation model is a combination of 2 networks: the first is trained on the entire brain images, and the second is trained on the output of the first. To generate the segmentation maps, a trainable de-convolution kernel is added in the up-sampling steps. The proposed method segments the hippocampus with an average Dice value of , and a Pearson’s correlation coefficient of 0.94.
In a similar work, a CNN based hippocampus segmentation method is proposed by Hänsch et al. [55]. MR and CT images of 45 patients were acquired for the study. The hippocampus region in all input images was contoured with the help of medical imaging experts, and the contours were used as binary masks. A hybrid encoder-decoder CNN is designed with 1 input channel (CT or MR) and 3 output channels (one for the background, one for the left hippocampus, and one for the right hippocampus). The original architecture of the network was a 2D model consisting of 100 layers at 4 resolution levels with dense convolutional units. The model is further upgraded to a 3D architecture by adding a separable 3D convolution between the dense unit and the max pooling. To improve the performance of the model, batch normalization and dropout are used, with softmax as the final layer. A hyperparameter search is used during training to resolve instability issues. The authors report that the model segments the hippocampus with a median Dice coefficient of 0.76.
A hippocampus segmentation approach based on deep convolutional neural network ensembles and transfer learning is discussed in [56]. The model is built from deep CNNs and incorporates a distinct segmentation step and an error correction step. It is an ensemble of 3 independent segmentation CNNs, operating on orthogonal slices of the 3D brain images, whose outputs are fused together. The 3D input images are broken down into sagittal, coronal, and axial slices. The inputs of the segmentation CNNs are 2D slices, so the CNNs produce 2D segmentations, which are then stacked along the third dimension to form a 3D segmentation mask. The final outcome is obtained by a voxel-wise fusion of all the 3D segmentation masks. The model architecture comprises 6 successive residual units, where each unit consists of 2 parallel branches (the first branch contains 2 convolution layers, and the second branch forwards the input to be added to the output of the first branch). The segmentation CNNs consist of fifteen spatial convolutional layers, along with filters and supplementary biases. The average Dice coefficient obtained by the model is around 0.89.
A level set based method for automatic segmentation of the hippocampus is proposed by Safavian et al. [57]. The authors used the SPM 12 toolbox for bias field correction, non-uniformity correction, and CSF (cerebrospinal fluid) map extraction. The Brain Extraction Tool (BET) is used to remove the skull, and all input images are then registered by cross correlation using the ANTs (Advanced Normalisation Tools) toolbox. All brain images are then transformed to MNI space, and the left and right hippocampus are labelled in accordance with the Harmonised Hippocampal Protocol (HarP). An inverse transformation is performed to register the hippocampus regions back to the target space. The CSF map of the target image is subtracted from the hippocampus region to correct errors. To determine the hippocampus borders accurately, a binary map is constructed that encloses the gradient information of the target images in the hippocampus regions. For the final segmentation, the authors introduced a novel level-set approach whose basic idea is to represent the surfaces as the ‘0’ level set of a dimensional hyper-surface. The average Dice coefficient achieved by the model is approximately 0.847.
A fully automatic hippocampus segmentation method based on simultaneous region deformation is proposed in [58]. The method alternately deforms 2 objects, namely the hippocampus and the amygdala, starting from 2 initial objects, using homotopic region deformation modelled in a Bayesian framework. The deformation is achieved by iterative energy minimization, where the energy is a function of five terms (global and local information terms, regularization, and volume and surface terms). The initial objects are derived from probabilistic atlases, and the energy function is then iteratively minimized for the hippocampus and amygdala using additional constraints determined from anatomical and probabilistic priors. With the help of a neurological expert, the hippocampus in the image data-set of 16 subjects is manually segmented. The atlases are transformed to MNI space using the SPM5 software. The average performance obtained by the proposed method is 86%.
Material and Methods
Data and Tools
All the data used in our research were acquired from the publicly available online data-set of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [18]. The images used in this study are original volumetric T1-weighted Magnetization Prepared Rapid Gradient Echo (MPRAGE) MRI scans. A total of 2008 MR images of 210 different subjects (male: 105, female: 105) were acquired.
Python is a well-known tool in the field of medical image processing [59]. It is easy to use and takes little time to execute [60]. The pre-processing operations and the proposed segmentation are implemented in Python using OpenCV. The hippocampal ground-truth images are extracted using the 3D Slicer tool. The output images are then compared with their corresponding ground-truth images using the Matlab toolbox.
Preprocessing
For 3D image processing, some post-processing steps are required for each slice of the image to obtain the desired result, which may increase the execution time [61]. Hence, researchers often prefer 2D over 3D image processing. In our work, each 3D MRI is converted to a 2D image by taking the slice that gives the best visibility of the hippocampus. Finally, all the MR images are resized to a size of .
While MR images are captured, the skull is also captured by the scanner. For segmenting any part of the brain, skull stripping is important for obtaining better results [62]. We implemented some commonly used traditional image segmentation algorithms for skull removal, namely Region Growing, Region Splitting and Merging, K-Means Clustering, Histogram Based Thresholding, and Fuzzy C-Means, and compared them using some popular performance analysis metrics. The analysis shows that, among these approaches, Histogram Based Thresholding gives the most accurate result [63]. Hence, we used the Histogram Based Thresholding algorithm for skull stripping of the MR images. A visual result of the skull stripping is shown in Figs. 2 and 3.
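As a rough illustration of histogram-based thresholding, the sketch below derives a threshold from the image histogram and uses it to build a binary mask. The source does not state how the threshold is chosen, so Otsu's between-class variance criterion is assumed here; the function names and the synthetic image are hypothetical, and a complete skull-stripping pipeline would need further steps (for example, keeping only the largest connected component).

```python
import numpy as np


def otsu_threshold(image, bins=256):
    """Pick the grey level that maximises between-class variance (Otsu)."""
    hist, edges = np.histogram(image, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weight_low = np.cumsum(hist)                 # pixels up to each bin
    weight_high = weight_low[-1] - weight_low    # pixels above each bin
    cum_sum = np.cumsum(hist * centers)
    mean_low = cum_sum / np.maximum(weight_low, 1)
    mean_high = (cum_sum[-1] - cum_sum) / np.maximum(weight_high, 1)
    between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
    # Use the right edge of the best bin so that bin falls in the low class.
    return edges[1:][np.argmax(between_var)]


def threshold_mask(image):
    """Binary mask of pixels brighter than the histogram-derived threshold."""
    return image > otsu_threshold(image)


# Synthetic stand-in for an MR slice: dark background with a bright square.
rng = np.random.default_rng(0)
image = rng.normal(20, 5, size=(64, 64))
image[16:48, 16:48] = rng.normal(120, 10, size=(32, 32))
mask = threshold_mask(image)
print(mask[32, 32], mask[0, 0])
```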
Proposed Segmentation Method
After removing the skull from the brain MRI, the next step is segmentation. The architecture of the original U-Net convolutional network is shown in Fig. 4.
The original U-Net architecture consists of a contracting path (left side) and an expansive path (right side). The contracting path is a convolutional network that consists of repeated convolution layers, each followed by a rectified linear unit (ReLU) activation and a max pooling operation. The expansive path is a deconvolution network that consists of transpose convolution, concatenation, and convolution layers. In the contracting path, the feature information increases while the spatial information decreases; the expansive path combines feature and spatial information through up-convolutions and concatenation with features from the contracting path. A model’s feature learning from input images may be improved by combining different convolutional filters [19]. Hence, we propose a modified U-Net architecture in which all convolutional filters are replaced by a series of , , and as shown in Fig. 5. Since our region of interest (the hippocampus) is a small part of the brain, we used a series of three small filters (, , and ) instead of introducing large filter sizes.
Contracting Path (Down Sampling)
We have used input images of size . In the first convolution layers, a set of convolutional filters is used, each with a size of , , and . There are down-sampling convolution blocks, and in every subsequent down-sampling block we double the number of feature channels. The output goes to a dropout layer with a dropout rate of 0.1 to prevent the model from overfitting. After dropout, the result is convolved again with the same number and size of filters. The output then goes to the max-pooling layer, which reduces the height and width while keeping the number of channels constant. Each block consists of two convolution layers, one dropout layer, and one max-pooling layer. The size of the image decreases while the depth increases, i.e. from to .
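The way the feature-map shape evolves along the contracting path can be traced with a few lines of Python. This is only a sketch under stated assumptions: the starting size (128) and channel count (16) are hypothetical, four down-sampling blocks are assumed to mirror the four blocks of the expansive path, and "same" padding with 2×2 max pooling is assumed so that only pooling changes the spatial size.

```python
def contracting_path_shapes(size, channels, blocks=4):
    """List the (height, width, channels) of the feature map at each stage.

    The convolutions in a block keep the spatial size (assuming 'same'
    padding); 2x2 max pooling then halves height and width, and the next
    block doubles the number of feature channels.
    """
    shapes = [(size, size, channels)]
    for _ in range(blocks):
        size //= 2        # 2x2 max pooling halves the height and width
        channels *= 2     # the next block doubles the feature channels
        shapes.append((size, size, channels))
    return shapes


# Hypothetical values: 128x128 feature maps with 16 channels in the first block.
for shape in contracting_path_shapes(128, 16):
    print(shape)
```

Reading the same list in reverse gives the shapes along the expansive path, where each stride-2 transpose convolution doubles the spatial size and the channel count is halved.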
Expansive Path (Up-Convolution)
In up-sampling, transpose convolution and concatenation operations are applied along with regular convolution. There are four blocks in the expansive path; each block consists of one transpose convolution, one concatenation layer, one dropout layer, and two convolution layers. Transpose convolution is a technique with learnable parameters that converts low-resolution information into high-resolution information. In the transpose convolution layer, we used a set of convolutional filters, each of size , , and , with a stride of 2, on the output of the last down-sampling convolution layer, and then concatenated the output of the transpose layer with the features from the down-sampling path. After every concatenation, a regular convolution layer with a size of , , and is applied to produce a more precise output based on this information. The size of the image increases and the depth decreases, i.e. from to .
More Layers and Parameters in the Modified U-Net Architecture
For hippocampus segmentation, the total number of deep layers (layers that have parameters) in the original U-Net is 23, while the number of deep layers in the modified U-Net increases to 67. A detailed comparison between the original and modified U-Net is presented in Table 2.
Table 2.
Models | No. of deep layers | Conv filters used | No. of parameters | Trainable parameters | Non-trainable parameters |
---|---|---|---|---|---|
Original U-Net | 23 | | 1,179,121 | 1,177,649 | 1,472 |
Modified U-Net | 67 | | 13,672,017 | 13,667,601 | 4,416 |
From Table 2, it can be observed that the original architecture has a total of 1,179,121 parameters, of which 1,177,649 are trainable. Similarly, the modified architecture has more parameters in total (13,672,017) as well as more trainable parameters (13,667,601). All the layers used in the modified architecture are described below.
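To give a sense of where such parameter counts come from, the following sketch applies the standard formulas for a 2D convolution layer and a batch-normalization layer. It assumes a Keras-style convention in which the moving mean and variance of batch normalization are reported as non-trainable; the function names and the example layer sizes are hypothetical and do not reproduce the exact totals of Table 2.

```python
def conv2d_params(kernel, in_channels, out_channels, bias=True):
    """Trainable parameters of a 2D convolution layer with a square kernel."""
    return (kernel * kernel * in_channels + (1 if bias else 0)) * out_channels


def batchnorm_params(channels):
    """(trainable, non_trainable) parameters of a batch-normalization layer.

    Scale and shift are learned; the moving mean and variance are running
    statistics, counted as non-trainable under the assumed convention.
    """
    return 2 * channels, 2 * channels


# Hypothetical layer: kernels of three sizes mapping 16 to 32 channels.
for k in (1, 3, 5):
    print(k, conv2d_params(k, in_channels=16, out_channels=32))
print(batchnorm_params(32))
```

Larger kernels add parameters quadratically in the kernel size, which is consistent with the modified architecture having many more parameters than the original.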
Input Layer
This layer simply reads the input images.
Convolutional Layer
This layer passes a window, also known as the convolution kernel, over the input images. The kernel scans the entire image and multiplies each pixel value by the weight at the corresponding position in the kernel. For a particular window, the products are added together to give a single number. Finally, a matrix containing the numbers from all window locations is obtained [64]. The mathematical representation of the convolution operation is shown in Eq. 1.
1 |
Where, G is the convoluted image of size , H is the input image of size , and F is the kernel of size .
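The sliding-window operation described above can be sketched in plain Python. The function name and the small example image are illustrative. As in most deep learning frameworks, the kernel is not flipped, so strictly speaking this is a cross-correlation; no padding is applied, so the output is smaller than the input.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply each pixel in the window by the matching kernel weight
            # and add the products to get a single number.
            total = 0
            for u in range(kh):
                for v in range(kw):
                    total += image[i + u][j + v] * kernel[u][v]
            row.append(total)
        output.append(row)
    return output


image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_valid(image, kernel)
print(result)
```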
Batch Normalization Layer
Batch normalization (BatchNor) is an operation used in artificial neural networks (ANN) in which the outputs of each layer are rescaled to a standard distribution. Because BatchNor prevents very high or very low activation values, it lets the algorithm learn faster and allows each layer to learn more independently of the others [65]. Moreover, it can improve accuracy and reduce the amount of dropout that is needed. The BatchNor operation starts by computing the mean and standard deviation of the mini-batch. The batch mean is then subtracted from the output of the previous layer, and the result is divided by the batch standard deviation. A total of 17 BatchNor layers are used in the modified U-Net architecture. The mathematical representation of the BatchNor operation is presented in Eq. 2.
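$$\mathrm{BN}(x) = \gamma \odot \frac{x - \hat{\mu}_B}{\hat{\sigma}_B} + \beta \tag{2}$$
Where, $x \in B$ is the input to BatchNor (BN) and $B$ is the mini-batch. $\hat{\mu}_B$ represents the sample mean and $\hat{\sigma}_B$ represents the standard deviation of $B$. $\gamma$ and $\beta$ represent the element-wise scale parameter and shift parameter, which have the same shape as $x$. The sample mean and standard deviation are calculated as in Eqs. 3 and 4.
$$\hat{\mu}_B = \frac{1}{|B|} \sum_{x \in B} x \tag{3}$$
$$\hat{\sigma}_B^2 = \frac{1}{|B|} \sum_{x \in B} \left(x - \hat{\mu}_B\right)^2 + \epsilon \tag{4}$$
Where, $\epsilon$ is a very small positive constant that is added so that we never attempt to divide by zero.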
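A minimal sketch of the normalization in Eqs. 2 to 4 for a one-dimensional mini-batch is given below. The argument names `gamma`, `beta` and `eps` are assumptions of this example and stand for the scale parameter, the shift parameter and the small positive constant; in a real network the scale and shift are learned per channel.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a list of activations to zero mean and unit variance,
    then apply the scale `gamma` and the shift `beta`."""
    mean = sum(batch) / len(batch)
    # eps is added to the variance so that the division is always defined.
    var = sum((x - mean) ** 2 for x in batch) / len(batch) + eps
    std = math.sqrt(var)
    return [gamma * (x - mean) / std + beta for x in batch]


normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
print(normalized)
```

With the default scale and shift, the output has a mean of zero and a variance very close to one, regardless of the scale of the input values.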
Activation Layer
In this layer, the weighted sum of all inputs is calculated and a bias is added to it; the activation function then decides whether the neuron fires. The result is passed to the subsequent layers as input. Various activation functions exist. In this architecture, the Rectified Linear Unit (ReLU) is used. One of the major advantages of ReLU is that it maps all non-positive inputs to zero, so the corresponding neurons are not activated; hence not all neurons are active at the same time, which makes ReLU computationally efficient and fast [66]. A total of 17 activation layers are used in the proposed architecture. The ReLU activation function is defined in Eq. 5.
$$f(x) = \max(0, x) \tag{5}$$
Where, the function f(x) returns 0 for any non-positive input and returns x for any positive input.
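As a small illustration of the ReLU rule, the following sketch applies it to a few sample values (the values themselves are arbitrary):

```python
def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return max(0.0, x)


activations = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5, 3.0]]
print(activations)
```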
Pooling Layer
The key function of a pooling layer is to summarize the feature maps that are generated by convolving a filter over the images. Pooling gradually reduces the spatial size of the feature maps, which reduces the number of parameters and the computation time. Max pooling is one of the most widely used pooling methods; it keeps the largest value in each pooling window [67]. In the modified U-Net architecture, 4 max pooling layers are used. The mathematical expression of the max pooling operation is presented in Eq. 6.
6 |
Where, is the max pooling output, and m represents the kernel width.
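Max pooling can be sketched as follows, assuming non-overlapping square windows (window width equal to the stride), which is the common choice in U-Net style networks. The function name and the example feature map are illustrative.

```python
def max_pool2d(feature_map, m=2):
    """Non-overlapping m x m max pooling on a list-of-lists feature map."""
    out = []
    for i in range(0, len(feature_map) - m + 1, m):
        row = []
        for j in range(0, len(feature_map[0]) - m + 1, m):
            # Keep only the largest value inside the current window.
            window = [feature_map[i + u][j + v]
                      for u in range(m) for v in range(m)]
            row.append(max(window))
        out.append(row)
    return out


fm = [[1, 3, 2, 0],
      [4, 2, 1, 5],
      [0, 1, 7, 2],
      [3, 6, 4, 8]]
pooled = max_pool2d(fm)
print(pooled)
```

A 4 x 4 map becomes a 2 x 2 map, so height and width are halved while the number of channels is unaffected.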
Dropout Layer
While training the model, the weights are updated in a way that may make the model fully dependent on the dataset being used, so that it may not give a convincing result when predicting or classifying unseen objects. This issue is called the over-fitting problem. To overcome it, the concept of dropout is introduced, in which some neurons are temporarily removed from the model with a given probability during training. Dropout encourages the model to learn relevant features that are useful in combination with many different random subsets of neurons [68]. To estimate the best model, the loss function should be minimized; here it is the least-squares loss shown in Eqs. 7 and 8.
7 |
8 |
Where, Eq. 7 presents the loss of a regular network and Eq. 8 presents the loss of a dropout network. Here, represents the dropout rate, which depends on the probability value p. Gradient descent is used in back-propagation while training the network; the gradient of the dropout network is shown in Eq. 9.
9 |
Similarly, the gradient of the regular network is shown in Eq. 10.
10 |
Now from Eqs. 9 and 10, the expected gradient of the dropout network can be determined as shown in Eq. 11.
11 |
From Eqs. 7 to 11, it can be observed that minimizing the loss of the dropout network amounts to minimizing the loss of a regular network, which is presented in Eq. 12.
12 |
Differentiating Eq. 12 gives the expected gradient of the dropout network.
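The random removal of neurons can be sketched as below. Note that this example uses inverted dropout, the variant applied by common frameworks such as Keras, in which the surviving activations are rescaled so that their expected value is unchanged; the equations above do not include this rescaling. The function name and the fixed random seed are assumptions of the example.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate` and
    scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)              # fixed seed so the sketch is repeatable
x = [1.0] * 10000
y = dropout(x, rate=0.1, rng=rng)   # the dropout rate of 0.1 used in this work
dropped = y.count(0.0) / len(y)     # fraction of removed neurons, close to 0.1
mean_out = sum(y) / len(y)          # stays close to the input mean of 1.0
print(dropped, mean_out)
```

At prediction time dropout is switched off and all neurons are used.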
Transposed Convolutional Layer
The transposed convolutional (or deconvolutional) layer performs the opposite operation to the convolutional layer. The main idea of this layer is to define a transformation that goes in the reverse direction of a normal convolution. In other words, convolution transforms the original image into a smaller matrix, and transposed convolution reverses this change of size and helps the model produce a final output of the same size as the input [69]. The basic mathematical expression of the de-convolution operation is shown in Eq. 13.
13 |
Where, I is the de-convoluted image of size , H is the input image of size , and F is the kernel of size .
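A minimal sketch of a transposed convolution with stride 2 and no padding is shown below. Each input pixel places a scaled copy of the kernel into the output, and overlapping contributions are summed. The function name and the tiny example are illustrative; in the network the kernel weights are learned.

```python
def conv2d_transpose(image, kernel, stride=2):
    """Transposed convolution without padding on lists of lists."""
    n_h, n_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (n_h - 1) * stride + k_h
    out_w = (n_w - 1) * stride + k_w
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(n_h):
        for j in range(n_w):
            # Each input pixel stamps a scaled copy of the kernel into the output.
            for u in range(k_h):
                for v in range(k_w):
                    out[i * stride + u][j * stride + v] += image[i][j] * kernel[u][v]
    return out


up = conv2d_transpose([[1, 2], [3, 4]], [[1, 1], [1, 1]])
for row in up:
    print(row)
```

With a 2 x 2 kernel and stride 2, a 2 x 2 input becomes a 4 x 4 output, so height and width are doubled.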
Concatenation Layer
A concatenation layer joins its inputs along a given axis, under the condition that all inputs have the same size in every dimension except the concatenation axis [70]. In this model the concatenation axis is the third dimension, i.e. the depth.
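Concatenation along the depth axis can be sketched as follows for feature maps stored as nested lists of shape height x width x channels. The function and variable names are illustrative.

```python
def concat_channels(a, b):
    """Concatenate two H x W x C feature maps along the channel (depth) axis."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("height and width must match")
    return [[pa + pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


skip = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # 2 x 2 x 2, from the contracting path
up = [[[9], [10]], [[11], [12]]]              # 2 x 2 x 1, from the transposed convolution
merged = concat_channels(skip, up)            # 2 x 2 x 3
print(merged)
```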
Training and Testing
A total of 508 MR images along with their corresponding hippocampal binary masks are used as training data, and the model is trained for 40 epochs with the Adam optimizer [71]. Adam uses exponentially weighted averages of the gradients to speed up gradient descent. Performance is monitored during training with the metric “accuracy” and the loss function “binary cross-entropy”. We use Keras callbacks to implement early stopping, i.e. training stops if the validation loss does not improve for 3 consecutive epochs. We use a batch size of 16.
While training the model, a soft-max based energy function is defined along with a cross-entropy based loss function. The soft-max is shown in Eq. 14.
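The early-stopping rule can be sketched without any framework. The function below is a simplified stand-in for the behaviour of a callback such as Keras `EarlyStopping(monitor="val_loss", patience=3)`; the function name and the loss values are hypothetical and serve only to illustrate the rule.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training stops, or None if training
    runs through all epochs."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0     # improvement: remember it and reset the counter
        else:
            wait += 1                # no improvement in this epoch
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation losses: the best value (0.40) is reached in epoch 3.
losses = [0.60, 0.45, 0.40, 0.41, 0.42, 0.40, 0.39]
stop = early_stopping_epoch(losses)
print(stop)
```

In this example training stops after epoch 6, because epochs 4 to 6 bring no improvement over epoch 3; the lower loss in epoch 7 is never reached. This is the usual trade-off of a small patience value.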
14 |
In Eq. 14, l is the feature channel where is the activation which is performed pixel wise. M denotes the class numbers, returns the maximum function ( is 1, if m returns the max activation , for any other value of m, is 0).
Cross entropy is then derived for penalizing at each pixel the deviation of from 1 by using the Eq. 15.
15 |
In Eq. 15, is the actual label of each pixel and e : Z is a weight map used to identify the pixels that contribute the most and to give those pixels more importance during training.
The output value of each pixel lies between 0 and 1, so we apply a threshold of 0.5 to classify each pixel as 0 or 1 and make the prediction. 0 represents the background and 1 represents the foreground, i.e. the hippocampus region. A sample training image and its corresponding binary mask are shown in Figs. 6 and 7.
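The binary cross-entropy loss and the 0.5 threshold can be sketched as below for a flattened list of pixels. The unweighted mean is used here, so the weight map is left out; the function names and the four sample pixels are assumptions of the example.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over pixels; predictions are clipped to
    (eps, 1 - eps) to keep the logarithm finite."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def to_mask(y_pred, threshold=0.5):
    """Label a pixel 1 (hippocampus) if its probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in y_pred]


truth = [0, 0, 1, 1]             # ground-truth mask, flattened
probs = [0.1, 0.4, 0.7, 0.9]     # hypothetical per-pixel network outputs
loss = binary_cross_entropy(truth, probs)
mask = to_mask(probs)
print(round(loss, 4), mask)
```

The loss is small when confident predictions agree with the mask and grows quickly for confident mistakes, while the thresholded mask is what is compared against the ground truth in the evaluation.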
Results
The model is trained using 40 epochs. The learning curve of the model is shown in Fig. 8.
Fig. 8 shows how the validation loss evolves during training and at which point the best model is selected for further processing. As the structure of brain images is very complex, it is difficult to determine the boundary of the hippocampus for proper segmentation. To get the boundary information more accurately, all the input images are converted to seismic images. To determine the hippocampus regions in the seismic images, the binary masks are used as the salt images. A sample seismic and salt image is shown in Fig. 9.
After getting the seismic and salt images, the model started predicting the expected output. The output prediction by the model is shown in Fig. 10 for a sample image.
After training the model with 508 input images, the model is tested on another 1500 brain images. Some sample ground-truth images are shown in Fig. 11, and the corresponding output images produced by the model are shown in Fig. 12. Some sample overlaps between ground-truth and predicted images are shown in Fig. 13.
From Figs. 11 to 13 it can be observed visually that the segmented hippocampus images closely match their corresponding ground-truth images.
Several performance measures, namely Accuracy, Sensitivity, Specificity, Precision, Dice-coefficient, and Jaccard Index, are determined for all the output images using Eqs. 16 to 21.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{16}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{17}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{18}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{19}$$
$$\text{Dice-coefficient} = \frac{2\,TP}{2\,TP + FP + FN} \tag{20}$$
$$\text{Jaccard Index} = \frac{TP}{TP + FP + FN} \tag{21}$$
Where, true positive (TP) is the number of pixels correctly segmented as hippocampus, false positive (FP) is the number of pixels mistakenly segmented as hippocampus, true negative (TN) is the number of pixels correctly ignored, and false negative (FN) is the number of pixels falsely ignored. The numbers of TP, FP, TN, and FN pixels are calculated by comparing the segmented images with their corresponding ground-truth images. Tables 3 and 4 show the average performance of the original and modified models, respectively.
Table 3.
Confusion matrix entry | Average no. of pixels | Accuracy | Sensitivity | Specificity | Precision | Dice coefficient (DIC) | Jaccard Index (JI) | Average performance |
---|---|---|---|---|---|---|---|---|
TP | 26500 | 0.95 | 0.93 | 0.96 | 0.92 | 0.95 | 0.90 | 0.936 |
FP | 1200 | | | | | | | |
TN | 36086 | | | | | | | |
FN | 1750 | | | | | | | |
Table 4.
Confusion matrix entry | Average no. of pixels | Accuracy | Sensitivity | Specificity | Precision | Dice coefficient (DIC) | Jaccard Index (JI) | Average performance |
---|---|---|---|---|---|---|---|---|
TP | 27210 | 0.97 | 0.98 | 0.97 | 0.96 | 0.97 | 0.94 | 0.965 |
FP | 1080 | | | | | | | |
TN | 36580 | | | | | | | |
FN | 666 | | | | | | | |
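The measures of Eqs. 16 to 21 can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions of these measures and the average pixel counts of the modified U-Net from Table 4; the function name is an assumption of the example.

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Pixel-wise metrics computed from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "jaccard": tp / (tp + fp + fn),
    }


# Average counts of the modified U-Net taken from Table 4.
scores = segmentation_metrics(tp=27210, fp=1080, tn=36580, fn=666)
average = sum(scores.values()) / len(scores)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
print(f"average: {average:.3f}")
```

Rounded to two decimals, the six values agree with the row reported in Table 4, and their mean agrees with the average performance of 0.965.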
Discussion
After analysing the performance, it can be observed that the proposed hippocampus segmentation approach can segment the hippocampus with an accuracy of 0.97, sensitivity of 0.98, specificity of 0.97, precision of 0.96, Dice-coefficient of 0.97, and Jaccard Index of 0.94. The average performance (the mean of accuracy, sensitivity, specificity, precision, Dice-coefficient, and Jaccard Index) is around 0.965. The performance comparison amongst all the discussed hippocampus segmentation approaches is summarized in Table 5.
Table 5.
Methods | Accuracy | Sensitivity | Specificity | Precision | Dice-coefficient | Jaccard Index |
---|---|---|---|---|---|---|
Somasundaram et al. [46] | - | 0.99 | 0.93 | 0.77 | 0.82 | - |
Tang et al. [47] | Average Kappa Overlap Ratio = 0.76 | |||||
Hao et al. [48] | - | - | - | 0.8 | 0.83 | 0.71 |
Zhu et al. [49] | - | - | - | 0.88 | 0.88 | 0.79 |
Zarpalas et al. [50] | - | - | - | - | 0.84 | - |
Manjón et al. [51] | - | - | - | - | 0.87 | - |
Coupé et al. [52] | Average Kappa Index Value = 0.92 | |||||
van der Lijn et al. [53] | 0.85 | - | - | - | - | - |
Goubran et al. [54] | - | - | - | - | 0.89 | 0.95 |
Hänsch et al. [55] | - | - | - | - | 0.76 | - |
Ataloglou et al. [56] | - | - | - | 0.89 | 0.89 | 0.81 |
Safavian et al. [57] | - | - | - | - | 0.9 | - |
Chupin et al. [58] | 0.73 | - | - | - | - | - |
Modified U-Net based approach | 0.97 | 0.98 | 0.97 | 0.96 | 0.97 | 0.94 |
From Table 5, it can be observed that the average performance of the proposed segmentation method is approximately 96.5%, which is the highest amongst all the discussed techniques. The modified U-Net based technique has the highest Accuracy (0.97) and the highest Specificity (0.97) in Table 5, and its Precision (0.96) and Dice-coefficient (0.97) are also the highest. It further achieves a Sensitivity of 0.98 and a Jaccard Index of 0.94, each second only to a single competing method (0.99 in [46] and 0.95 in [54], respectively). The average performance comparison graph of the different hippocampus segmentation approaches is shown in Fig. 14.
A short summary of different hippocampus segmentation techniques is presented in Table 6.
Table 6.
Authors and articles | Year of publication | Data-sets | Algorithms | Average performance |
---|---|---|---|---|
Somasundaram et al. [46] | 2015 | Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC) | Atlas based method | 88.16% |
Tang et al. [47] | 2012 | NA | Multi channel Large Deformation Diffeomorphic Metric Mapping (LDDMM) | 79% |
Hao et al. [48] | 2014 | ADNI | Local Label Learning (LLL) | 88% |
Zhu et al. [49] | 2016 | ADNI | Multi-atlas based | 85% |
Zarpalas et al. [50] | 2013 | Internet Brain Segmentation Repository (IBSR) | Gradient based reliability maps | 84% |
Manjón et al. [51] | 2017 | NITRC | Patch-Based Boosted Ensemble of Autocontext Neural Networks | 86.95% |
Coupé et al. [52] | 2011 | International Consortium for Brain Mapping (ICBM) | Patch-based | 88.4% |
van der Lijn et al. [53] | 2008 | NA | Atlas registration based | 85% |
Goubran et al. [54] | 2020 | Sunnybrook Dementia Study (SDS), ADNI, University of Pennsylvania | 3D CNN based | 91% |
Hänsch et al. [55] | 2020 | NA | CNN based | 76% |
Ataloglou et al. [56] | 2019 | MICCAI, ADNI | Deep Convolutional Neural Network Ensembles | 89% |
Safavian et al. [57] | 2019 | ADNI | Level set based method | 85% |
Chupin et al. [58] | 2009 | ADNI | Simultaneous region deformation approach | 86% |
Modified U-Net based approach | | ADNI | U-Net based | 96.5% |
From Table 5, it can be observed that, amongst all the discussed hippocampus segmentation approaches, the method proposed by Somasundaram and Genish [46] achieved the highest Sensitivity of 0.99. Similarly, the method proposed by Goubran et al. [54] achieved the highest Jaccard Index of 0.95. However, Fig. 14 and Table 4 show that, overall, the proposed modified U-Net based approach segments the hippocampus more accurately than the other discussed methods, with the highest average performance of 96.5%.
Conclusion and Future Work
As traditional image segmentation techniques fail to segment the hippocampus from brain MRI, we first tried a U-Net convolutional network for the segmentation. The original architecture uses a series of convolutional filters throughout the network. Some studies have found that different filter sizes can help a model capture multiple important features. Hence, to enhance the performance, we proposed a modified U-Net architecture in which all filters are replaced with three different filters of size , , and . The modified U-Net approach is discussed layer by layer. As the skull is not relevant to our work, we stripped it as a pre-processing step in order to get more accurate results. A total of 2008 images are obtained from the online data-set ADNI. To evaluate the outputs, the ground-truth images are obtained by manual segmentation using the 3D Slicer application. From the visual results as well as from the performance comparison table, it can be observed that the proposed technique gives convincing results for hippocampus segmentation. The proposed segmentation technique outperforms the original U-Net as well as all other discussed state-of-the-art methods, with an average performance of around 96.5%.
Though the proposed segmentation approach achieves convincing results, it has some limitations as well: i) the method can only segment 2D images; in future work, the model may be further improved to train on and segment the hippocampus from 3D brain images, ii) though the model’s performance improves after adding multiple convolutional filters, the proposed method has a greater number of parameters, making it more computationally expensive than the original model; computational time can be lowered in future studies by using advanced DNN techniques such as depthwise convolutions, iii) in the future, more brain images from various sources can be acquired to train and evaluate the model more precisely, iv) furthermore, because the hippocampus is affected in most neurological disorders, including Alzheimer’s Disease (AD), volumetric measurements of the hippocampus can be performed for different dementia stages, such as Cognitively Normal (CN), Mild Cognitive Impairment (MCI), and Alzheimer’s Disease (AD), in order to predict early dementia stages. Following the collection of volumetric measures, a detailed comparison between patients can be made, which may aid neurologists in diagnosing various neurological diseases.
Contributor Information
Ruhul Amin Hazarika, Email: rahazarika@gmail.com.
Arnab Kumar Maji, Email: arnab.maji@gmail.com.
Raplang Syiem, Email: syiemrap93@gmail.com.
Samarendra Nath Sur, Email: samar.sur@gmail.com.
Debdatta Kandar, Email: kdebdatta@gmail.com.
References
- 1. Anand KS, Dhikav V. Hippocampus in health and disease: An overview. Annals of Indian Academy of Neurology. 2012;15(4):239. doi: 10.4103/0972-2327.104323.
- 2. Ezzati A, Katz MJ, Zammit AR, Lipton ML, Zimmerman ME, Sliwinski MJ, Lipton RB. Differential association of left and right hippocampal volumes with verbal episodic and spatial memory in older adults. Neuropsychologia. 2016;93:380–385. doi: 10.1016/j.neuropsychologia.2016.08.016.
- 3. Sadeghi L, Rizvanov AA, Salafutdinov II, Dabirmanesh B, Sayyah M, Fathollahi Y, Khajeh K. Hippocampal asymmetry: Differences in the left and right hippocampus proteome in the rat model of temporal lobe epilepsy. Journal of proteomics. 2017;154:22–29. doi: 10.1016/j.jprot.2016.11.023.
- 4. Burgess N, Maguire EA, O’Keefe J. The human hippocampus and spatial and episodic memory. Neuron. 2002;35(4):625–641. doi: 10.1016/s0896-6273(02)00830-9.
- 5. A. Vijayakumar, A. Vijayakumar, Comparison of hippocampal volume in dementia subtypes, ISRN radiology 2013 (2012). 10.5402/2013/174524.
- 6. Jack CR, Petersen RC, O’Brien PC, Tangalos EG. MR-based hippocampal volumetry in the diagnosis of Alzheimer’s disease. Neurology. 1992;42(1):183–183. doi: 10.1212/wnl.42.1.183.
- 7. Colliot O, Chételat G, Chupin M, Desgranges B, Magnin B, Benali H, Dubois B, Garnero L, Eustache F, Lehéricy S. Discrimination between Alzheimer disease, mild cognitive impairment, and normal aging by using automated segmentation of the hippocampus. Radiology. 2008;248(1):194–201. doi: 10.1148/radiol.2481070876.
- 8. Hazarika RA, Maji AK, Sur SN, Paul BS, Kandar D. A survey on classification algorithms of brain images in Alzheimer’s disease based on feature extraction techniques. IEEE Access. 2021;9:58503–58536. doi: 10.1109/ACCESS.2021.3072559.
- 9. R. A. Hazarika, A. K. Maji, D. Kandar, P. Chakrabarti, T. Chakrabarti, K. J. Rao, J. Carvalho, B. Kateb, M. Nami, An evaluation on changes in hippocampus size for cognitively normal (CN), mild cognitive impairment (MCI), and Alzheimer’s disease (AD) patients using fuzzy membership function (2021). 10.31219/osf.io/wujfn.
- 10. Ijaz MF, Attique M, Son Y. Data-driven cervical cancer prediction model with outlier detection and over-sampling methods. Sensors. 2020;20(10):2809. doi: 10.3390/s20102809.
- 11. Halliday G. Pathology and hippocampal atrophy in Alzheimer’s disease. The Lancet Neurology. 2017;16(11):862–864. doi: 10.1016/S1474-4422(17)30343-5.
- 12. Y. Chen, B. Shi, Z. Wang, P. Zhang, C. D. Smith, J. Liu, Hippocampus segmentation through multi-view ensemble convnets, in: 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), IEEE, 2017, pp. 192–196. 10.1109/ISBI.2017.7950499.
- 13. Shi Y, Cheng K, Liu Z. Hippocampal subfields segmentation in brain mr images using generative adversarial networks. Biomedical engineering online. 2019;18(1):1–12. doi: 10.1186/s12938-019-0623-8.
- 14. Dill V, Klein PC, Franco AR, Pinho MS. Atlas selection for hippocampus segmentation: Relevance evaluation of three meta-information parameters. Computers in biology and medicine. 2018;95:90–98. doi: 10.1016/j.compbiomed.2018.02.005.
- 15. F. Bartel, H. Vrenken, M. van Herk, M. de Ruiter, J. Belderbos, J. Hulshof, J. C. de Munck, Fast segmentation through surface fairing (FASTSURF): A novel semi-automatic hippocampus segmentation method, PloS one 14 (1) (2019). 10.1371/journal.pone.0210641.
- 16. Pang S, Jiang J, Lu Z, Li X, Yang W, Huang M, Zhang Y, Feng Y, Huang W, Feng Q. Hippocampus segmentation based on local linear mapping. Scientific reports. 2017;7(1):1–11. doi: 10.1038/srep45501.
- 17. H. Seo, M. B. Khuzani, V. Vasudevan, C. Huang, H. Ren, R. Xiao, X. Jia, L. Xing, Machine learning techniques for biomedical image segmentation: An overview of technical aspects and introduction to state-of-art applications, arXiv preprint arXiv:1911.02521 (2019). 10.1002/mp.13649.
- 18. Alzheimer’s disease neuroimaging initiative, [Last accessed on 27/02/2020]. http://adni.loni.usc.edu/data-samples/access-data/
- 19. B. Murugesan, V. Ravichandran, K. Ram, S. Preejith, J. Joseph, S. M. Shankaranarayana, M. Sivaprakasam, Ecgnet: Deep network for arrhythmia classification, in: 2018 IEEE International Symposium on Medical Measurements and Applications (MeMeA), IEEE, 2018, pp. 1–6.
- 20. Panigrahi R, Borah S, Bhoi AK, Ijaz MF, Pramanik M, Jhaveri RH, Chowdhary CL. Performance assessment of supervised classifiers for designing intrusion detection systems: A comprehensive review and recommendations for future research. Mathematics. 2021;9(6):690. doi: 10.3390/math9060690.
- 21. Panigrahi R, Borah S, Bhoi AK, Ijaz MF, Pramanik M, Kumar Y, Jhaveri RH. A consolidated decision tree-based intrusion detection system for binary and multiclass imbalanced datasets. Mathematics. 2021;9(7):751. doi: 10.3390/math9070751.
- 22. Alfian G, Syafrudin M, Ijaz MF, Syaekhoni MA, Fitriyani NL, Rhee J. A personalized healthcare monitoring system for diabetic patients by utilizing BLE-based sensors and real-time data processing. Sensors. 2018;18(7):2183. doi: 10.3390/s18072183.
- 23. Ijaz MF, Alfian G, Syafrudin M, Rhee J. Hybrid prediction model for type 2 diabetes and hypertension using DBSCAN-based outlier detection, synthetic minority over sampling technique (SMOTE), and random forest. Applied Sciences. 2018;8(8):1325. doi: 10.3390/app8081325.
- 24. J. F. Pagel, P. Kirshtein, Machine dreaming and consciousness, Academic Press, 2017.
- 25. Udpa S, Udpa L. NDT techniques: Signal and image processing. 2001 doi: 10.1016/B978-0-12-803581-8.03476-7.
- 26. V. V. Raghavan, V. N. Gudivada, V. Govindaraju, C. R. Rao, Cognitive computing: Theory and applications, Elsevier, 2016.
- 27. Lundervold AS, Lundervold A. An overview of deep learning in medical imaging focusing on MRI. Zeitschrift für Medizinische Physik. 2019;29(2):102–127. doi: 10.1016/j.zemedi.2018.11.002.
- 28. Srinivasu PN, SivaSai JG, Ijaz MF, Bhoi AK, Kim W, Kang JJ. Classification of skin disease using deep learning neural networks with mobilenet v2 and LSTM. Sensors. 2021;21(8):2852. doi: 10.3390/s21082852.
- 29. O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical image computing and computer-assisted intervention, Springer, 2015, pp. 234–241. 10.1007/978-3-319-24574-4_28.
- 30. Isbi 2014 challenge. https://cs.adelaide.edu.au/~carneiro/isbi14 challenge/. Accessed 03 January 2022
- 31. Isbi 2015 challenge. www-o.ntust.edu.tw/~cweiwang/ISBI2015/challenge2/index.html. Accessed 03 January 2022
- 32. Isbi-2015-challenge. https://cs.adelaide.edu.au/~zhi/isbi15 challenge/index.html. Accessed 03 January 2022
- 33. Du G, Cao X, Liang J, Chen X, Zhan Y. Medical image segmentation based on u-net: A review. Journal of Imaging Science and Technology. 2020;64(2):20508–1. doi: 10.2352/J.ImagingSci.Technol.2020.64.2.020508.
- 34. A. Hänsch, M. Schwier, T. Gass, T. Morgas, B. Haas, V. Dicken, H. Meine, J. Klein, H. K. Hahn, Evaluation of deep learning methods for parotid gland segmentation from CT images, Journal of Medical Imaging 6 (1) (2018) 011005. 10.1117/1.JMI.6.1.011005.
- 35. Huang L, Xia W, Zhang B, Qiu B, Gao X. MSFCN-multiple supervised fully convolutional networks for the osteosarcoma segmentation of CT images. Computer methods and programs in biomedicine. 2017;143:67–74. doi: 10.1016/j.cmpb.2017.02.013.
- 36. Zheng Q, Delingette H, Duchateau N, Ayache N. 3-d consistent and robust segmentation of cardiac images by deep learning with spatial propagation. IEEE transactions on medical imaging. 2018;37(9):2137–2148. doi: 10.1109/TMI.2018.2820742.
- 37. Tao Q, Yan W, Wang Y, Paiman EH, Shamonin DP, Garg P, Plein S, Huang L, Xia L, Sramko M, et al. Deep learning-based method for fully automatic quantification of left ventricle function from cine MR images: a multivendor, multicenter study. Radiology. 2019;290(1):81–88. doi: 10.1148/radiol.2018180513.
- 38. Wang J, Lu J, Qin G, Shen L, Sun Y, Ying H, Zhang Z, Hu W. A deep learning-based autosegmentation of rectal tumors in MR images. Medical physics. 2018;45(6):2560–2564. doi: 10.1002/mp.12918.
- 39. Pedoia V, Norman B, Mehany SN, Bucknor MD, Link TM, Majumdar S. 3d convolutional neural networks for detection and severity staging of meniscus and PFJ cartilage morphological degenerative changes in osteoarthritis and anterior cruciate ligament subjects. Journal of Magnetic Resonance Imaging. 2019;49(2):400–410. doi: 10.1002/jmri.26246.
- 40. Norman B, Pedoia V, Majumdar S. Use of 2D u-net convolutional neural networks for automated cartilage and meniscus segmentation of knee MR imaging data to determine relaxometry and morphometry. Radiology. 2018;288(1):177–185. doi: 10.1148/radiol.2018172322.
- 41. G. Zeng, G. Zheng, Deep learning-based automatic segmentation of the proximal femur from MR images, in: Intelligent Orthopaedics, Springer, 2018, pp. 73–79. 10.1007/978-981-13-1396-7_6.
- 42. Huang Q, Sun J, Ding H, Wang X, Wang G. Robust liver vessel extraction using 3d u-net with variant dice loss function. Computers in biology and medicine. 2018;101:153–162. doi: 10.1016/j.compbiomed.2018.08.018.
- 43. V. Kumar, J. M. Webb, A. Gregory, M. Denis, D. D. Meixner, M. Bayat, D. H. Whaley, M. Fatemi, A. Alizad, Automated and real-time segmentation of suspicious breast masses using convolutional neural network, PloS one 13 (5) (2018) e0195816. 10.1371/journal.pone.0195816.
- 44. Devalla SK, Renukanand PK, Sreedhar BK, Subramanian G, Zhang L, Perera S, Mari J-M, Chin KS, Tun TA, Strouthidis NG, et al. Drunet: a dilated-residual u-net deep learning network to segment optic nerve head tissues in optical coherence tomography images. Biomedical optics express. 2018;9(7):3244–3265. doi: 10.1364/BOE.9.003244.
- 45. Venhuizen FG, van Ginneken B, Liefers B, van Grinsven MJ, Fauser S, Hoyng C, Theelen T, Sánchez CI. Robust total retina thickness segmentation in optical coherence tomography images using convolutional neural networks. Biomedical optics express. 2017;8(7):3292–3316. doi: 10.1364/BOE.8.003292.
- 46. K. Somasundaram, T. Genish, An atlas based approach to segment the hippocampus from MRI of human head scans for the diagnosis of Alzheimers disease, International Journal of Computational Intelligence and Informatics 5 (1) (2015). 10.1016/j.zemedi.2018.11.002.
- 47. X. Tang, S. Mori, T. Ratnanather, M. I. Miller, Segmentation of hippocampus and amygdala using multi-channel landmark large deformation diffeomorphic metric mapping, in: 2012 38th Annual Northeast Bioengineering Conference (NEBEC), IEEE, 2012, pp. 414–415.
- 48. Hao Y, Wang T, Zhang X, Duan Y, Yu C, Jiang T, Fan Y, Initiative ADN. Local label learning (lll) for subcortical structure segmentation: application to hippocampus segmentation. Human brain mapping. 2014;35(6):2674–2697. doi: 10.1109/NEBC.2012.6207140.
- 49.Zhu H, Cheng H, Yang X, Fan Y, Initiative ADN, et al. Metric learning for multi-atlas based segmentation of hippocampus. Neuroinformatics. 2017;15(1):41–50. doi: 10.1007/s12021-016-9312-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.D. Zarpalas, P. Gkontra, P. Daras, N. Maglaveras, Hippocampus segmentation through gradient based reliability maps for local blending of ACM energy terms, in: 2013 IEEE 10th International Symposium on Biomedical Imaging, IEEE, 2013, pp. 53–56. 10.1109/ISBI.2013.6556410.
- 51.J. V. Manjón, P. Coupé, Hippocampus subfield segmentation using a patch-based boosted ensemble of autocontext neural networks, in: International Workshop on Patch-based Techniques in Medical Imaging, Springer, 2017, pp. 29–36. 10.1007/978-3-319-67434-6_4.
- 52.Coupé P, Manjón JV, Fonov V, Pruessner J, Robles M, Collins DL. Patch-based segmentation using expert priors: Application to hippocampus and ventricle segmentation. NeuroImage. 2011;54(2):940–954. doi: 10.1016/j.neuroimage.2010.09.018. [DOI] [PubMed] [Google Scholar]
- 53.van der Lijn F, Den Heijer T, Breteler MM, Niessen WJ. Hippocampus segmentation in MR images using atlas registration, voxel classification, and graph cuts. Neuroimage. 2008;43(4):708–720. doi: 10.1016/j.neuroimage.2008.07.058. [DOI] [PubMed] [Google Scholar]
- 54.M. Goubran, E. E. Ntiri, H. Akhavein, M. Holmes, S. Nestor, J. Ramirez, S. Adamo, M. Ozzoude, C. Scott, F. Gao, et al., Hippocampal segmentation for brains with extensive atrophy using three-dimensional convolutional neural networks, Tech. rep., Wiley Online Library (2020). 10.1016/j.neuroimage.2008.07.058. [DOI] [PMC free article] [PubMed]
- 55.A. Hänsch, J. H. Moltz, B. Geisler, C. Engel, J. Klein, A. Genghi, J. Schreier, T. Morgas, B. Haas, Hippocampus segmentation in CT using deep learning: impact of MR versus CT-based training contours, Journal of Medical Imaging 7 (6) (2020) 064001. 10.1117/1.JMI.7.6.064001.
- 56.Ataloglou D, Dimou A, Zarpalas D, Daras P. Fast and precise hippocampus segmentation through deep convolutional neural network ensembles and transfer learning. Neuroinformatics. 2019;17(4):563–582. doi: 10.1007/s12021-019-09417-y.
- 57.Safavian N, Batouli SAH, Oghabian MA. An automatic level set method for hippocampus segmentation in MR images. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization. 2020;8(4):400–410. doi: 10.1080/21681163.2019.1706054.
- 58.Chupin M, Gérardin E, Cuingnet R, Boutet C, Lemieux L, Lehéricy S, Benali H, Garnero L, Colliot O. Fully automatic hippocampus segmentation and classification in Alzheimer’s disease and mild cognitive impairment applied on data from ADNI. Hippocampus. 2009;19(6):579–587. doi: 10.1002/hipo.20626.
- 59.Folks R. Using the python programming language for image processing in nuclear medicine. Journal of Nuclear Medicine. 2014;55(supplement 1):1322–1322.
- 60.S. G. Virupakshappa, R. Sequeira, A. Rastogi, N. Jain, et al., Essence of python programming language in medical image analysis: Enhancing workplace productivity, European Congress of Radiology 2018, 2018.
- 61.I. Despotović, B. Goossens, W. Philips, MRI segmentation of the human brain: challenges, methods, and applications, Computational and mathematical methods in medicine 2015 (2015). 10.1155/2015/450341.
- 62.Kalavathi P, Prasath VS. Methods on skull stripping of MRI head scan images-a review. Journal of digital imaging. 2016;29(3):365–379. doi: 10.1007/s10278-015-9847-8.
- 63.R. A. Hazarika, K. Kharkongor, S. Sanyal, A. K. Maji, A comparative study on different skull stripping techniques from brain magnetic resonance imaging, in: International Conference on Innovative Computing and Communications, Springer, 2020, pp. 279–288.
- 64.A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, in: Advances in neural information processing systems, 2012, pp. 1097–1105. 10.1145/3065386.
- 65.N. Bjorck, C. P. Gomes, B. Selman, K. Q. Weinberger, Understanding batch normalization, in: Advances in Neural Information Processing Systems, 2018, pp. 7694–7705.
- 66.K. Hara, D. Saito, H. Shouno, Analysis of function of rectified linear unit used in deep learning, in: 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, 2015, pp. 1–8. 10.1109/IJCNN.2015.7280578.
- 67.J. Nagi, F. Ducatelle, G. A. Di Caro, D. Cireşan, U. Meier, A. Giusti, F. Nagi, J. Schmidhuber, L. M. Gambardella, Max-pooling convolutional neural networks for vision-based hand gesture recognition, in: 2011 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), IEEE, 2011, pp. 342–347. 10.1109/ICSIPA.2011.6144164.
- 68.P. Baldi, P. J. Sadowski, Understanding dropout, in: Advances in neural information processing systems, 2013, pp. 2814–2822.
- 69.V. Dumoulin, F. Visin, A guide to convolution arithmetic for deep learning, arXiv preprint arXiv:1603.07285 (2016).
- 70.F. Lin, Q. Wu, J. Liu, D. Wang, X. Kong, Path aggregation U-Net model for brain tumor segmentation, Multimedia Tools and Applications (2020) 1–14.
- 71.D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).