Neural networks for medical image segmentation
The U-Net architecture builds on the fully convolutional network (FCN) and is designed to give better segmentation results in medical imaging. It was proposed by Olaf Ronneberger, Philipp Fischer, and Thomas Brox in 2015 for processing biomedical images [https://arxiv.org/pdf/1505.04597.pdf]. Convolutional neural networks are generally used for image classification, but in biomedical cases we also have to localize the area of abnormality.
It has a “U” shape. The U-Net architecture is symmetric and functions somewhat like an auto-encoder. It can be divided into three major parts: the contracting (downsampling) path, the bottleneck, and the expanding (upsampling) path. In an auto-encoder, the encoder compresses the input into a latent-space representation, and the decoder then constructs the output from that compressed representation. There is one difference, though: unlike in regular encoder-decoder structures, the two parts are not decoupled. Skip connections transfer fine-grained information from the low-level layers of the analysis (contracting) path to the high-level layers of the synthesis (expanding) path, because this information is needed to generate reconstructions with accurate fine-grained details.
(Contracting Path)
The contracting path is composed of four blocks, each made of:
- 3x3 Convolution Layer + Activation function (relu) [Dropout is optional]
- 3x3 Convolution Layer + Activation function (relu) [Dropout is optional]
- 2x2 Max Pooling Layer
Each block has two convolutional layers and one max-pooling layer. In the first block, the convolutions bring the number of channels to 64. Because the 3x3 kernels are applied without padding, each convolution trims one pixel from every border, so the dimensions go from 572x572 → 570x570 → 568x568. The MaxPool2D layer then halves the dimensions to 284x284, and the process is repeated three more times until we reach the bottleneck.
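The size arithmetic above can be traced with a few lines of plain Python. This is only a bookkeeping sketch: it assumes unpadded 3x3 convolutions and a channel count that doubles per block, as in the figure of the original paper, and the helper names are made up for illustration.

```python
def valid_conv(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


def max_pool(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool


def contracting_path(size=572, filters=64, blocks=4):
    """Return [(size_before_pooling, channels), ...] and the bottleneck input size."""
    trace = []
    for _ in range(blocks):
        size = valid_conv(valid_conv(size))  # two 3x3 convolutions
        trace.append((size, filters))
        size = max_pool(size)                # 2x2 max pooling
        filters *= 2                         # assumed: channels double per block
    return trace, size


trace, bottleneck_in = contracting_path()
for block_size, channels in trace:
    print(f"{block_size}x{block_size}x{channels}")
print("bottleneck input size:", bottleneck_in)
```

With `padding="same"` convolutions, which many Keras implementations use, the convolutions keep the size unchanged and only the pooling step shrinks the feature map.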
(Expanding Path)
The expanding path is also composed of four blocks, each made of:
- Transposed convolution (DeConvolution) or UpSampling2D layer with stride 2
- Concatenation with the corresponding feature map from the contracting path
- 3x3 Convolution layer + Activation function [Dropout is optional]
- 3x3 Convolution layer + Activation function [Dropout is optional]
In the first expanding block, the upsampled feature map is concatenated with the corresponding feature map from the contracting path, which gives a feature map of dimension 56x56x1024. The concatenation is followed by a set of convolutional layers, and the last Conv2D layer has one filter of size 1x1.
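To see where 56x56x1024 comes from, here is a small shape-bookkeeping sketch in plain Python. It assumes the sizes from the original paper (a 28x28x1024 bottleneck output, unpadded 3x3 convolutions, and skip feature maps that are center-cropped before concatenation); `up_block_shape` is a hypothetical helper, not part of any library.

```python
def up_block_shape(size, channels, skip_size, skip_channels):
    """Shapes inside one expanding block with unpadded 3x3 convolutions."""
    up_size = size * 2                    # stride-2 upsampling doubles the size
    up_channels = channels // 2           # and halves the channels
    crop = (skip_size - up_size) // 2     # border cropped from each side of the skip map
    concat = (up_size, up_channels + skip_channels)
    out = (up_size - 4, up_channels)      # two 3x3 convolutions trim 2 pixels each
    return crop, concat, out


# Skip feature maps from the contracting path, deepest first (size, channels).
skips = [(64, 512), (136, 256), (280, 128), (568, 64)]
size, channels = 28, 1024                 # assumed bottleneck output
steps = []
for skip_size, skip_channels in skips:
    crop, concat, (size, channels) = up_block_shape(size, channels, skip_size, skip_channels)
    steps.append((crop, concat, (size, channels)))
    print(f"crop {crop}px per side, concat {concat[0]}x{concat[0]}x{concat[1]}, "
          f"block output {size}x{size}x{channels}")
```

The first step reproduces the 56x56x1024 concatenation: 512 upsampled channels plus 512 channels from the skip connection.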
(Building U-Net with Tensorflow and Keras)
The first step is to download a dataset. In this example, we’ll be using Kaggle’s 2018 Data Science Bowl dataset, which contains images of nuclei and their masks (used here at 128x128) and can be used for biomedical image segmentation. If you’re using the Kaggle API, you can download it with the command kaggle competitions download -c data-science-bowl-2018.
Import the required packages and tools.
The next step is to create a data generator function that loads the data, resizes the images to 128x128, and normalizes them. The aim is a data generator that returns each image together with its mask. The structure of the data generator may vary with your requirements. After this, we’ll set some hyperparameters such as image size, number of epochs, and batch size.
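As a rough illustration of such a generator, the sketch below works on arrays that are already in memory and uses only NumPy. The function names, the nearest-neighbour resize, and the hyperparameter values are assumptions made for this example; a real pipeline would read the files from disk and might wrap the logic in a `keras.utils.Sequence` instead.

```python
import numpy as np

IMAGE_SIZE = 128   # illustrative hyperparameter values
BATCH_SIZE = 8
EPOCHS = 5


def resize_nearest(array, size):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array to size x size."""
    rows = np.arange(size) * array.shape[0] // size
    cols = np.arange(size) * array.shape[1] // size
    return array[rows][:, cols]


def batch_generator(images, masks, batch_size=BATCH_SIZE, size=IMAGE_SIZE):
    """Yield (image_batch, mask_batch) pairs scaled to the range [0, 1]."""
    for start in range(0, len(images), batch_size):
        xs, ys = [], []
        for img, msk in zip(images[start:start + batch_size],
                            masks[start:start + batch_size]):
            xs.append(resize_nearest(img, size).astype(np.float32) / 255.0)
            ys.append(resize_nearest(msk, size).astype(np.float32) / 255.0)
        # Masks get a trailing channel axis so they match the model output.
        yield np.stack(xs), np.stack(ys)[..., np.newaxis]


# Synthetic stand-ins for the loaded images (uint8 RGB) and masks (0 or 255).
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, (256, 320, 3), dtype=np.uint8) for _ in range(10)]
masks = [(rng.integers(0, 2, (256, 320)) * 255).astype(np.uint8) for _ in range(10)]

x_batch, y_batch = next(batch_generator(images, masks))
print(x_batch.shape, y_batch.shape)
```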
Now we can implement U-Net’s architecture i.e. contracting path, bottleneck, and expanding path.
(U-Net model and training)
We’ll repeat the down_block and up_block steps four times each, followed by the last convolutional layer, which gives the final predicted mask. Adam is a good choice of optimizer in this case, and “binary_crossentropy” can be used as the loss function. After the model compiles successfully, check its summary.
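A minimal sketch of how these pieces might fit together in Keras is shown below. The names down_block and up_block come from the text; `bottleneck`, `build_unet`, the filter counts, and the use of `padding="same"` (so that a 128x128 input yields a 128x128 mask) are assumptions for this example rather than the original code.

```python
from tensorflow import keras
from tensorflow.keras import layers


def down_block(x, filters):
    """Two 3x3 convolutions, then 2x2 max pooling. Returns (skip, pooled)."""
    c = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    c = layers.Conv2D(filters, 3, padding="same", activation="relu")(c)
    return c, layers.MaxPooling2D(2)(c)


def bottleneck(x, filters):
    """Two 3x3 convolutions at the lowest resolution."""
    c = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(c)


def up_block(x, skip, filters):
    """Upsample, concatenate with the skip connection, then two 3x3 convolutions."""
    u = layers.UpSampling2D(2)(x)
    u = layers.Concatenate()([u, skip])
    c = layers.Conv2D(filters, 3, padding="same", activation="relu")(u)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(c)


def build_unet(image_size=128, channels=3, filters=(16, 32, 64, 128)):
    inputs = keras.Input((image_size, image_size, channels))
    x, skips = inputs, []
    for f in filters:                      # contracting path: four down blocks
        skip, x = down_block(x, f)
        skips.append(skip)
    x = bottleneck(x, filters[-1] * 2)
    for f, skip in zip(reversed(filters), reversed(skips)):
        x = up_block(x, skip, f)           # expanding path: four up blocks
    # One 1x1 filter with a sigmoid gives a per-pixel foreground probability.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return keras.Model(inputs, outputs)


model = build_unet()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

The smaller filter counts (16 to 256 instead of 64 to 1024) keep the sketch light enough to run on a CPU; they can be scaled up without changing the structure.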
The next step is to train the model over a decent number of epochs and make predictions.
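The training and prediction calls might look like the sketch below. To keep the block runnable on its own it uses a tiny two-layer stand-in model and random arrays instead of the U-Net and the real dataset; the batch size, epoch count, and the 0.5 threshold are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Tiny stand-in for the U-Net so this block runs on its own.
model = keras.Sequential([
    keras.Input((128, 128, 3)),
    layers.Conv2D(8, 3, padding="same", activation="relu"),
    layers.Conv2D(1, 1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Random stand-ins for the normalized images and their binary masks.
rng = np.random.default_rng(0)
x_train = rng.random((16, 128, 128, 3), dtype=np.float32)
y_train = (rng.random((16, 128, 128, 1)) > 0.5).astype(np.float32)

history = model.fit(x_train, y_train, batch_size=8, epochs=2,
                    validation_split=0.25, verbose=0)

# The sigmoid output is a probability per pixel; threshold it to get a mask.
probabilities = model.predict(x_train[:4], verbose=0)
predicted_masks = (probabilities > 0.5).astype(np.uint8)
print(predicted_masks.shape)
```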
Now we can compare the real images with their predicted masks. Use matplotlib to plot these images and change the cmap, so that we can see the difference clearly.
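One possible way to lay out such a comparison with matplotlib is sketched here. `plot_comparison` is a hypothetical helper, and the arrays are synthetic placeholders for an image, its ground-truth mask, and a predicted mask.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_comparison(image, true_mask, predicted_mask, cmap="gray"):
    """Show an image, its ground-truth mask, and the predicted mask side by side."""
    fig, axes = plt.subplots(1, 3, figsize=(12, 4))
    panels = [
        (image, "Image", None),
        (true_mask, "Ground-truth mask", cmap),
        (predicted_mask, "Predicted mask", cmap),
    ]
    for ax, (data, title, panel_cmap) in zip(axes, panels):
        ax.imshow(np.squeeze(data), cmap=panel_cmap)  # squeeze drops a trailing channel axis
        ax.set_title(title)
        ax.axis("off")
    fig.tight_layout()
    return fig


# Synthetic placeholders: a random image and two slightly different square masks.
rng = np.random.default_rng(0)
image = rng.random((128, 128, 3))
true_mask = np.zeros((128, 128, 1))
true_mask[32:96, 32:96] = 1
predicted_mask = np.zeros((128, 128, 1))
predicted_mask[36:100, 30:94] = 1

fig = plot_comparison(image, true_mask, predicted_mask)
```

Calling `plt.show()` afterwards displays the figure; passing a different `cmap` (for example `"viridis"`) changes how the masks are colored.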
This example was only focused on building the U-Net model. If you want the full code for this segmentation task, you can find it on my GitHub repository.
(Advantages)
U-Net performs much better than FCN-8. It is symmetric, and the skip connections between the contracting and expanding paths combine the location information from the downsampling path with the contextual information in the upsampling path. It also has no Dense layer, which means that images of different sizes can be used as input, since the only parameters to learn in the convolution layers are the kernels. The U-Net model can be used on different sets of images, and the results are quite satisfactory. When we have only a few training samples, data augmentation such as shifts and rotations can be very helpful for teaching the network the desired invariance and robustness properties.
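The point about input size can be checked with a little arithmetic. The sketch below counts trainable parameters for a convolution layer and, for contrast, for a Dense layer applied to a flattened image; both helper functions are written for this illustration only.

```python
def conv2d_params(kernel, in_channels, out_channels, bias=True):
    """Parameters of a 2D convolution: they depend only on kernel size and channels."""
    return kernel * kernel * in_channels * out_channels + (out_channels if bias else 0)


def dense_params(height, width, channels, units, bias=True):
    """Parameters of a Dense layer on a flattened image: they depend on the image size."""
    return height * width * channels * units + (units if bias else 0)


# The first 3x3 convolution (1 input channel, 64 filters) is the same for any image size.
print("conv 3x3, 1 -> 64 channels:", conv2d_params(3, 1, 64))

# A Dense layer with 64 units would need a new weight matrix for every input size.
for side in (128, 256):
    print(f"dense on {side}x{side}x1 -> 64 units:", dense_params(side, side, 1, 64))
```

Because nothing in a convolution's weight shape refers to the image height or width, the same trained kernels can slide over inputs of different sizes.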
Intel is using its own version of U-Net for identifying tumors in both 2D and 3D models. Intel’s U-Net is trained on BraTS (Brain Tumor Segmentation) which is a subset of the Medical Segmentation Decathlon dataset. You can check their repository here.