Abstract
The Segment Anything Model (SAM) achieves satisfactory segmentation performance when given high-quality box prompts. However, its robustness degrades as box quality declines, which limits its practicality in clinical settings. In this study, we propose a novel Robust Box prompt based SAM (RoBox-SAM) to maintain SAM’s segmentation performance under prompts of varying quality. Our contribution is three-fold. First, we propose a prompt refinement module that implicitly perceives the potential targets and outputs offsets that directly transform a low-quality box prompt into a high-quality one. We also provide an online iterative strategy for further prompt refinement. Second, we introduce a prompt enhancement module that automatically generates point prompts to assist box-promptable segmentation. Third, we build a self-information extractor to encode prior information from the input image. These features optimize the image embeddings and attention calculation, further enhancing the robustness of SAM. Extensive experiments on a large medical segmentation dataset comprising 99,299 images, 5 modalities, and 25 organs/targets validate the efficacy of the proposed RoBox-SAM.
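The offset-based refinement and its iterative application can be illustrated with a minimal sketch. This is not the RoBox-SAM implementation: the function names, the (x_min, y_min, x_max, y_max) box layout, the step count, and the stopping tolerance are all assumptions made for illustration, and `predict_offsets` is a hypothetical stand-in for whatever network predicts the offsets.

```python
from typing import Callable, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def apply_offsets(box: Box, offsets: Box, width: float, height: float) -> Box:
    """Shift each box coordinate by its offset, then clip to the image."""
    x0, y0, x1, y1 = (b + d for b, d in zip(box, offsets))
    # Clip to the image bounds so the refined box stays valid.
    x0, x1 = (min(max(v, 0.0), width) for v in (x0, x1))
    y0, y1 = (min(max(v, 0.0), height) for v in (y0, y1))
    # Keep the corners ordered even if the offsets made them cross.
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def refine_iteratively(
    box: Box,
    predict_offsets: Callable[[Box], Box],  # hypothetical offset predictor
    width: float,
    height: float,
    steps: int = 3,    # assumed iteration budget
    tol: float = 1.0,  # assumed stopping threshold in pixels
) -> Box:
    """Repeatedly re-predict offsets for the current box and apply them."""
    for _ in range(steps):
        offsets = predict_offsets(box)
        box = apply_offsets(box, offsets, width, height)
        # Stop early once the predicted correction is negligible.
        if max(abs(d) for d in offsets) < tol:
            break
    return box
```

In this sketch the predictor sees only the current box; in practice it would presumably also condition on the image, which is omitted here to keep the example self-contained.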
Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (12326619, 62101343, 62171290), the Science and Technology Planning Project of Guangdong Province (2023A0505020002), and the Shenzhen-Hong Kong Joint Research Program (SGDX20201103095613036).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Huang, Y. et al. (2025). Robust Box Prompt Based SAM for Medical Image Segmentation. In: Xu, X., Cui, Z., Rekik, I., Ouyang, X., Sun, K. (eds) Machine Learning in Medical Imaging. MLMI 2024. Lecture Notes in Computer Science, vol 15242. Springer, Cham. https://doi.org/10.1007/978-3-031-73290-4_1
DOI: https://doi.org/10.1007/978-3-031-73290-4_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73292-8
Online ISBN: 978-3-031-73290-4
eBook Packages: Computer Science (R0)