Abstract
This manuscript reviews state-of-the-art techniques proposed in the literature for multimodal image registration, that is, cases where images from different modalities must be precisely aligned in the same reference system. Such cases arise when the images to be registered come from different modalities, such as visible and thermal spectral bands, 3D-RGB, flash/no-flash, or NIR-visible. The review spans techniques from classical approaches to more recent ones based on deep learning, and highlights the particularities required at each step of the registration pipeline when dealing with multimodal images. Medical images are excluded from this review because of their specific characteristics, including the use of both active and passive sensors and the non-rigid nature of the body contained in the image.
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
Acknowledgements
This material is based upon work supported by the Air Force Office of Scientific Research under award number FA9550-22-1-0261, and partially supported by Grant PID2021-128945NB-I00 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”; the “CERCA Programme / Generalitat de Catalunya”; and the ESPOL project CIDIS-12-2022.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Velesaca, H.O., Bastidas, G., Rouhani, M. et al. Multimodal image registration techniques: a comprehensive survey. Multimed Tools Appl 83, 63919–63947 (2024). https://doi.org/10.1007/s11042-023-17991-2
DOI: https://doi.org/10.1007/s11042-023-17991-2