Abstract
In this paper, we design a parallel-processing pipeline for spectrum reconstruction in Fourier transform imaging spectroscopy (FTIS) that runs on an embedded system, the NVIDIA Jetson TX2. This single-board system offers strong parallel-computing performance and can be programmed in CUDA C, which makes it well suited to data processing on satellites and mobile devices. A Fourier transform spectrometer acquires a huge amount of interference data, and the traditional serial processing mechanism is too slow to handle it efficiently. On a satellite in particular, these data should be processed in real time to save memory and bandwidth. We take advantage of parallel computing to increase efficiency and reduce processing time. For comparison, traditional serial algorithms for processing interferograms are run on the ARM cores. The experimental results show that our parallel spectrum reconstruction pipeline performs much better than the serial one, and that it keeps this advantage on large data sets.
1 Introduction
In recent years, benefiting from the rapid development of imaging spectroscopy, the Fourier transform spectrometer [1,2,3,4,5,6] has played an important role in both space exploration and component analysis. Compared with other spectrometers, it offers high throughput, multi-channel operation, and high resolution.
Fourier transform imaging spectroscopy acquires abundant data containing both the spatial and the spectral information of the target. The core device of the spectrometer is the interferometer. The light from the target is split into two coherent beams; as the optical path difference (OPD) changes, the two beams interfere on the sensor, yielding a series of interference patterns.
Generally, spectral reconstruction [7, 8] mainly includes preprocessing, apodization, phase correction and Fourier transform. An important step in preprocessing is detrending, which removes the slowly varying trend from the interference signal. Apodization reduces spectral leakage by multiplying the interferogram by an appropriate window function. To restore the symmetry of the interferogram, phase correction compensates for the phase error caused by sampling position offset. Finally, the spectrum is obtained by Fourier transform.
To process the acquired interferograms as quickly as possible, the pipeline is usually simplified by omitting some steps, which lowers spectral quality and resolution. In particular, a fast intelligent processing system on a satellite requires real-time spectrum reconstruction to save memory and bandwidth, which calls for a high-speed method to replace the traditional pipeline.
In contrast to the traditional data-processing pipeline, general-purpose computing on the graphics processing unit (GPU) offers a new approach. It is particularly suitable for problems that can be expressed as data-parallel computations, i.e. the same program executed in parallel on many data elements. NVIDIA provides the Compute Unified Device Architecture (CUDA) as a general-purpose parallel-computing platform and programming model for solving many computational problems more efficiently [9,10,11].
In the embedded field, NVIDIA provides the Jetson series of computing boards together with an embedded development kit. All Jetson boards, including the Jetson TK1, TX1 and TX2 models, carry a Tegra processor. NVIDIA describes the Jetson TX2 as an AI supercomputer on a module, powered by the NVIDIA Pascal architecture. It packages this performance into a small, power-efficient form factor suited to intelligent edge devices such as robots, drones, smart cameras and portable medical devices. The Jetson TX2 supports all the features of the Jetson TX1 module and can run larger, more complex deep neural networks (DNNs).
In this paper, the NVIDIA Jetson TX2 embedded board serves as our real-time embedded platform, on which our parallel interferogram-processing algorithms run. The rest of this paper is structured as follows: Sect. 2 briefly explains the characteristics and advantages of the board. Section 3 describes the parallel processing algorithms that run on the GPU of the embedded board. Section 4 presents the experiments. Section 5 analyzes the results and draws conclusions.
2 Embedded System
The NVIDIA Jetson, with GPU-accelerated parallel processing, is a leading embedded computing platform. Its most important feature is that the Jetson series supports CUDA, which developers can use to improve the performance of their algorithms.
Jetson TX2 is a fast, power-efficient embedded AI computing device. This 7.5-watt supercomputer on a module is built around an NVIDIA Pascal-family GPU. In addition to 8 GB of memory with 59.7 GB/s of memory bandwidth, it has a variety of standard hardware interfaces that make it easy to integrate into a wide range of products and form factors. Other parameters of the TX2 are listed in Table 1.
Table 1 shows that the embedded system combines two types of ARM CPU cores (Denver 2 and ARM A57) with a high-performance GPU. The CPU and GPU clock frequencies depend on the working mode, so the performance of the algorithms also differs between modes. The modes are as follows.
- Mode 0: Denver 2 (2.0 GHz), ARM A57 (2.0 GHz), GPU (1.30 GHz);
- Mode 1: ARM A57 (1.2 GHz), GPU (0.85 GHz);
- Mode 2: Denver 2 (1.4 GHz), ARM A57 (1.4 GHz), GPU (1.12 GHz);
- Mode 3: ARM A57 (2.0 GHz), GPU (1.12 GHz);
- Mode 4: Denver 2 (2.0 GHz), GPU (1.12 GHz).
3 Theory
According to the basic principle of Fourier transform spectroscopy [12], the spectrum can be recovered by Fourier transform of the interferogram. This principle can be described by the following equation:
\[B(\sigma) = \int_{-\infty}^{\infty} I(\varDelta)\, e^{-i 2\pi \sigma \varDelta}\, d\varDelta,\]
where I is the interferogram, B is the spectrum, and \(\varDelta \) and \(\sigma \) denote the optical path difference and the wavenumber, respectively. In practice, the discrete transform is evaluated with the fast Fourier transform (FFT), whose complexity is \(O(N\log N)\).
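To illustrate the replacement of the direct transform by the FFT, the following is a minimal Python sketch, not the CUDA implementation used in this work. The function names `fft` and `dft`, the sequence length of 64 and the single spectral line at bin 5 are all assumptions made for the example. It checks a recursive radix-2 FFT against the direct \(O(N^2)\) transform on a synthetic cosine interferogram.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft(x):
    """Direct O(N^2) discrete Fourier transform, used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


# Synthetic interferogram (hypothetical test data): one spectral line at bin 5.
N = 64
interferogram = [math.cos(2 * math.pi * 5 * m / N) for m in range(N)]

spectrum = [abs(v) for v in fft(interferogram)]
# Search only the first half, since the spectrum of a real signal is mirrored.
peak_bin = max(range(N // 2), key=lambda k: spectrum[k])
print(peak_bin)  # 5
```

Both routines compute the same transform; only the operation count differs, which is what makes the FFT the practical choice for long interferograms.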
In this paper, the data-parallel processing pipeline for spectrum reconstruction is divided into three parts: detrending, apodization, and phase correction with Fourier transform. The parallel algorithms are similar to those in [13].
3.1 Detrending
Usually, the interference signal x(t) consists of a slowly varying trend superimposed on a fluctuating process y(t). The trend term, which contains the low-frequency part, must be removed. It can be estimated by the least squares method, which finds the best-fitting function by minimizing the sum of squared errors.
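For a linear model described by
\[\mathbf{x} = X\beta + \varepsilon,\]
where \(\mathbf{x}\) is the vector of sampled interference values, X is the design matrix and \(\varepsilon\) is the residual, the parameters are estimated by the least squares method,
\[\hat{\beta } = \arg \min _{\beta } \Vert \mathbf{x} - X\beta \Vert ^2.\]
If X is a full-rank matrix,
\[\hat{\beta } = (X^{T}X)^{-1}X^{T}\mathbf{x}.\]
X can be decomposed by QR decomposition (QRD), that is,
\[X = QR,\]
where Q is an orthogonal matrix, meaning that \(Q^{T}Q\) is the identity matrix, and R is an upper triangular matrix. \(\hat{\beta }\) can then be written in the following form,
\[\hat{\beta } = R^{-1}Q^{T}\mathbf{x}.\]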
That is, detrending reduces to a QR decomposition and a matrix inversion. For parallel computing, parallel QRD and parallel matrix inversion are performed on our embedded board, using the parallel algorithms of [13].
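The QR-based least squares detrending described in this subsection can be sketched serially as follows. This is an illustrative Python version under several assumptions, not the parallel implementation of [13]: the trend is modelled as a low-degree polynomial, the QR factors come from modified Gram-Schmidt, and the triangular system is solved by back substitution rather than by explicitly inverting R. The names `qr_gram_schmidt`, `solve_upper` and `detrend`, and the test signal, are hypothetical.

```python
import math


def qr_gram_schmidt(X):
    """Thin QR of an n x p matrix (list of rows) by modified Gram-Schmidt."""
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    Q = []  # p orthonormal columns, each of length n
    R = [[0.0] * p for _ in range(p)]
    for j in range(p):
        v = cols[j][:]
        for k in range(j):
            R[k][j] = sum(Q[k][i] * v[i] for i in range(n))
            v = [v[i] - R[k][j] * Q[k][i] for i in range(n)]
        R[j][j] = sum(c * c for c in v) ** 0.5
        Q.append([c / R[j][j] for c in v])
    return Q, R


def solve_upper(R, b):
    """Back substitution for R beta = b, with R upper triangular."""
    p = len(b)
    beta = [0.0] * p
    for i in reversed(range(p)):
        s = b[i] - sum(R[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = s / R[i][i]
    return beta


def detrend(signal, degree=1):
    """Fit a polynomial trend by least squares via QR and subtract it."""
    n = len(signal)
    t = [i / (n - 1) for i in range(n)]  # normalized sample positions
    X = [[ti ** d for d in range(degree + 1)] for ti in t]
    Q, R = qr_gram_schmidt(X)
    qty = [sum(q[i] * signal[i] for i in range(n)) for q in Q]  # Q^T x
    beta = solve_upper(R, qty)  # equivalent to R^{-1} Q^T x
    trend = [sum(b * ti ** d for d, b in enumerate(beta)) for ti in t]
    return [s - tr for s, tr in zip(signal, trend)], beta


# Hypothetical test signal: linear trend plus a fast oscillation.
n = 128
signal = [0.5 + 2.0 * i / (n - 1) + math.cos(2 * math.pi * 10 * i / n)
          for i in range(n)]
residual, beta = detrend(signal, degree=1)
```

Because the design matrix depends only on the sample positions, its factors are the same for every interferogram of a given length, which is what makes the step attractive for batch processing.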
3.2 Apodization
The ideal range of the optical path difference (OPD) extends from negative to positive infinity, but in practice a detector can only collect a finite number of samples, so the acquired signal is truncated. According to Fourier theory, truncation is equivalent to multiplying the signal by a rectangular window, which corresponds to convolving the original spectrum with a sinc function in the frequency domain [14]. For a continuous spectrum, the spectral resolution is limited by the sidelobes of this sinc function, which arise from the discontinuity of the interferogram near the maximum OPD.
Apodization multiplies the interference sequence point by point by a chosen apodizing function to suppress the sidelobes in the recovered spectrum. In parallel computing, with one thread per sample, apodization can be expressed as
\[I'(tid) = I(tid)\, w(tid),\]
where tid is the index of the current thread, I is the interference sequence, \(I'\) is the apodized sequence and w is the apodization function. Common apodization functions include the triangular, Happ-Genzel, Hamming, and Bessel functions.
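The point-by-point multiplication can be sketched as below. This is a serial Python illustration in which the list index plays the role of the thread index tid. The function names and the choice to centre the windows on the middle sample are assumptions for the example; for a single-sided interferogram the window would instead be positioned relative to the zero OPD.

```python
import math


def triangular(n):
    """Triangular window peaking at the centre sample (assumes n >= 2)."""
    c = (n - 1) / 2
    return [1.0 - abs(i - c) / c for i in range(n)]


def hamming(n):
    """Hamming window with the usual 0.54 / 0.46 coefficients (assumes n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def apodize(interferogram, window):
    # Each output element depends only on its own index, so on a GPU one
    # thread (index tid) could compute one element independently.
    return [interferogram[tid] * window[tid]
            for tid in range(len(interferogram))]


signal = [1.0, 2.0, 3.0, 2.0, 1.0]
apodized = apodize(signal, triangular(len(signal)))
print(apodized)  # [0.0, 1.0, 3.0, 1.0, 0.0]
```

Since no element depends on any other, the operation maps directly onto a one-thread-per-sample launch without synchronization.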
3.3 Phase Correction and Fourier Transform
Generally, the phase is corrected by means of the Fourier transform, so phase correction is performed together with the Fourier transform.
In our experiment, the interferogram is provided by our interferometer. It is a single-sided interference signal that contains a short double-sided interferogram around the zero OPD, and the Mertz method uses this double-sided part to estimate the phase. Suppose the interferogram is asymmetrical because the detector does not sample exactly at the zero OPD; this introduces a phase error \(\phi (\sigma )\), so
\[I(\varDelta ) = \int _{-\infty }^{\infty } B(\sigma )\, e^{i\phi (\sigma )}\, e^{i 2\pi \sigma \varDelta }\, d\sigma ,\]
and
\[\phi (\sigma ) = \arctan \frac{m_i(\sigma )}{m_r(\sigma )},\]
where \(m_r(\sigma )\) is the real part of \(B(\sigma )e^{i\phi (\sigma )}\) and \(m_i(\sigma )\) is the imaginary part.
In the Mertz method [15], the phase obtained from the short double-sided interferogram has low resolution, so a high-order polynomial is fitted to it by the least squares method to obtain the high-resolution phase spectrum \(\phi _0(\sigma )\). The difference between the original phase \(\phi (\sigma )\) and the high-resolution phase spectrum is then
\[\delta \phi (\sigma ) = \phi (\sigma ) - \phi _0(\sigma ).\]
The final spectrum after phase correction is given by
\[B(\sigma ) = \sqrt{m_r^2(\sigma ) + m_i^2(\sigma )}\,\cos \delta \phi (\sigma ) = m_r(\sigma )\cos \phi _0(\sigma ) + m_i(\sigma )\sin \phi _0(\sigma ).\]
In phase correction, we use the improved Mertz method [16], in which the high-resolution phase spectrum is obtained by zero-filling the double-sided interferogram to the same length as the original signal. It is also more efficient than the original Mertz method in parallel computing.
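The zero-filling idea can be illustrated with the following Python sketch. It is a simplified stand-in for the method, not the implementation of [16]. The assumptions are: the synthetic interferogram is a full-length circular sequence, the zero-OPD index `zpd` is known, the short double-sided segment is weighted with a triangular window, and a direct \(O(N^2)\) transform is used for brevity. The name `mertz_correct` and all test values are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (O(N^2), kept short for clarity)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def mertz_correct(interferogram, zpd, half_width):
    """Estimate the phase from a short double-sided segment and remove it."""
    n = len(interferogram)
    # Short double-sided segment around the zero OPD, triangular-weighted and
    # zero-filled to the full length n, so that the phase spectrum has the
    # same number of points as the full spectrum.
    short = [0.0] * n
    for d in range(-half_width, half_width + 1):
        w = 1.0 - abs(d) / (half_width + 1)
        short[(zpd + d) % n] = w * interferogram[(zpd + d) % n]
    phi0 = [cmath.phase(v) for v in dft(short)]
    m = dft(interferogram)
    # Rotate each spectral sample by -phi0 and keep the real part, i.e.
    # m_r * cos(phi0) + m_i * sin(phi0).
    return [(m[k] * cmath.exp(-1j * phi0[k])).real for k in range(n)]


# Synthetic test data: a real, symmetric spectrum and a 3-sample ZPD offset.
N, SHIFT = 64, 3
true_b = [0.0] * N
for k, val in zip(range(5, 10), [1.0, 2.0, 3.0, 2.0, 1.0]):
    true_b[k] = true_b[N - k] = val
ideal = [sum(true_b[k] * cmath.exp(2j * math.pi * k * m / N)
             for k in range(N)).real / N for m in range(N)]
measured = [ideal[(m - SHIFT) % N] for m in range(N)]

recovered = mertz_correct(measured, zpd=SHIFT, half_width=7)
```

In this synthetic case the offset produces a purely linear phase, so the real part of the uncorrected transform is badly distorted, while the corrected spectrum matches `true_b` to rounding error.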
4 Experiments
Our experiments are implemented in C++ and CUDA C on the NVIDIA Jetson TX2 board. The embedded system is shown in Fig. 1. The white-light interferogram used in the experiments is provided by our interferometer and is shown in Fig. 2.
4.1 Reconstruction
Figure 3 shows the spectrum reconstructed from the interferogram in Fig. 2. The result indicates that our interferogram-processing algorithms reconstruct the spectrum well.
4.2 Batch Processing in Different Working Modes
The performance of the Jetson TX2 in each working mode is listed in Tables 2, 3, 4, 5 and 6, respectively. In these tables, Nor. denotes the full process, which includes the QRD and matrix inversion, and Opt. denotes a simplified process in which the results of the QRD and matrix inversion are stored and reused directly for batch processing. Groups is the number of batches processed.
From these tables, Mode 0 has the best performance because its clock frequencies are the highest of the five modes. However, the other modes may consume less power.
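The difference between the Nor. and Opt. variants can be sketched as follows. This Python illustration assumes a linear trend model and stores only the orthonormal factor Q, obtained in closed form by Gram-Schmidt on the columns 1 and t. The function names are hypothetical and the code is not the benchmarked CUDA implementation.

```python
def orthonormal_basis(n):
    """Q factor of the QR decomposition of X = [1, t] (linear trend model)."""
    t = [float(i) for i in range(n)]
    q0 = [1.0 / n ** 0.5] * n
    mean = sum(t) / n
    centered = [ti - mean for ti in t]
    norm = sum(c * c for c in centered) ** 0.5
    q1 = [c / norm for c in centered]
    return [q0, q1]


def detrend_with(Q, signal):
    """Subtract the projection of the signal onto the columns of Q."""
    out = signal[:]
    for q in Q:
        coeff = sum(qi * si for qi, si in zip(q, signal))
        out = [o - coeff * qi for o, qi in zip(out, q)]
    return out


def process_batch_nor(batch):
    # "Nor.": the factorization is recomputed for every interferogram.
    return [detrend_with(orthonormal_basis(len(s)), s) for s in batch]


def process_batch_opt(batch):
    # "Opt.": the factorization is computed once and reused for the batch.
    Q = orthonormal_basis(len(batch[0]))
    return [detrend_with(Q, s) for s in batch]


# Hypothetical batch of two signals that are pure linear trends.
batch = [[a + b * i for i in range(16)] for a, b in [(1.0, 2.0), (-3.0, 0.5)]]
nor = process_batch_nor(batch)
opt = process_batch_opt(batch)
```

Both variants return the same residuals; the Opt. variant simply avoids repeating work that depends only on the sequence length, not on the data.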
4.3 Application
For the scanning data actually collected by the LASIS, shown in Fig. 4, the image size is \(256\times 1024\). The interference sequence length is 128, with 16 single-sided zero-crossing samples. The wavelength range is 450 to 900 nm. One frame of a real scene contains about 200,000 interference fringes to process. On the development board, the spectrum is reconstructed within about 47 ms. The result of the spectrum reconstruction is shown in Fig. 5.
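Taking the two approximate figures above at face value (about 200,000 interference fringes in about 47 ms, both rounded, so the result is only indicative), the implied throughput works out as:

\[\frac{2\times 10^{5}}{47\times 10^{-3}\ \mathrm{s}} \approx 4.3\times 10^{6}\ \text{per second}, \qquad \frac{47\ \mathrm{ms}}{2\times 10^{5}} \approx 0.24\ \mu \mathrm{s}\ \text{each}.\]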
5 Conclusions
In this paper, the interferogram-processing pipeline for spectrum reconstruction has been explored on the embedded NVIDIA Jetson TX2. The spectrum is reconstructed successfully on the development board. For batch processing, the GPU gives an obvious performance improvement over the ARM CPUs. The processing pipeline we designed has been well tested on our board. In the spectrum reconstruction, we use parallel QRD and matrix inversion algorithms for detrending, and an improved Mertz method for fast phase correction. These parallel algorithms also perform well, and the advantage grows with the amount of data.
As a high-performance processing pipeline on an embedded system, it provides fast and effective interferogram processing that can meet practical requirements.
References
1. Grandmont, F., Drissen, L., Joncas, G.: Development of an imaging Fourier transform spectrometer for astronomy. Int. Soc. Opt. Photonics 4842, 392–402 (2003)
2. Lacan, A., et al.: A static Fourier transform spectrometer for atmospheric sounding: concept and experimental implementation. Opt. Express 18(8), 8311–8331 (2010)
3. Rafert, J.B., Sellar, R.G., Blatt, J.H.: Monolithic Fourier-transform imaging spectrometer. Appl. Opt. 34(31), 7228–7230 (1995)
4. Dierking, M.P., Karim, M.A.: Solid-block stationary Fourier-transform spectrometer. Appl. Opt. 35(1), 84–89 (1996)
5. Zhang, C., Zhao, B., Xiangli, B.: Wide-field-of-view polarization interference imaging spectrometer. Appl. Opt. 43(33), 6090–6094 (2004)
6. Zhang, C., Xiangli, B., Zhao, B.C., Yuan, X.J.: A static polarization imaging spectrometer based on a Savart polariscope. Opt. Commun. 203(1–2), 21–26 (2002)
7. Su, L., et al.: Spectrum reconstruction method for airborne temporally-spatially modulated Fourier transform imaging spectrometers. IEEE Trans. Geosci. Remote Sens. 52(6), 3720–3728 (2014)
8. Zhang, C., Jian, X.: Wide-spectrum reconstruction method for a birefringence interference imaging spectrometer. Opt. Lett. 35(3), 366–368 (2010)
9. Cook, S.: CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs. Newnes (2012)
10. Sanders, J., Kandrot, E.: CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley Professional (2010)
11. Nvidia, C.: NVIDIA CUDA C programming guide. Nvidia Corporation 120(18), 8 (2011)
12. Griffiths, P.R., De Haseth, J.A.: Fourier Transform Infrared Spectrometry. Wiley, Hoboken (2007)
13. Zhang, W., Wen, D., Song, Z., et al.: Spectrum reconstruction in Fourier transform imaging spectroscopy based on high-performance parallel computing. Appl. Opt. 57(21), 5983–5991 (2018)
14. Stoica, P., Moses, R.L.: Spectral Analysis of Signals (2005)
15. Mertz, L.: Auxiliary computation for Fourier spectrometry. Infrared Phys. 7(1), 17–23 (1967)
16. Ting, X.: A method to improve the computing efficiency of Mertz method in Fourier transform spectroscopy. Acta Optica Sinica 3 (1999)
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, W. et al. (2019). Parallel Spectrum Reconstruction in Fourier Transform Imaging Spectroscopy Based on the Embedded System. In: Zhao, Y., Barnes, N., Chen, B., Westermann, R., Kong, X., Lin, C. (eds) Image and Graphics. ICIG 2019. Lecture Notes in Computer Science, vol 11902. Springer, Cham. https://doi.org/10.1007/978-3-030-34110-7_34
DOI: https://doi.org/10.1007/978-3-030-34110-7_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34109-1
Online ISBN: 978-3-030-34110-7
eBook Packages: Computer Science, Computer Science (R0)