Abstract
In spectrometric problems, objects are characterized by high-resolution spectra that comprise hundreds to thousands of variables. In this context, even fast variable selection methods incur a high computational load. However, spectra are generally smooth and can therefore be accurately approximated by splines. In this paper, we propose to use a B-spline expansion as a pre-processing step before variable selection, in which the original variables are replaced by the coefficients of the B-spline expansions. Using a simple leave-one-out procedure, the optimal number of B-spline coefficients can be found efficiently. As there are generally an order of magnitude fewer coefficients than original spectral variables, selecting optimal coefficients is faster than selecting variables. Moreover, a B-spline coefficient depends only on a limited range of original variables, which preserves the interpretability of the selected variables. We demonstrate the benefits of the proposed method on real-world data.
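The pre-processing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform knot placement, the cubic degree, the synthetic spectrum, and the helper names `bspline_basis`, `loo_error` and `select_n_coef` are all assumptions made for illustration. The leave-one-out error uses the standard closed form for linear least-squares smoothers (residual divided by one minus the leverage), which may differ from the exact procedure used in the paper.

```python
import numpy as np

def bspline_basis(x, n_coef, degree=3):
    """Design matrix of n_coef B-splines on uniform, clamped knots (assumed layout)."""
    lo, hi = float(x.min()), float(x.max())
    n_interior = n_coef - degree - 1          # requires n_coef >= degree + 1
    interior = np.linspace(lo, hi, n_interior + 2)[1:-1]
    knots = np.concatenate([[lo] * (degree + 1), interior, [hi] * (degree + 1)])
    # Degree 0: indicator functions of the knot intervals.
    n0 = len(knots) - 1
    B = np.zeros((len(x), n0))
    for j in range(n0):
        B[:, j] = (knots[j] <= x) & (x < knots[j + 1])
    B[x == hi, n_coef - 1] = 1.0              # close the last non-empty interval
    # Cox-de Boor recursion, raising the degree one step at a time.
    for k in range(1, degree + 1):
        B_new = np.zeros((len(x), n0 - k))
        for j in range(n0 - k):
            d1 = knots[j + k] - knots[j]
            d2 = knots[j + k + 1] - knots[j + 1]
            if d1 > 0:
                B_new[:, j] += (x - knots[j]) / d1 * B[:, j]
            if d2 > 0:
                B_new[:, j] += (knots[j + k + 1] - x) / d2 * B[:, j + 1]
        B = B_new
    return B

def loo_error(x, y, n_coef, degree=3):
    """Leave-one-out mean squared error of the least-squares spline fit."""
    B = bspline_basis(x, n_coef, degree)
    Q, _ = np.linalg.qr(B)                    # hat matrix is Q Q^T (full-rank B assumed)
    leverage = np.sum(Q ** 2, axis=1)
    resid = y - Q @ (Q.T @ y)
    return float(np.mean((resid / (1.0 - leverage)) ** 2))

def select_n_coef(x, y, candidates, degree=3):
    """Return the candidate number of coefficients with the smallest LOO error."""
    errors = [loo_error(x, y, m, degree) for m in candidates]
    return candidates[int(np.argmin(errors))]

# Synthetic smooth "spectrum" sampled at 200 wavelengths (illustrative data only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = (np.exp(-((x - 0.3) / 0.08) ** 2)
     + 0.5 * np.exp(-((x - 0.7) / 0.12) ** 2)
     + 0.005 * rng.standard_normal(x.size))

candidates = list(range(6, 41, 2))
best = select_n_coef(x, y, candidates)
B = bspline_basis(x, best)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)  # these replace the 200 original variables
rmse = float(np.sqrt(np.mean((B @ coef - y) ** 2)))
print(best, coef.shape, rmse)
```

In this sketch the vector `coef` would be passed to the variable selection method in place of the 200 sampled values. Each cubic basis function is non-zero on at most four adjacent knot intervals, so a selected coefficient can still be traced back to a limited wavelength range.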
M. Verleysen is Research Director of the Belgian F.N.R.S. (National Fund for Scientific Research). D. François is funded by a grant from the Belgian F.R.I.A. Parts of this research result from the Belgian Program on Interuniversity Attraction Poles, initiated by the Belgian Federal Science Policy Office. The scientific responsibility rests with its authors.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rossi, F., François, D., Wertz, V., Verleysen, M. (2006). A Functional Approach to Variable Selection in Spectrometric Problems. In: Kollias, S.D., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840817_2
DOI: https://doi.org/10.1007/11840817_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38625-4
Online ISBN: 978-3-540-38627-8
eBook Packages: Computer Science, Computer Science (R0)