Abstract
Model selection and model combination are general problems arising in many areas. In particular, when several candidate models are available and a new data set has been gathered, we want to construct a more accurate and precise model to help predict future events. In this paper, we propose a new data-guided model combination method based on decomposition and aggregation. With the aid of influence diagrams, we analyze the dependence among candidate models and use latent factors to characterize that dependence. After analyzing the model structures in this framework, we derive an optimal composite model. Two widely used data analysis tools, Principal Component Analysis (PCA) and Independent Component Analysis (ICA), are applied to extract factors from the class of candidate models. Once the factors are obtained, they are sorted and aggregated to produce composite models. In the course of factor aggregation, we also address the related issue of factor selection. Finally, a numerical study shows how the method works, and an application using physical data is presented.
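The abstract outlines a pipeline: candidate-model predictions are decomposed into latent factors (via PCA or ICA), the factors are sorted, and a selected subset is aggregated into a composite model. The sketch below illustrates that flow on toy data; it is a minimal sketch, assuming the candidate models are represented by their predictions on a common data set, that factors are PCA scores, and that aggregation is a least-squares fit of the retained factors to the new data. These choices are illustrative assumptions, not the paper's exact estimator.

```python
# Minimal sketch of the decompose-and-aggregate idea (illustrative, not
# the paper's exact method): PCA factors from candidate-model predictions,
# aggregated by least squares against newly gathered data.
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: n observations, m candidate models whose predictions share
# latent structure (each tracks the same underlying signal plus noise).
n, m = 200, 5
signal = np.sin(np.linspace(0, 4 * np.pi, n))
preds = np.column_stack([signal + 0.3 * rng.normal(size=n) for _ in range(m)])
y = signal + 0.1 * rng.normal(size=n)          # the newly gathered data

# Decomposition: PCA on the centered prediction matrix via SVD.
centered = preds - preds.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
k = 2                                          # number of factors kept (a modeling choice)
factors = centered @ Vt[:k].T                  # factor scores, already sorted by variance

# Aggregation: regress the data on the retained factors; the fitted
# values play the role of the composite model's predictions.
X = np.column_stack([np.ones(n), factors])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
composite = X @ coef

print("avg candidate MSE:", np.mean((preds - y[:, None]) ** 2))
print("composite MSE:    ", np.mean((composite - y) ** 2))
```

An ICA-based variant would replace the SVD step with an independent-component estimator (e.g., a FastICA-style fixed-point iteration) and would require an explicit factor-ordering step, since independent components, unlike principal components, carry no natural variance ranking.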
Cite this article
Xu, M., Golay, M.W. Data-guided model combination by decomposition and aggregation. Mach Learn 63, 43–67 (2006). https://doi.org/10.1007/s10994-005-5931-5