Abstract
The k-nearest neighbor (k-NN) classifier is improved by applying distance functions with relearning and ensemble computations, so that text data can be classified with higher accuracy. The proposed combination of relearning and ensemble computations is an effective technique for improving accuracy. We develop a new approach that combines k-NN classifiers based on a weighted distance function with relearning and ensemble computations. The combined algorithm shows higher generalization accuracy than other conventional algorithms. First, to improve classification accuracy, a relearning method based on a genetic algorithm is developed. Second, ensemble computations are applied after the relearning. Experiments were conducted on benchmark datasets from the UCI Machine Learning Repository.
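The general idea of combining k-NN classifiers that use weighted distance functions can be illustrated with a minimal sketch. It is not the chapter's implementation: the weighted Euclidean distance, the majority-vote combination, the function names, the toy data and the hand-picked weight vectors are all assumptions made for illustration. The genetic-algorithm relearning step that would search for such weights is not shown.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (an assumed form of the distance function)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_predict(train, query, weights, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(
        train, key=lambda item: weighted_distance(item[0], query, weights)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def ensemble_predict(train, query, weight_sets, k=3):
    """Majority vote over k-NN classifiers that differ only in feature weights."""
    votes = Counter(knn_predict(train, query, w, k) for w in weight_sets)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature vector, class label).
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]

# Hand-picked weight vectors standing in for weights a search procedure
# (for example a genetic algorithm) might produce.
weight_sets = [(1.0, 1.0), (2.0, 0.5), (0.5, 2.0)]

prediction = ensemble_predict(train, (0.15, 0.1), weight_sets)
print(prediction)
```

In this sketch each ensemble member sees the same training data and differs only in how it weights the features, so the vote smooths over the choice of any single weight vector.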
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Ishii, N., Yamada, T., Bao, Y. (2008). Text Classification by Relearning and Ensemble Computation. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Studies in Computational Intelligence, vol 149. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70560-4_18
DOI: https://doi.org/10.1007/978-3-540-70560-4_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70559-8
Online ISBN: 978-3-540-70560-4
eBook Packages: Engineering, Engineering (R0)