Abstract
Outlier detection is important in many applications and has recently attracted considerable attention in the data mining research community. However, most existing methods are designed to mine outliers from a single dataset without considering the class labels of the data objects. In this paper, we consider the class outlier detection problem: "given a set of observations with class labels, find those that arouse suspicions, taking into account the class labels." By generalizing two pioneering contributions in this field, we propose the notion of class outliers and give practical solutions that extend existing outlier detection algorithms to detect them. Furthermore, we discuss potential applications of class outlier detection in CRM (customer relationship management). Experiments on real datasets show that our method can find interesting outliers and can be used in practice.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
He, Z., Huang, J.Z., Xu, X., Deng, S. (2004). Mining Class Outliers: Concepts, Algorithms and Applications. In: Li, Q., Wang, G., Feng, L. (eds) Advances in Web-Age Information Management. WAIM 2004. Lecture Notes in Computer Science, vol 3129. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27772-9_59
DOI: https://doi.org/10.1007/978-3-540-27772-9_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22418-1
Online ISBN: 978-3-540-27772-9
eBook Packages: Springer Book Archive