Abstract
Web access log analysis examines patterns of website usage and the characteristics of users' behavior. Because raw log data is very noisy, preprocessing is essential for efficient web usage mining. Session construction is a vital step in the preprocessing phase. Recently, various real-world problems have been modeled as traversals on a graph, and mining these traversals provides effective results. Existing work, however, has considered only traversals on unweighted graphs. This paper generalizes this to the case where the vertices of the graph are given weights to reflect their significance. Patterns are closed frequent Directed Acyclic Graphs with page browsing time. The proposed method constructs sessions using an efficient Directed Acyclic Graph approach in which pages carry calculated weights. A Hierarchical Directed Acyclic Graph (HDAG) Kernel approach is used for session construction. The HDAG Kernel directly accepts several levels of both chunks and their relations, and then efficiently computes the weighted sum of the number of common attribute sequences of the HDAGs. After each page is weighted according to browsing time, a DAG structure is constructed for each user session. This will help site administrators find the pages that interest users and redesign their web pages.
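The idea of weighting each page by browsing time and building one DAG per session can be illustrated with a minimal Python sketch. This is not the authors' HDAG Kernel method. The function name `build_session_dag`, the input layout, the normalization of weights to fractions of total session time, the `last_page_time` default, and the rule of dropping back-edges to keep the graph acyclic are all assumptions made for illustration.

```python
from collections import defaultdict


def build_session_dag(visits, last_page_time=0.0):
    """Build a weighted DAG for one user session (illustrative sketch).

    visits: list of (timestamp_seconds, page) tuples, sorted by time.
    last_page_time: assumed browsing time for the final page, which
        cannot be derived from the log itself (hypothetical parameter).
    Returns (weights, edges): a dict page -> fraction of session time,
    and a set of directed (source, target) edges.
    """
    if not visits:
        return {}, set()

    # Browsing time of a page = gap until the next request in the session.
    browsing = defaultdict(float)
    for (t, page), (t_next, _) in zip(visits, visits[1:]):
        browsing[page] += t_next - t
    browsing[visits[-1][1]] += last_page_time

    # Assumption: a page's weight is its share of total browsing time.
    total = sum(browsing.values())
    weights = {p: (d / total if total else 0.0) for p, d in browsing.items()}

    # Order pages by first visit; keep only edges that go forward in
    # that order, so the resulting graph cannot contain a cycle.
    first_seen = {}
    for index, (_, page) in enumerate(visits):
        first_seen.setdefault(page, index)

    edges = set()
    for (_, src), (_, dst) in zip(visits, visits[1:]):
        if first_seen[src] < first_seen[dst]:
            edges.add((src, dst))
    return weights, edges


# Example session: the user returns to /home before visiting /contact.
session = [(0, "/home"), (30, "/products"), (90, "/home"), (100, "/contact")]
weights, edges = build_session_dag(session)
```

In this example the return from `/products` to `/home` is a back-edge and is dropped, while the time spent on both visits to `/home` is accumulated into a single vertex weight.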
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chitra, S., Kalpana, B. (2013). Hierarchical Directed Acyclic Graph (HDAG) Based Preprocessing Technique for Session Construction. In: Meghanathan, N., Nagamalai, D., Chaki, N. (eds) Advances in Computing and Information Technology. Advances in Intelligent Systems and Computing, vol 177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31552-7_61
DOI: https://doi.org/10.1007/978-3-642-31552-7_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31551-0
Online ISBN: 978-3-642-31552-7
eBook Packages: Engineering, Engineering (R0)