Abstract
We use Conceptual Graphs (CGs) to model web content extraction rules (CG-Wrappers). The approach incorporates all major existing extraction techniques and allows cooperative wrappers to be combined to handle complex extraction tasks, without requiring programming.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kokkoras, F., Bassiliades, N., Vlahavas, I. (2007). Cooperative CG-Wrappers for Web Content Extraction. In: Priss, U., Polovina, S., Hill, R. (eds) Conceptual Structures: Knowledge Architectures for Smart Applications. ICCS 2007. Lecture Notes in Computer Science, vol 4604. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73681-3_38
DOI: https://doi.org/10.1007/978-3-540-73681-3_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73680-6
Online ISBN: 978-3-540-73681-3
eBook Packages: Computer Science, Computer Science (R0)