Abstract
Current Web search engines discover new documents essentially by crawling hyperlinks with the aid of spider agents. When indexing the newly discovered documents, however, they revert to conventional information retrieval models and single-document indexing, thus neglecting the inherently hypertextual structure of Web documents. As a result, a query string that is partially present in one document, with the remaining part available in a linked document on the same site, may not produce a hit, which considerably reduces retrieval effectiveness. To overcome this and other limits we propose an approach based on temporal logic that, starting from a model of a web site as a finite state graph, allows one to define complex queries over hyperlinks with the aid of Computation Tree Logic (CTL) operators. Query formulation consists of two steps: the first is user-oriented and provides a friendly interface for posing queries; the second translates the query into CTL formulas. The CTL formulation is not visible to the user, who simply expresses his or her requirements in natural language. We implemented the proposed approach in a prototype system. Experimental results show an improvement in retrieval effectiveness.
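To make the idea of querying over hyperlinks concrete, the following is a minimal sketch and not the prototype described above. It assumes a hypothetical four-page site (the `LINKS` and `WORDS` tables and all function names are invented for illustration) and evaluates the CTL-style query "the page contains the first term, and some page reachable through links contains the second term", that is, `first AND EF second`. Dead-end pages are left without self-loops here, a simplification relative to the total transition relation usually assumed in CTL model checking.

```python
from typing import Dict, List, Set

# Hypothetical toy site (an assumption for illustration only).
# LINKS maps each page to the pages it links to; WORDS maps it to its terms.
LINKS: Dict[str, List[str]] = {
    "index": ["products", "contact"],
    "products": ["specs"],
    "specs": [],
    "contact": [],
}
WORDS: Dict[str, Set[str]] = {
    "index": {"welcome", "laptop"},
    "products": {"laptop", "catalog"},
    "specs": {"battery", "warranty"},
    "contact": {"email"},
}


def sat_atom(word: str) -> Set[str]:
    """Pages whose own text contains the given word (atomic proposition)."""
    return {page for page, words in WORDS.items() if word in words}


def sat_ex(target: Set[str]) -> Set[str]:
    """EX target: pages with at least one outgoing link into target."""
    return {
        page
        for page, successors in LINKS.items()
        if any(s in target for s in successors)
    }


def sat_ef(target: Set[str]) -> Set[str]:
    """EF target: least fixpoint of Z = target union EX(Z)."""
    reached = set(target)
    while True:
        grown = reached | sat_ex(reached)
        if grown == reached:
            return reached
        reached = grown


def hits(first: str, second: str) -> Set[str]:
    """Pages satisfying: first AND EF second."""
    return sat_atom(first) & sat_ef(sat_atom(second))


def single_page_hits(first: str, second: str) -> Set[str]:
    """Baseline: pages that contain both words by themselves."""
    return sat_atom(first) & sat_atom(second)
```

On this toy site no single page contains both "laptop" and "warranty", so `single_page_hits("laptop", "warranty")` is empty, while `sorted(hits("laptop", "warranty"))` gives `['index', 'products']`, because both pages mention "laptop" and can reach the "specs" page by following links.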
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Di Sciascio, E., Donini, F.M., Mongiello, M. (2002). I-Search: A System for Intelligent Information Search on the Web. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_18
DOI: https://doi.org/10.1007/3-540-48050-1_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43785-7
Online ISBN: 978-3-540-48050-1
eBook Packages: Springer Book Archive