Abstract
In this paper we survey algorithmic aspects of Web information retrieval. As an example, we discuss ranking of search engine results using connectivity analysis.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Henzinger, M. (2000). Web Information Retrieval - an Algorithmic Perspective. In: Paterson, M.S. (eds) Algorithms - ESA 2000. ESA 2000. Lecture Notes in Computer Science, vol 1879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45253-2_1
DOI: https://doi.org/10.1007/3-540-45253-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41004-1
Online ISBN: 978-3-540-45253-9
eBook Packages: Springer Book Archive