Abstract
Global query execution in a multidatabase system can be done parallelly, as all the local databases are independent. In this paper, a cost model that considers parallel execution of subqueries for a global query is developed. In order to obtain maximum parallelism in query execution, it is required to find a query execution plan that is represented in the form of a bushy tree and this query tree should be balanced to the maximal possible extent with respect to execution time. A new bottom up approach called Agglomerative Approach (AA) is proposed to construct balanced bushy trees with respect to execution time. By the deterministic nature of this approach, it generates local optimal solutions. This local minima problem will be severe in the case of graph queries, i.e., queries that are represented with a graph structure. A Simulated annealing Approach (SA) is employed to obtain a (near) optimal solution. These approaches (AA and SA) are suitable for handling on-line and off-line queries respectively. A Hybrid Approach (HA), that is an integration of AA and SA, is proposed to optimize queries for which the estimated time to be spent on optimization is known a priori. Results obtained with AA and SA on both tree and graph structured queries are presented.
Similar content being viewed by others
References
E. Aarts and J. Korst, Simulated Annealing and Boltzman Machine: A Stochastic Approach to Combinatorial Optimization, John Wiley and Sons: New York, 1989.
P.M.G. Apers, A.R. Hevener, and S.B. Yao, “Optimization algorithms for distributed queries,” IEEE Transactions on Software Engineering, vol. 9, no. 1, 1983.
K. Bennett, M.C. Ferris, and Y.E. Ioannidis, “A genetic algorithm for database query optimization,” in 4th International Conference on Genetic Algorithms and Applications, Morgan Kaufman: San Mateo, CA, 1991.
Y. Breitbart, H. Gracia-Molina, and A. Silberschatz, “Overview of multidatabase transaction management,” Technical Report STAN-CS–92–1432, Stanford University, May 1992.
M.W. Bright, A.R. Hurson, and S.H. Pakzad, “A taxonomy and current issues in multidatabase systems,” IEEE Computer, pp. 50–60, March 1992.
D. Brill, M. Templeton, and C.T. Yu, “Distributed query processing strategies in mermaid, a frontend to data management systems,” in International Conference on Data Engineering, 1984.
G.V. Bultzingsloewen, “Optimization SQL queries for parallel execution,” ACM SIGMOD Record, vol. 18, no. 4, pp. 17–22, 1989.
S. Ceri and G. Pelagatti, Distributed Databases: Principles and Systems, McGraw-Hill Book Company, 1984.
A.L.P. Chen, D. Brill, M. Templeton, and C.T. Yu, “Distributed query processing in multiple database system,” IEEE Journal on Selected Areas in Communications, vol. 7, no. 3, pp. 390–398, April 1989.
U. Dayal, “Query processing in multidatabase system,” in Query Processing in Database Systems, W. Kim, D. Reiner, and D. Batory (Eds.), Springer Verlag, 1985, pp. 81–108.
D.J. Dewih and J. Goray, “Parallel database systems: The future of database processing or a passing fad,” ACM SIGMOD Record, vol. 19, no. 4, pp. 104–112, 1990.
W. Du, R. Krishnamurthy, and M.C. Shan, “Query optimization in heterogeneous DBMS,” in 18th International Conference on VLDB, 1992, pp. 277–291.
W. Du, M. Shan, and U. Dayal, “Reducing multidatabase query response time by tree balancing,” in Proc. of the ACM SIGMOD, 1995.
A.K. Elmagarmid and M. Rusinkiewicz, “Critical issues in multidatabase systems,” Information Sciences, vol. 57–58, pp. 403–424, 1991.
C. Evrendilek et al., “Query optimization in multidatabase systems,” in Proc. of Next Generation Information Technologies, 1995, pp. 49–58.
S. Ganguly, W. Hasan, and R. Krishnamurthy, “Query optimization for parallel execution,” ACM SIGMOD Record, pp. 9–18, 1992.
D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley: Reading MA, 1989.
G. Graefe, “Query evaluation techniques for large databases,” ACM Computing Surveys, vol. 25, no. 2, 1993.
G. Graefe and K. Ward, “Dynamic query evaluation plans,” in Proc. of ACM SIGMOD Conf., 1989.
G. Graefe and W.J. Mckenna, “The volcano optimizer generator: Extensibility and efficient search,” in Proc. of IEEE Intl. Conf. on Data Engg., 1993.
T. Ibaraki and T. Kameda, “Optimal nesting for computing n-relational joins,” ACM Transactions on Database Systems, vol. 9, no. 3, pp. 482–502, 1984.
Y.E. Ioannidis and E. Wong, “Query optimization by simulated annealing,” in ACM SIGMOD Conference on the Management of Data, May 1987, pp. 9–22.
Y.E. Ioannidis and Y. Kang, “Randomized algorithms for optimizing large join queries,” in ACM SIGMOD Conference on Management of Data, May 1990, pp. 312–321.
Y.E. Ioannidis and Y.C. Kang, “Left-deep vs. bushy trees: An analysis of strategy spaces and its implications for query optimization,” in ACM SIGMOD International Conference on Management of Data, 1991, pp. 168–177.
A.K. Jain and R.C. Dubes, Algorithms for Clustering Data, Prentice Hall: Englewood Cliffs, NJ 07632, 1989.
S. Kirkpatrick, C.D. Gellat, and M.P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, pp. 671–680, May 1983.
Knuth, Fundamental Algorithms, Addison Wesley, 1992.
R.E. Korf, “Depth first iterative deepening: An optimal admissible tree search,” Artificial Intelligence, vol. 27, pp. 97–109, 1985.
R. Krishnamurthy, H. Boral, and C. Zaniolo, “Optimization of nonrecursive queries,” in 12th International Conference on VLDB, Kyoto, Aug. 1986, pp. 128–137.
R.S.G. Lanzelotte and P. Valduriez, “Extending the search strategy in query optimizer,” in 17th International Conference on VLDB, Sept. 1991, pp. 363–373.
E. Lim and J. Srivastava, “Query optimization and processing in federated database systems,” in 2nd International Conference on Information and Knowledge Management, Washington, D.C., 1993.
G.M. Lohman, C. Mohan, L.M. Haas, D. Daniels, B.G. Lindsay, P.G. Selinger, and P.F. Wilms, “Query processing in r*, in Query Processing in Database Systems, W. Kim, D. Reiner, and D. Batory (Eds.), Springer Verlag, 1985, pp. 31–47.
H. Lu, M.C. Shan, and K.L. Tan, “Optimization of multi-way join queries for parallel execution,” in 17th International Conference on VLDB, 1991.
H. Lu, B. Ooi, and C. Goh, “On global multidatabase query optimization,” SIGMOD RECORD, vol. 21, no. 4, pp. 6–11, 1992.
L.F. Mackert and G.M. Lohman, “R* Optimizer validation and performance evaluation for distributed queries,” in 12th International Conference on VLDB, Kyoto, Aug. 1986, pp. 149–157.
S. Nahar, S. Sahni, and E. Shragowitz, “Simulated annealing and combinatorial optimization,” in 23rd Design Automation Conference, 1986, pp. 293–299.
S. Shekhar, J. Srivastava, and S. Dutta, “A formal model of trade-off between optimization and execution costs in semantic query optimization,” in International Conference on VLDB, 1988, pp. 457–467.
J. Srivastava and Elsessor, “Optimizing multi-join queries in parallel relational databases,” in 2nd International Conference on Parallel and Distributed Information Systems, 1993.
A. Swami, “Optimization of large join queries: Combining heuristics and combinatorial techniques,” in ACM SIGMOD International Conference on Management of Data, 1989, pp. 367–376.
A. Swami and A. Gupta, “Optimization of large join queries,” in ACM SIGMOD International Conference on Management of Data, 1988, pp. 8–17.
M. Templeton, D. Brill, A.L.P. Chen, S. Dao, and E. Lund, “Mermaid-Experiences with network operation,” in Proc. of the IEEE Data Engg. Conf., 1986.
M. Templeton et al., “Mermaid-A front end to distributed heterogeneous databases,” Proceedings of IEEE, May 1987, vol. 75, no. 5, pp. 695–708.
J. Wolf, D. Dias, and P. Yu, “An effective algorithm for parallelizing sort merge joins in the presence of data skew,” in Proc. of the 2nd Intl. Symp. on Databases in Parallel and Distributed Systems, 1990.
C.T. Yu, C.C. Chang, M. Templeton, D. Brill, and E. Lund, “Query processing in a fragmented relational distributed database system: Mermaid,” IEEE Transactions on Software Engineering, SE-11, Aug. 1985.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Subramanian, D., Subramanian, K. Query Optimization in Multidatabase Systems. Distributed and Parallel Databases 6, 183–210 (1998). https://doi.org/10.1023/A:1008691331104
Issue Date:
DOI: https://doi.org/10.1023/A:1008691331104