Efficient Top-<i>k</i> Query Processing on Uncertain Temporal Data

Computer Science ›› 2020, Vol. 47 ›› Issue (9): 67-73.doi: 10.11896/jsjkx.190800143

• Database & Big Data & Data Science • Previous Articles     Next Articles

Efficient Top-k Query Processing on Uncertain Temporal Data

WEI Jian-hua, XU Jian-qiu   

  1. Department of Computer Science and Technology,Nanjing University of Aeronautics and Astronautics,Nanjing 210016,China
  • Received:2019-08-29 Published:2020-09-10
  • About author:WEI Jian-hua,born in 1994,postgradua-te.Her main research interests include temporal data and uncertainty.
    XU Jian-qiu,born in 1982,Ph.D,professor,is a member of China Computer Federation.His main research interests include mobile objects,and spatio-temporal data.
  • Supported by:
    Natural Science Fundation of China (61972198).

Abstract: Temporal data is widely used in many applications such as medical,economic and e-commerce.The uncertainty is mainly caused by factors such as inaccurate measurement techniques.This paper studies top-k queries over uncertain temporal data.Such a query returns Top-k intervals with the largest scores which are calculated by a function combining the original weight of the data and the probability of intersection with the query data.To answer the query efficiently,this paper proposes a 2D R-tree based on the relational model and auxiliary structures.The relational model is used to manage all intervals,and the auxiliary structure is used to manage the order of the weights of each node in the R-tree.Based on the proposed index structure,a query algorithm for accessing data in descending order by weights is proposed.It traverses the R-tree from the root node.For each node that intersects with the query point,the item with the largest weight in it can be found according to the information stored in the auxiliary structure,and it is determined as the next accessed object.This paper uses synthetic datasets with data sizes ranging from 300000 to 10 million,and a real dataset of a flight information with size of 3.2 million.In the extensible database system SECONDO,the proposed method is compared with the unindexed method,R-tree,and interval tree,and the average I/O access times and CPU time are used as the indicators of the experimental results.The experimental results show that the proposed approach outperforms baseline methods by 2 to 3 orders of magnitude using 10 million intervals.Comparing the probabilities and weights of the k results with the results of all intersecting data,it is found that the probabilities and weights of the k results are close to the maximum value of the actual intersecting data,so the proposed algorithm is feasible and effective.

Key words: Interval data, Top-k, Uncertainty

CLC Number: 

  • TP391
[1] LI R,ZHANG X,ZHOU X,et al.INK:A Cloud-Based System for Efficient Top-k Interval Keyword Search[C]//Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management.New York,2014:2003-2005.
[2] SNODGRASS R T,AHN I.A Taxonomy of Time in Databases.[J].ACM,1985,14(4):236-246.
[3] UYSAL M.Efficiently Processing Queries on Interval-and-Va-lue Tuples in Relational Databases[C]//International Confe-rence on Very Large Data Bases.DBLP,2005:385-396.
[4] ARGE L,VITTER J S.Optimal external memory interval ma-nagement [J].SIAMJ.2003,32(6):1488-1508.
[5] HAMPEL M.Joining Interval Data in Relational Databases[C]//ACM SIGMOD International Conference on Management of Data.DBLP,2004:683-694.
[6] LAYER R M,SKADRON K,ROBINS G,et al.Binary Interval Search:a scalable algorithm for counting interval intersections[J].Bioinformatics,2013,29(1):1-7.
[7] AGARWAL P K,ARGE L,YI K.An optimal dynamic interval stabbing-max data structure[C]//Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms.Vancouver,SODA,2005:803-812.
[8] SFAKIANAKIS G,PATLAKAS I,NTARMOS N,et al.Interval indexing and querying on key-value cloud stores[C]//Proceedings of IEEE 29th International Conference on Data Engineering.Brisbane:IEEE Press,2013:805-816.
[9] XU J Q,LU H.Efficiently answer top-k queries on typed intervals[J].Information Systems,2017,71:164-181.
[10] LI S,WANG H.Dispersing points on intervals [J].Discrete Applied Mathematics,2018,239:106-118.
[11] COWLEY W,PLEXOUSAKIS D.An Interval Algebra for Indeterminate Time[C]//Seventeenth National Conference on Artificial Intelligence.AAAI,2000:470-475.
[12] PLESNIEWICZ G S,VU NGUEN T M,DMITRY M.Queryanswering over fact bases for fuzzy interval ontologies[C]//Proceedings of the 2nd International Workshop on Soft Computing Applications and Knowledge Discovery.Moscow,2016:93-100.
[13] WU W Y,LI Y,NI Z W,et al.Probabilistic interval-valued hesitant fuzzy information aggregation operators and their application to multi-attribute decision making[J].Algorithms,2018,11(8):120-128.
[14] ZHANG S C.Interval-Gap-Based Representation of Temporal Knowledge[J].Journal of Software,1994,5(6):49-54.
[15] LIN J Y,PENG H,XIE J M.Uncertain temporal information representation and the extensions of temporal operation[J].Computer Science,2005,32(8):161-163.
[16] KATERINA P,MARTIN T,MICHAEL B.Supporting Set Op-erations in Temporal-Probabilistic Databases[C]//34th IEEE.
International Conference on Data Engineering(ICDE).2018:1180-1191.
[17] BLANKENAGEL G.External segment trees[J].Algorithmica,1994,12(6):498-532.
[18] IMAI H,ASANO T.Dynamic orthogonal segment intersection search[J].Journal of Algorithms,1987,8(1):1-18.
[19] MCCREIGHT E M.Priority search trees[J].SIAM J,1985,14(2):257-276.
[20] GUY D T MOL R D,BRONSELAER A.Indexing Possibilistic Numerical Data:The Interval B+-tree Approach[M]//Information Processing and Management of Uncertainty in Knowledge-Based Systems.Springer International Publishing,2016:305-316.
[21] MEI W,MENG X,PENG S,et al.A hybrid index for temporal big data[J].China Modern Medicine,2017,72:264-272.
[22] KRIEGEL H P,PÖTKE,MARCO,et al.Managing IntervalsEfficiently in Object-Relational Databases[C]//International Conference on Very Large Data Bases.Morgan Kaufmann Publishers Inc,2000:407-418.
[23] BLIUJUTE R,JENSEN C S,SALTENIS S,et al.R-tree Based Indexing of Now-Relative Bitemporal Data[C]//International Conference on Very Large Data Bases.Morgan Kaufmann Publishers Inc.,1998:345-356.
[24] ZENG X J.Temporal index trees for uncertain temporal information [J].Microcomputer Information,2011(5):236-237.
[25] GÜTING R H,BEHR T,DÜNTGEN C.SECONDO:A plat-form for moving objects database rsearch and for publishing and integrating research implementations[J].IEEE Data Eng,2010,33(2):56-63.
[1] LIN Chao-wei, LIN Bing, CHEN Xing. Study on Scientific Workflow Scheduling Based on Fuzzy Theory Under Edge Environment [J]. Computer Science, 2022, 49(2): 312-320.
[2] MU Cong-cong, WANG Yi-shu, YUAN Ye, QIAO Bai-you, MA Yu-liang. Top-k Densest Subgraphs Search in Temporal Graphs [J]. Computer Science, 2021, 48(10): 152-159.
[3] YANG Jie,WANG Guo-yin,LI Shuai. Neighborhood Knowledge Distance Measure Model Based on Boundary Regions [J]. Computer Science, 2020, 47(3): 61-66.
[4] YANG Wen-hua,XU Chang,YE Hai-bo,ZHOU Yu,HUANG Zhi-qiu. Taxonomy of Uncertainty Factors in Intelligence-oriented Cyber-physical Systems [J]. Computer Science, 2020, 47(3): 11-18.
[5] Renata WONG. Uncertainty Principle as Related to Quantum Computation [J]. Computer Science, 2020, 47(1): 40-50.
[6] YUE Chuan, PENG Xiao-hong. Evaluation Model of Software Quality with Interval Data [J]. Computer Science, 2019, 46(10): 209-214.
[7] XU Hua-jie, WU Qing-hua, HU Xiao-ming. Privacy Protection Algorithm Based on Multi-characteristics of Trajectory [J]. Computer Science, 2019, 46(1): 190-195.
[8] YANG Jie, WANG Guo-yin, ZHANG Qing-hua, FENG Lin. Uncertainty Measure of Rough Fuzzy Sets in Hierarchical Granular Structure [J]. Computer Science, 2019, 46(1): 45-50.
[9] ZHENG Hong-liang, HOU Xue-hui, SONG Xiao-ying, PANG Kuo, ZOU Li. Approach for Knowledge Reasoning Based on Hesitate Fuzzy Credibility [J]. Computer Science, 2019, 46(1): 131-137.
[10] ZHOU Ming-quan, JIANG Guo-hua. New Spectrum-based Fault Localization Method Combining HittingSet and Genetic Algorithm [J]. Computer Science, 2018, 45(9): 207-212.
[11] WANG Lin-na, YANG Xin, YANG Xi-bei. Supervised Neighborhood Rough Set [J]. Computer Science, 2018, 45(8): 186-190.
[12] WU Jian-hui, HUANG Zhong-xiang, LI Wu, WU Jian-hui, PENG Xin and ZHANG Sheng. Robustness Optimization of Sequence Decision in Urban Road Construction [J]. Computer Science, 2018, 45(4): 89-93.
[13] MAO Ying-chi and CHEN Yang. Uncertain Vehicle Intersection Trajectory Prediction [J]. Computer Science, 2018, 45(3): 235-240.
[14] DAI Hua, BAO Jing-jing, ZHU Xiang-yang, YI Xun, YANG Geng. Integrity-verifying Single Keyword Search Method in Clouds [J]. Computer Science, 2018, 45(12): 92-97.
[15] ZHANG Xiao-yan, SANG Bin-bin and WEI Ling. Fuzzy Entropy and Uncertain Measurement in Lattice-valued Information Systems [J]. Computer Science, 2017, 44(9): 88-92.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!