Management of interval probabilistic data | Acta Informatica

Management of interval probabilistic data

  • Original Article

Abstract

In this paper we present a data model for uncertain data, where uncertainty is represented using interval probabilities. The theory introduced in the paper can be applied to different specific data models, because the entire approach has been developed independently of the kind of manipulated objects, like XML documents, relational tuples, or other data types. As a consequence, our theory can be used to extend existing data models with the management of uncertainty. In particular, the data model we obtain as an application to XML data is the first proposal that combines XML, interval probabilities and a powerful query algebra with selection, projection, and cross product. The cross product operator is not based on assumptions of independence between XML trees from different collections. Being defined with a possible worlds semantics, our operators are proper extensions of their traditional counterparts, and reduce to them when there is no uncertainty. The main practical result of the paper is a set of equivalences that can be used to compare or rewrite algebraic queries on interval probabilistic data, in particular XML and relational.
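
The abstract does not give the operator definitions, so the following is only a minimal sketch of the general idea, not the authors' algebra. It assumes that each object carries a closed probability interval and that pairs are combined with the standard Fréchet bounds, which hold without any independence assumption. The names Interval, conjunction, and cross_product are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Interval:
    """Closed probability interval [lo, hi] with 0 <= lo <= hi <= 1."""
    lo: float
    hi: float

    def __post_init__(self):
        if not (0.0 <= self.lo <= self.hi <= 1.0):
            raise ValueError(f"invalid probability interval: [{self.lo}, {self.hi}]")


def conjunction(a: Interval, b: Interval) -> Interval:
    """Frechet bounds on P(A and B) when the dependence between A and B is unknown.

    This is a standard bound, used here as an assumption; the paper may
    define its combination differently.
    """
    return Interval(max(0.0, a.lo + b.lo - 1.0), min(a.hi, b.hi))


def cross_product(left, right):
    """Pair every object of `left` with every object of `right`.

    Each collection is a list of (object, Interval) pairs; the pair's
    interval is the independence-free conjunction of its components.
    """
    return [((x, y), conjunction(p, q)) for (x, p), (y, q) in product(left, right)]


# Two hypothetical collections of uncertain objects.
books = [("book-1", Interval(0.75, 1.0))]
authors = [("author-1", Interval(0.5, 0.75))]
pairs = cross_product(books, authors)

# With no uncertainty (every interval is [1, 1]) the result is the
# ordinary cross product, each pair being certain.
certain = Interval(1.0, 1.0)
plain = cross_product([("a", certain)], [("b", certain), ("c", certain)])
```

In this sketch the pair ("book-1", "author-1") receives the interval [0.25, 0.75], which is wider than the single value a product of point probabilities would give under independence. That widening is the price of making no dependence assumption.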

Author information

Corresponding author

Correspondence to Matteo Magnani.

About this article

Cite this article

Magnani, M., Montesi, D. Management of interval probabilistic data. Acta Informatica 45, 93–130 (2008). https://doi.org/10.1007/s00236-007-0065-9

  • DOI: https://doi.org/10.1007/s00236-007-0065-9
