Abstract
Most conventional video processing platforms treat the database merely as a storage engine rather than a computation engine, which causes inefficient data access and massive data movement. To provide a convergent platform, we push video processing down into the database engine using User Defined Functions (UDFs).
However, existing UDF technology suffers from two major limitations. First, a UDF cannot take a set of tuples as input or produce one as output, which restricts its modeling capability for complex applications; moreover, tuple-wise pipelined UDF execution is often inefficient and rules out data-parallel computation inside the function. Second, UDFs coded in a non-SQL language such as C either involve hard-to-follow DBMS-internal system calls for interacting with the query executor, or sacrifice performance by converting input objects to strings.
To solve these problems, we realized the notion of the Relation Valued Function (RVF) in an industry-scale database engine. With tuple-set input and output, an RVF offers enhanced modeling power, greater efficiency, and the potential for in-function data-parallel computation. To let RVF execution interact with the query engine efficiently, we introduced the notion of RVF invocation patterns and, based on them, developed RVF containers for focused system support.
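The difference between a tuple-wise UDF and an RVF can be illustrated outside any DBMS. The Python sketch below is illustrative only and is not the paper's implementation or a Postgres API: the row schema, the names `score_tuple` and `score_relation`, and the linear scoring model are all hypothetical. It shows that a function receiving the whole tuple set can return a relation whose rows depend on the set as a whole (here, a rank), which a function called once per tuple cannot compute.

```python
from typing import Iterable, List, Tuple

# Hypothetical schema: (frame_id, feature vector).
Row = Tuple[int, List[float]]

# Hypothetical model parameters, chosen only for illustration.
WEIGHTS = [0.5, -1.0, 2.0]
BIAS = 0.25


def score_tuple(row: Row) -> float:
    """Scalar-UDF style: sees exactly one tuple per call."""
    _, features = row
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def score_relation(rows: Iterable[Row]) -> List[Tuple[int, float, int]]:
    """RVF style: takes a set of tuples and returns a set of tuples.

    Each output row is (frame_id, score, rank). The rank needs the
    whole input set, so it cannot be produced tuple by tuple.
    """
    scored = [(frame_id, score_tuple((frame_id, feats)))
              for frame_id, feats in rows]
    ranked = sorted(scored, key=lambda r: r[1], reverse=True)
    return [(frame_id, score, rank)
            for rank, (frame_id, score) in enumerate(ranked, start=1)]


frames = [
    (1, [1.0, 0.0, 0.0]),
    (2, [0.0, 1.0, 0.0]),
    (3, [0.0, 0.0, 1.0]),
]
result = score_relation(frames)
```

In this sketch the per-tuple function is reused inside the set-based one; the point is only the change in signature from one tuple in, one value out, to one relation in, one relation out.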
We have prototyped these mechanisms on the Postgres database engine and tested them with Support Vector Machine (SVM) classification and learning, the most widely used analytics model for video understanding. Our experience reveals the value of the proposed approach along multiple dimensions: modeling capability, efficiency, in-function data parallelism on multi-core CPUs, and usability, all of which are fundamental to converging data-intensive analytics and data management.
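As a rough illustration of in-function data parallelism, the Python sketch below partitions an input tuple set into chunks and evaluates a linear SVM decision function on each chunk concurrently. It is a sketch under stated assumptions, not the prototype described above: the weights `W`, the bias `B`, and the names `classify_chunk` and `classify_relation` are hypothetical. It uses a thread pool only to keep the example self-contained; in CPython, threads do not give true multi-core speed-up for pure-Python arithmetic, so a real implementation would rely on native threads or processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

# Hypothetical schema: (frame_id, feature vector).
Row = Tuple[int, List[float]]

# Hypothetical linear SVM parameters, for illustration only.
W = [1.0, -1.0]
B = 0.0


def classify_chunk(chunk: Sequence[Row]) -> List[Tuple[int, int]]:
    """Label each row in one partition with +1 or -1 by the sign of w.x + b."""
    out = []
    for frame_id, x in chunk:
        margin = sum(w * xi for w, xi in zip(W, x)) + B
        out.append((frame_id, 1 if margin >= 0 else -1))
    return out


def classify_relation(rows: Sequence[Row], workers: int = 2) -> List[Tuple[int, int]]:
    """Split the input set into up to `workers` chunks and classify them concurrently.

    Assumes workers >= 1. Output order matches input order because
    Executor.map yields results in the order of its inputs.
    """
    if not rows:
        return []
    size = -(-len(rows) // workers)  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(classify_chunk, chunks)
        return [row for part in parts for row in part]


rows = [
    (1, [2.0, 1.0]),
    (2, [0.0, 3.0]),
    (3, [1.0, 1.0]),
    (4, [-1.0, 0.5]),
    (5, [4.0, 0.0]),
]
labels = classify_relation(rows, workers=2)
```

Partitioning is possible here only because the function receives the whole tuple set at once; a function invoked once per tuple has nothing to partition.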
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, Q., Hsu, M., Liu, R., Wang, W. (2009). Scaling-Up and Speeding-Up Video Analytics Inside Database Engine. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_19
DOI: https://doi.org/10.1007/978-3-642-03573-9_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03572-2
Online ISBN: 978-3-642-03573-9
eBook Packages: Computer Science (R0)