Abstract
Nowhere are the ‘growing pains’ of Very Large-scale Digital Libraries more pronounced than in collections containing multimedia data. Not only do such collections contain large numbers of items, but they also push the boundaries of scale in terms of storage space and processing expense. In this paper we explore how open-source parallel processing libraries and techniques, previously developed for and applied to textual content, can benefit multimedia digital libraries. We provide a real-world use case of ingesting video into the ReplayMe! system, an extension of the Greenstone digital library software that simultaneously records and ingests all of the free-to-air television channels available in New Zealand. Current ingest of video in ReplayMe! is intentionally light due to processing time constraints on the single-processor architecture it was developed on. The work reported here investigates how this system can be scaled up to include the conversion of the broadcast video transport format to a suitable streaming format (MP4) and to automatically extract content-analysis-based keyframes, while still performing in real time. By applying parallel processing, and utilizing a cluster of sixteen desktop computers, the paper shows how this processing time can be significantly reduced compared to the equivalent computation conducted serially. We then generalize the work, and show how the same basic techniques can be applied to other common digital library software such as DSpace to provide similar advantages when dealing with processor-intensive content.
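The contrast between serial and parallel ingest described above can be illustrated with a minimal sketch. It is not the ReplayMe! or Greenstone implementation: a thread pool on one machine stands in for the sixteen-node cluster, and the `ingest` function, the `.ts` file names, and the short sleep that stands in for transcoding and keyframe extraction are all hypothetical placeholders introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def ingest(recording: str) -> dict:
    """Hypothetical stand-in for the ingest work done on one recording."""
    # Placeholder for the expensive steps (format conversion, keyframe
    # extraction); a real system would do the actual processing here.
    time.sleep(0.01)
    # Assumed naming scheme: swap the source extension for ".mp4".
    return {"source": recording, "mp4": recording.rsplit(".", 1)[0] + ".mp4"}


def ingest_serial(recordings):
    """Process recordings one after another."""
    return [ingest(r) for r in recordings]


def ingest_parallel(recordings, workers=16):
    """Process recordings concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ingest, recordings))


if __name__ == "__main__":
    recordings = [f"channel{i}.ts" for i in range(16)]  # hypothetical names
    start = time.perf_counter()
    ingest_serial(recordings)
    serial_time = time.perf_counter() - start
    start = time.perf_counter()
    ingest_parallel(recordings)
    parallel_time = time.perf_counter() - start
    print(f"serial {serial_time:.2f}s, parallel {parallel_time:.2f}s")
```

Because each recording is processed independently, the work divides cleanly among workers. On a real cluster the same division of labour would be spread across machines rather than threads, and the achievable speed-up would depend on factors this sketch ignores, such as data transfer and uneven recording lengths.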
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thompson, J., Bainbridge, D., Roüast, M. (2012). Parallel Processing Videos in Very Large Digital Libraries. In: Chen, HH., Chowdhury, G. (eds) The Outreach of Digital Libraries: A Globalized Resource Network. ICADL 2012. Lecture Notes in Computer Science, vol 7634. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34752-8_28
DOI: https://doi.org/10.1007/978-3-642-34752-8_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34751-1
Online ISBN: 978-3-642-34752-8
eBook Packages: Computer Science, Computer Science (R0)