Abstract
Due to the increasingly massive amounts of data that need to be analyzed in digital forensic investigations, it is necessary to automatically recognize suspect files and filter out irrelevant files. To achieve this goal, digital forensic practitioners employ hashing algorithms to classify files into known-good, known-bad and unknown files. However, a typical personal computer may store hundreds of thousands of files, so hashing all of them becomes extremely time-consuming. This paper attempts to address the problem using a framework that speeds up processing by using multiple threads. Unlike a typical multithreading approach, where multiple threads only execute the hashing algorithm, the proposed framework incorporates a dedicated prefetcher thread that reads files from a device. Experimental results demonstrate a runtime improvement of nearly 40% over single threading.
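The division of labor described in the abstract can be illustrated with a minimal sketch: one prefetcher thread is the only reader of the storage device and fills a bounded in-memory buffer, while several worker threads hash the buffered contents. This is not the authors' implementation. The function names, the choice of SHA-1, the whole-file reads, the buffer size and the known-good and known-bad digest sets are all assumptions made for illustration.

```python
import hashlib
import queue
import threading
from pathlib import Path

_DONE = object()  # sentinel that tells a hashing worker to stop


def prefetcher(paths, buffer, n_workers):
    """Dedicated reader: the only thread that accesses the storage device."""
    try:
        for path in paths:
            # Assumption: each file fits in memory; put() blocks while the buffer is full.
            buffer.put((path, Path(path).read_bytes()))
    finally:
        # Always release the workers, even if a read fails.
        for _ in range(n_workers):
            buffer.put(_DONE)


def hasher(buffer, results, lock):
    """Worker: hashes file contents that are already in memory."""
    while True:
        item = buffer.get()
        if item is _DONE:
            break
        path, data = item
        digest = hashlib.sha1(data).hexdigest()  # SHA-1 is an illustrative choice
        with lock:
            results[path] = digest


def hash_files(paths, n_workers=2, buffer_size=8):
    """Return {path: hex digest} using one prefetcher and n_workers hashers."""
    buffer = queue.Queue(maxsize=buffer_size)
    results, lock = {}, threading.Lock()
    threads = [threading.Thread(target=hasher, args=(buffer, results, lock))
               for _ in range(n_workers)]
    threads.append(threading.Thread(target=prefetcher,
                                    args=(paths, buffer, n_workers)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def classify(digest, known_good, known_bad):
    """Map a digest to one of the three categories named in the abstract."""
    if digest in known_bad:
        return "known-bad"
    if digest in known_good:
        return "known-good"
    return "unknown"
```

Keeping all device reads in a single thread avoids several threads competing for the same device, and the bounded queue caps memory use when reading outpaces hashing. In CPython, hashlib releases the global interpreter lock while hashing larger buffers, so the workers can run concurrently with the reader. Any speedup depends on the device and workload, and this sketch makes no attempt to reproduce the figure reported in the abstract.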
© 2013 IFIP International Federation for Information Processing
About this paper
Cite this paper
Breitinger, F., Petrov, K. (2013). Reducing the Time Required for Hashing Operations. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics IX. DigitalForensics 2013. IFIP Advances in Information and Communication Technology, vol 410. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41148-9_7
DOI: https://doi.org/10.1007/978-3-642-41148-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41147-2
Online ISBN: 978-3-642-41148-9
eBook Packages: Computer Science (R0)