Abstract
Computation time is an important performance characteristic of computer vision algorithms. The paper shows how existing (slow) binary decision algorithms can be approximated by a (fast) trained WaldBoost classifier.
WaldBoost learning minimises the decision time of the classifier while guaranteeing predefined precision. We show that the WaldBoost algorithm together with bootstrapping is able to efficiently handle an effectively unlimited number of training examples provided by the implementation of the approximated algorithm.
Two interest point detectors, the Hessian-Laplace and the Kadir-Brady saliency detectors, are emulated to demonstrate the approach. Experiments show that while the repeatability and matching scores are similar for the original and emulated algorithms, a 9-fold speed-up for the Hessian-Laplace detector and a 142-fold speed-up for the Kadir-Brady detector are achieved. For the Hessian-Laplace detector, the achieved speed is similar to that of SURF, a popular and very fast handcrafted modification of Hessian-Laplace; however, the WaldBoost emulator approximates the output of the Hessian-Laplace detector more precisely than SURF does.
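To illustrate why a sequential classifier of the WaldBoost kind can be fast, the following minimal Python sketch evaluates weak classifiers one at a time and stops as soon as the accumulated response crosses a per-step threshold. It is an illustration under stated assumptions, not the authors' implementation: the function name `sequential_decide`, the toy weak classifiers, and all threshold values are hypothetical, and training (selecting weak classifiers and thresholds, bootstrapping new examples) is not shown.

```python
from typing import Callable, Sequence, Tuple


def sequential_decide(
    x: float,
    weak: Sequence[Callable[[float], float]],
    theta_a: Sequence[float],
    theta_b: Sequence[float],
    gamma: float = 0.0,
) -> Tuple[int, int]:
    """Return (decision, number of weak classifiers evaluated).

    Hypothetical sketch: after each weak classifier the running sum h is
    compared with a lower threshold theta_a[t] (early reject, -1) and an
    upper threshold theta_b[t] (early accept, +1). If neither is crossed,
    evaluation continues; after the last step h is compared with gamma.
    """
    h = 0.0
    steps = 0
    for f, a, b in zip(weak, theta_a, theta_b):
        h += f(x)          # add the response of the next weak classifier
        steps += 1
        if h >= b:         # confident positive: stop early
            return 1, steps
        if h <= a:         # confident negative: stop early
            return -1, steps
    # No early decision was reached: fall back to a final threshold.
    return (1 if h > gamma else -1), steps


# Toy setup (all values are made up for illustration).
weak = [lambda x: x, lambda x: 2 * x, lambda x: -x]
theta_a = [-1.0, -2.0, -3.0]
theta_b = [1.0, 2.0, 3.0]

easy = sequential_decide(1.5, weak, theta_a, theta_b)   # (1, 1)
hard = sequential_decide(-0.5, weak, theta_a, theta_b)  # (-1, 3)
print(easy, hard)
```

In this toy run the clear-cut input is decided after a single weak classifier, while the ambiguous one needs all three. The average number of evaluations per input is what determines the decision time of such a classifier.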
Cite this article
Šochman, J., Matas, J. Learning Fast Emulators of Binary Decision Processes. Int J Comput Vis 83, 149–163 (2009). https://doi.org/10.1007/s11263-009-0229-x
DOI: https://doi.org/10.1007/s11263-009-0229-x