Meeting State Recognition from Visual and Aural Labels

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4892)

Abstract

In this paper we present a meeting state recognizer that combines multi-modal sensor data in a smart room. Our approach trains a statistical model on semantic cues generated by perceptual components, which produce these cues by processing the output of one or more sensors. The recognizer is designed to work with an arbitrary combination of multi-modal input sensors. We have defined a set of states representing both meeting and non-meeting situations, and a set of features on which we base our classification. This lets us model situations such as a presentation or a break, which are important information for many applications. Because appropriate multi-modal corpora are currently very scarce, we hand-annotated a set of meeting recordings to verify our statistical classification. We also tried several statistical classification methods to obtain the best classification, validating the results on the hand-annotated corpus of real meeting data.
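
As a rough illustration of the idea in the abstract (classifying a meeting state from semantic cues), the following minimal sketch trains a categorical naive Bayes model on hand-labelled cue vectors. It is not the authors' implementation: the abstract does not say which classifiers or features were used, so the choice of naive Bayes, the cue names (`people_seated`, `speaker`), the cue values, the `discussion` state, and the tiny training set are all assumptions made up for this example. Only the `presentation` and `break` states are named in the abstract.

```python
import math
from collections import Counter, defaultdict


class CueNaiveBayes:
    """Categorical naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, samples, labels):
        self.label_counts = Counter(labels)
        # cue_counts[label][cue_name][cue_value] -> number of occurrences
        self.cue_counts = defaultdict(lambda: defaultdict(Counter))
        # cue_values[cue_name] -> set of values seen during training
        self.cue_values = defaultdict(set)
        for cues, label in zip(samples, labels):
            for name, value in cues.items():
                self.cue_counts[label][name][value] += 1
                self.cue_values[name].add(value)
        return self

    def predict(self, cues):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior of the state plus log likelihood of each cue.
            score = math.log(count / total)
            for name, value in cues.items():
                seen = self.cue_counts[label][name][value]
                n_values = len(self.cue_values[name]) or 1
                score += math.log((seen + 1) / (count + n_values))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical hand-annotated cue vectors; names and values are invented.
train = [
    ({"people_seated": "many", "speaker": "one_at_whiteboard"}, "presentation"),
    ({"people_seated": "many", "speaker": "one_at_whiteboard"}, "presentation"),
    ({"people_seated": "many", "speaker": "several"}, "discussion"),
    ({"people_seated": "many", "speaker": "several"}, "discussion"),
    ({"people_seated": "few", "speaker": "none"}, "break"),
    ({"people_seated": "few", "speaker": "several"}, "break"),
]

model = CueNaiveBayes().fit([cues for cues, _ in train],
                            [label for _, label in train])

print(model.predict({"people_seated": "many", "speaker": "one_at_whiteboard"}))
# prints "presentation"
print(model.predict({"people_seated": "few", "speaker": "none"}))
# prints "break"
```

Because each cue is just a named categorical value, a model like this can accept whatever subset of cues the available sensors provide, which loosely mirrors the stated goal of working with an arbitrary combination of input sensors.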

Editor information

Andrei Popescu-Belis, Steve Renals, Hervé Bourlard

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cuřín, J., Fleury, P., Kleindienst, J., Kessl, R. (2008). Meeting State Recognition from Visual and Aural Labels. In: Popescu-Belis, A., Renals, S., Bourlard, H. (eds) Machine Learning for Multimodal Interaction. MLMI 2007. Lecture Notes in Computer Science, vol 4892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78155-4_3

  • DOI: https://doi.org/10.1007/978-3-540-78155-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78154-7

  • Online ISBN: 978-3-540-78155-4

  • eBook Packages: Computer Science, Computer Science (R0)
