Abstract
This paper presents RRVE, a multimodal reference resolution approach for virtual environments. Based on the relationship between cognitive status and reference, RRVE divides objects into four status levels: pointing, in focus, activated, and extinct, and performs multimodal reference resolution step by step according to the current status level. It also defines a match function that computes the match probability between a referring expression and a potential referent, subject to semantic and temporal constraints. Finally, sense shapes are used to resolve pointing ambiguity, helping the user interact precisely in an immersive virtual environment.
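The resolution scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the status ordering follows the abstract, but the `Candidate` fields, the multiplicative form of `match`, and the `threshold` parameter are assumptions introduced here for concreteness.

```python
from dataclasses import dataclass
from typing import List, Optional

# Status levels ordered from most to least salient, as in the abstract.
STATUS_ORDER = ["pointing", "in focus", "activated", "extinct"]

@dataclass
class Candidate:
    name: str
    status: str            # one of STATUS_ORDER
    semantic_score: float  # fit with the expression's semantics, in [0, 1]
    temporal_score: float  # temporal closeness to the referring act, in [0, 1]

def match(c: Candidate) -> float:
    # Hypothetical match function: combine the semantic and temporal
    # constraints into a single match probability.
    return c.semantic_score * c.temporal_score

def resolve(candidates: List[Candidate],
            threshold: float = 0.5) -> Optional[Candidate]:
    # Walk the status hierarchy from most to least salient; within the
    # first level that yields a match above threshold, pick the best.
    for status in STATUS_ORDER:
        level = [(match(c), c) for c in candidates if c.status == status]
        level = [(p, c) for p, c in level if p >= threshold]
        if level:
            return max(level, key=lambda pc: pc[0])[1]
    return None

# A pointed-at object outranks a merely "activated" one, even when the
# activated object scores higher on the match function itself.
objs = [
    Candidate("red cube", "activated", 0.9, 0.9),
    Candidate("blue sphere", "pointing", 0.8, 0.9),
]
print(resolve(objs).name)  # "blue sphere"
```

The key design point is that the hierarchy is consulted level by level: a lower-salience candidate is only considered once every higher level has failed to produce an acceptable match.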
This work was supported by the National Natural Science Foundation of China (60503066), the Advancing Research Foundation (51*0305*05), the MOE Program for New Century Excellent Talents in University, the Virtual Environment Aided Design and Manufacture (VEADAM) project, the National 863 Program of China, the China Next Generation Internet (CNGI) Project (CNGI-04-15-7A), and the Beijing Program for New Stars in Research (2004A11).
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Chen, X., Xu, N. (2006). A Multimodal Reference Resolution Approach in Virtual Environment. In: Zha, H., Pan, Z., Thwaites, H., Addison, A.C., Forte, M. (eds) Interactive Technologies and Sociotechnical Systems. VSMM 2006. Lecture Notes in Computer Science, vol 4270. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11890881_3
DOI: https://doi.org/10.1007/11890881_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46304-7
Online ISBN: 978-3-540-46305-4
eBook Packages: Computer Science (R0)