Daniel Garcia-Romero
2020 – today
- 2024
- [j6] Jaesung Huh, Joon Son Chung, Arsha Nagrani, Andrew Brown, Jee-weon Jung, Daniel Garcia-Romero, Andrew Zisserman:
  The VoxCeleb Speaker Recognition Challenge: A Retrospective. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3850-3866 (2024)
- [c64] Raghuveer Peri, Sai Muralidhar Jayanthi, Srikanth Ronanki, Anshu Bhatia, Karel Mundnich, Saket Dingliwal, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Srikanth Vishnubhotla, Daniel Garcia-Romero, Sundararajan Srinivasan, Kyu J. Han, Katrin Kirchhoff:
  SpeechGuard: Exploring the Adversarial Robustness of Multi-modal Large Language Models. ACL (Findings) 2024: 10018-10035
- [i7] Raghuveer Peri, Sai Muralidhar Jayanthi, Srikanth Ronanki, Anshu Bhatia, Karel Mundnich, Saket Dingliwal, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Srikanth Vishnubhotla, Daniel Garcia-Romero, Sundararajan Srinivasan, Kyu J. Han, Katrin Kirchhoff:
  SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models. CoRR abs/2405.08317 (2024)
- [i6] Jaesung Huh, Joon Son Chung, Arsha Nagrani, Andrew Brown, Jee-weon Jung, Daniel Garcia-Romero, Andrew Zisserman:
  The VoxCeleb Speaker Recognition Challenge: A Retrospective. CoRR abs/2408.14886 (2024)
- 2023
- [i5] Jaesung Huh, Andrew Brown, Jee-weon Jung, Joon Son Chung, Arsha Nagrani, Daniel Garcia-Romero, Andrew Zisserman:
  VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge. CoRR abs/2302.10248 (2023)
- [i4] Raghuveer Peri, Seyed Omid Sadjadi, Daniel Garcia-Romero:
  VoxWatch: An open-set speaker recognition benchmark on VoxCeleb. CoRR abs/2307.00169 (2023)
- 2022
- [c63] Rohit Paturi, Sundararajan Srinivasan, Katrin Kirchhoff, Daniel Garcia-Romero:
  Directed speech separation for automatic speech recognition of long form conversational speech. INTERSPEECH 2022: 5388-5392
- 2021
- [c62] Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, Yuekai Zhang:
  Recent Developments on ESPnet Toolkit Boosted by Conformer. ICASSP 2021: 5874-5878
- 2020
- [j5] Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Leibny Paola García-Perera, Fred Richardson, Réda Dehak, Pedro A. Torres-Carrasquillo, Najim Dehak:
  State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations. Comput. Speech Lang. 60 (2020)
- [c61] Daniel Garcia-Romero, Alan McCree, David Snyder, Gregory Sell:
  JHU-HLTCOE System for the VoxSRC Speaker Recognition Challenge. ICASSP 2020: 7559-7563
- [c60] Daniel Garcia-Romero, Gregory Sell, Alan McCree:
  MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition. Odyssey 2020: 1-8
- [c59] Jesús Antonio Villalba López, Daniel Garcia-Romero, Nanxin Chen, Gregory Sell, Jonas Borgstrom, Alan McCree, Leibny Paola García-Perera, Saurabh Kataria, Phani Sankar Nidadavolu, Pedro Torres-Carrasquillo, Najim Dehak:
  Advances in Speaker Recognition for Telephone and Audio-Visual Data: the JHU-MIT Submission for NIST SRE19. Odyssey 2020: 273-280
- [i3] Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, Yuekai Zhang:
  Recent Developments on ESPnet Toolkit Boosted by Conformer. CoRR abs/2010.13956 (2020)
2010 – 2019
- 2019
- [c58] David Snyder, Daniel Garcia-Romero, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur:
  Speaker Recognition for Multi-speaker Conversations Using X-vectors. ICASSP 2019: 5796-5800
- [c57] Gregory Sell, David Etter, Daniel Garcia-Romero, Alan McCree:
  Script Identification using Across- and Within-Image Distribution Estimation. ICDAR 2019: 1084-1089
- [c56] Alan McCree, Gregory Sell, Daniel Garcia-Romero:
  Speaker Diarization Using Leave-One-Out Gaussian PLDA Clustering of DNN Embeddings. INTERSPEECH 2019: 381-385
- [c55] Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Fred Richardson, Suwon Shon, François Grondin, Réda Dehak, Leibny Paola García-Perera, Daniel Povey, Pedro A. Torres-Carrasquillo, Sanjeev Khudanpur, Najim Dehak:
  State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18. INTERSPEECH 2019: 1488-1492
- [c54] Daniel Garcia-Romero, David Snyder, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur:
  x-Vector DNN Refinement with Full-Length Recordings for Speaker Recognition. INTERSPEECH 2019: 1493-1496
- [c53] Daniel Garcia-Romero, David Snyder, Shinji Watanabe, Gregory Sell, Alan McCree, Daniel Povey, Sanjeev Khudanpur:
  Speaker Recognition Benchmark Using the CHiME-5 Corpus. INTERSPEECH 2019: 1506-1510
- 2018
- [c52] Gregory Sell, Kevin Duh, David Snyder, Dave Etter, Daniel Garcia-Romero:
  Audio-Visual Person Recognition in Multimedia Data From the IARPA Janus Program. ICASSP 2018: 3031-3035
- [c51] David Snyder, Daniel Garcia-Romero, Gregory Sell, Daniel Povey, Sanjeev Khudanpur:
  X-Vectors: Robust DNN Embeddings for Speaker Recognition. ICASSP 2018: 5329-5333
- [c50] Anna Silnova, Niko Brümmer, Daniel Garcia-Romero, David Snyder, Lukás Burget:
  Fast Variational Bayes for Heavy-tailed PLDA Applied to i-vectors and x-vectors. INTERSPEECH 2018: 72-76
- [c49] Gregory Sell, David Snyder, Alan McCree, Daniel Garcia-Romero, Jesús Villalba, Matthew Maciejewski, Vimal Manohar, Najim Dehak, Daniel Povey, Shinji Watanabe, Sanjeev Khudanpur:
  Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge. INTERSPEECH 2018: 2808-2812
- [c48] Alan McCree, David Snyder, Gregory Sell, Daniel Garcia-Romero:
  Language Recognition for Telephone and Video Speech: The JHU HLTCOE Submission for NIST LRE17. Odyssey 2018: 68-73
- [c47] David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Daniel Povey, Sanjeev Khudanpur:
  Spoken Language Recognition using X-vectors. Odyssey 2018: 105-111
- [i2] Anna Silnova, Niko Brummer, Daniel Garcia-Romero, David Snyder, Lukás Burget:
  Fast variational Bayes for heavy-tailed PLDA applied to i-vectors and x-vectors. CoRR abs/1803.09153 (2018)
- 2017
- [c46] Daniel Garcia-Romero, David Snyder, Gregory Sell, Daniel Povey, Alan McCree:
  Speaker diarization using deep neural network embeddings. ICASSP 2017: 4930-4934
- [c45] David Snyder, Daniel Garcia-Romero, Daniel Povey, Sanjeev Khudanpur:
  Deep Neural Network Embeddings for Text-Independent Speaker Verification. INTERSPEECH 2017: 999-1003
- [c44] Alan McCree, Gregory Sell, Daniel Garcia-Romero:
  Extended Variability Modeling and Unsupervised Adaptation for PLDA Speaker Recognition. INTERSPEECH 2017: 1552-1556
- 2016
- [c43] Gregory Sell, Alan McCree, Daniel Garcia-Romero:
  Priors for Speaker Counting and Diarization with AHC. INTERSPEECH 2016: 2194-2198
- [c42] Daniel Garcia-Romero, Alan McCree:
  Stacked Long-Term TDNN for Spoken Language Recognition. INTERSPEECH 2016: 3226-3230
- [c41] Alan McCree, Gregory Sell, Daniel Garcia-Romero:
  Augmented Data Training of Joint Acoustic/Phonotactic DNN i-vectors for NIST LRE15. Odyssey 2016: 204-209
- [c40] Audrey Tong, Craig S. Greenberg, Alvin F. Martin, Désiré Bansé, John M. Howard, Hui Zhao, George R. Doddington, Daniel Garcia-Romero, Alan McCree, Douglas A. Reynolds, Elliot Singer, Jaime Hernandez-Cordero, Lisa P. Mason:
  Summary of the 2015 NIST Language Recognition i-Vector Machine Learning Challenge. Odyssey 2016: 297-302
- [c39] David Snyder, Pegah Ghahremani, Daniel Povey, Daniel Garcia-Romero, Yishay Carmiel, Sanjeev Khudanpur:
  Deep neural network-based speaker embeddings for end-to-end speaker verification. SLT 2016: 165-170
- 2015
- [c38] David Snyder, Daniel Garcia-Romero, Daniel Povey:
  Time delay deep neural network-based universal background models for speaker recognition. ASRU 2015: 92-97
- [c37] Chandler May, Francis Ferraro, Alan McCree, Jonathan Wintrode, Daniel Garcia-Romero, Benjamin Van Durme:
  Topic Identification and Discovery on Text and Speech. EMNLP 2015: 2377-2387
- [c36] Gregory Sell, Daniel Garcia-Romero:
  Diarization resegmentation in the factor analysis subspace. ICASSP 2015: 4794-4798
- [c35] Jonathan Wintrode, Gregory Sell, Aren Jansen, Michelle Fox, Daniel Garcia-Romero, Alan McCree:
  Content-based recommender systems for spoken documents. ICASSP 2015: 5201-5205
- [c34] Alan McCree, Daniel Garcia-Romero:
  DNN senone MAP multinomial i-vectors for phonotactic language recognition. INTERSPEECH 2015: 394-397
- [c33] Daniel Garcia-Romero, Alan McCree:
  Insights into deep neural networks for speaker recognition. INTERSPEECH 2015: 1141-1145
- [c32] Désiré Bansé, George R. Doddington, Daniel Garcia-Romero, John J. Godfrey, Craig S. Greenberg, Jaime Hernandez-Cordero, John M. Howard, Alvin F. Martin, Lisa P. Mason, Alan McCree, Douglas A. Reynolds:
  Analysis of the second phase of the 2013-2014 i-vector machine learning challenge. INTERSPEECH 2015: 3041-3045
- [c31] Gregory Sell, Daniel Garcia-Romero, Alan McCree:
  Speaker diarization with i-vectors from DNN senone posteriors. INTERSPEECH 2015: 3096-3099
- 2014
- [c30] Aren Jansen, Daniel Garcia-Romero, Pascal Clark, Jaime Hernandez-Cordero:
  Unsupervised idiolect discovery for speaker recognition. ICASSP 2014: 1675-1679
- [c29] Niko Brümmer, Daniel Garcia-Romero:
  Generative modelling for unsupervised score calibration. ICASSP 2014: 1680-1684
- [c28] Daniel Garcia-Romero, Alan McCree:
  Supervised domain adaptation for I-vector based speaker recognition. ICASSP 2014: 4047-4051
- [c27] Désiré Bansé, George R. Doddington, Daniel Garcia-Romero, John J. Godfrey, Craig S. Greenberg, Alvin F. Martin, Alan McCree, Mark A. Przybocki, Douglas A. Reynolds:
  Summary and initial results of the 2013-2014 speaker recognition i-vector machine learning challenge. INTERSPEECH 2014: 368-372
- [c26] Alan McCree, Douglas A. Reynolds, Daniel Garcia-Romero, Tomi Kinnunen, Craig S. Greenberg, Désiré Bansé, George R. Doddington, John J. Godfrey, Alvin F. Martin, Mark A. Przybocki:
  The NIST 2014 Speaker Recognition i-vector Machine Learning Challenge. Odyssey 2014: 224-230
- [c25] Niko Brummer, Alan McCree, Stephen Shum, Daniel Garcia-Romero, Carlos Vaquero:
  Unsupervised Domain Adaptation for I-Vector Speaker Recognition. Odyssey 2014: 260-264
- [c24] Alan McCree, Stephen Shum, Douglas A. Reynolds, Daniel Garcia-Romero:
  Unsupervised Clustering Approaches for Domain Adaptation in Speaker Recognition Systems. Odyssey 2014: 265-272
- [c23] Daniel Garcia-Romero, Xiaohui Zhang, Alan McCree, Daniel Povey:
  Improving speaker recognition performance in the domain adaptation challenge using deep neural networks. SLT 2014: 378-383
- [c22] Gregory Sell, Daniel Garcia-Romero:
  Speaker diarization with PLDA i-vector scoring and unsupervised calibration. SLT 2014: 413-417
- 2013
- [j4] Balaji Vasan Srinivasan, Yuancheng Luo, Daniel Garcia-Romero, Dmitry N. Zotkin, Ramani Duraiswami:
  A Symmetric Kernel Partial Least Squares Framework for Speaker Recognition. IEEE Trans. Speech Audio Process. 21(7): 1415-1423 (2013)
- [c21] Daniel Garcia-Romero, Alan McCree:
  Subspace-constrained supervector PLDA for speaker verification. INTERSPEECH 2013: 2479-2483
- [i1] Niko Brümmer, Daniel Garcia-Romero:
  Generative Modelling for Unsupervised Score Calibration. CoRR abs/1311.0707 (2013)
- 2012
- [b1] Daniel Garcia-Romero:
  Robust Speaker Recognition Based on Latent Variable Models. University of Maryland, College Park, MD, USA, 2012
- [c20] Daniel Garcia-Romero, Xinhui Zhou, Dmitry N. Zotkin, Balaji Vasan Srinivasan, Yuancheng Luo, Sriram Ganapathy, Samuel Thomas, Sridhar Krishna Nemala, Garimella S. V. S. Sivaram, Majid Mirbagheri, Sri Harish Reddy Mallidi, Thomas Janu, Padmanabhan Rajan, Nima Mesgarani, Mounya Elhilali, Hynek Hermansky, Shihab A. Shamma, Ramani Duraiswami:
  The UMD-JHU 2011 speaker recognition system. ICASSP 2012: 4229-4232
- [c19] Daniel Garcia-Romero, Xinhui Zhou, Carol Y. Espy-Wilson:
  Multicondition training of Gaussian PLDA models in i-vector space for noise and reverberation robust speaker recognition. ICASSP 2012: 4257-4260
- [c18] Xinhui Zhou, Daniel Garcia-Romero, Nima Mesgarani, Maureen L. Stone, Carol Y. Espy-Wilson, Shihab A. Shamma:
  Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations. INTERSPEECH 2012: 542-545
- 2011
- [c17] Xinhui Zhou, Daniel Garcia-Romero, Ramani Duraiswami, Carol Y. Espy-Wilson, Shihab A. Shamma:
  Linear versus mel frequency cepstral coefficients for speaker recognition. ASRU 2011: 559-564
- [c16] Daniel Garcia-Romero, Carol Y. Espy-Wilson:
  Analysis of i-vector Length Normalization in Speaker Recognition Systems. INTERSPEECH 2011: 249-252
- [c15] Balaji Vasan Srinivasan, Daniel Garcia-Romero, Dmitry N. Zotkin, Ramani Duraiswami:
  Kernel Partial Least Squares for Speaker Recognition. INTERSPEECH 2011: 493-496
- [c14] Jingting Zhou, Daniel Garcia-Romero, Carol Y. Espy-Wilson:
  Automatic Speech Codec Identification with Applications to Tampering Detection of Speech Recordings. INTERSPEECH 2011: 2533-2536
- 2010
- [c13] Daniel Garcia-Romero, Carol Y. Espy-Wilson:
  Automatic acquisition device identification from speech recordings. ICASSP 2010: 1806-1809
- [c12] Daniel Garcia-Romero, Carol Y. Espy-Wilson:
  Joint Factor Analysis for Speaker Recognition Reinterpreted as Signal Coding Using Overcomplete Dictionaries. Odyssey 2010: 22
2000 – 2009
- 2008
- [c11] Vikramjit Mitra, Daniel Garcia-Romero, Carol Y. Espy-Wilson:
  Language detection in audio content analysis. ICASSP 2008: 2109-2112
- [c10] Daniel Garcia-Romero, Carol Y. Espy-Wilson:
  Intersession variability in speaker recognition: a behind the scene analysis. INTERSPEECH 2008: 1413-1416
- [c9] Vikramjit Mitra, Daniel Garcia-Romero, Carol Y. Espy-Wilson:
  Language and genre detection in audio content analysis. INTERSPEECH 2008: 2506-2509
- 2006
- [j3] Daniel Garcia-Romero, Julian Fiérrez-Aguilar, Joaquin Gonzalez-Rodriguez, Javier Ortega-Garcia:
  Using quality measures for multilevel speaker recognition. Comput. Speech Lang. 20(2-3): 192-209 (2006)
- 2005
- [j2] Julian Fiérrez-Aguilar, Daniel Garcia-Romero, Javier Ortega-Garcia, Joaquin Gonzalez-Rodriguez:
  Bayesian adaptation for user-dependent multimodal biometric authentication. Pattern Recognit. 38(8): 1317-1319 (2005)
- [j1] Julian Fiérrez-Aguilar, Daniel Garcia-Romero, Javier Ortega-Garcia, Joaquin Gonzalez-Rodriguez:
  Adapted user-dependent multimodal biometric authentication exploiting general information. Pattern Recognit. Lett. 26(16): 2628-2639 (2005)
- [c8] Julian Fiérrez-Aguilar, Daniel Garcia-Romero, Javier Ortega-Garcia, Joaquin Gonzalez-Rodriguez:
  Speaker Verification Using Adapted User-Dependent Multilevel Fusion. Multiple Classifier Systems 2005: 356-365
- 2004
- [c7] Julian Fiérrez-Aguilar, Daniel Garcia-Romero, Javier Ortega-Garcia, Joaquín González-Rodríguez:
  Exploiting general knowledge in user-dependent fusion strategies for multimodal biometric verification. ICASSP (5) 2004: 617-620
- [c6] Daniel Garcia-Romero, Julian Fiérrez-Aguilar, Joaquín González-Rodríguez, Javier Ortega-Garcia:
  On the use of quality measures for text-independent speaker recognition. Odyssey 2004: 105-110
- 2003
- [c5] Daniel Garcia-Romero, Joaquin Gonzalez-Rodriguez, Julian Fiérrez-Aguilar, Javier Ortega-Garcia:
  U-NORM Likelihood Normalization in PIN-Based Speaker Verification Systems. AVBPA 2003: 208-213
- [c4] Julian Fiérrez-Aguilar, Javier Ortega-Garcia, Daniel Garcia-Romero, Joaquin Gonzalez-Rodriguez:
  A Comparative Evaluation of Fusion Strategies for Multimodal Biometric Verification. AVBPA 2003: 830-837
- [c3] Daniel Garcia-Romero, Julian Fiérrez-Aguilar, Joaquín González-Rodríguez, Javier Ortega-Garcia:
  Support vector machine fusion of idiolectal and acoustic speaker information in Spanish conversational speech. ICASSP (2) 2003: 229-232
- [c2] Daniel Garcia-Romero, Julian Fiérrez-Aguilar, Joaquin Gonzalez-Rodriguez, Javier Ortega-Garcia:
  Support vector machine fusion of idiolectal and acoustic speaker information in Spanish conversational speech. ICME 2003: 205-208
- [c1] Joaquin Gonzalez-Rodriguez, Daniel Garcia-Romero, Marta Garcia-Gomar, Daniel Ramos, Javier Ortega-Garcia:
  Robust likelihood ratio estimation in Bayesian forensic speaker recognition. INTERSPEECH 2003: 693-696
last updated on 2024-09-30 00:08 CEST by the dblp team
all metadata released as open data under CC0 1.0 license