{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,7]],"date-time":"2024-09-07T10:21:59Z","timestamp":1725704519658},"publisher-location":"New York, NY, USA","reference-count":33,"publisher":"ACM","funder":[{"name":"European Union","award":["101021797"]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,12,13]]},"DOI":"10.1145\/3551626.3564962","type":"proceedings-article","created":{"date-parts":[[2022,12,7]],"date-time":"2022-12-07T00:55:45Z","timestamp":1670374545000},"page":"1-5","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Wider or Deeper Neural Network Architecture for Acoustic Scene Classification with Mismatched Recording Devices"],"prefix":"10.1145","author":[{"given":"Lam","family":"Pham","sequence":"first","affiliation":[{"name":"Austrian Institute of Technology, Austria"}]},{"given":"Khoa","family":"Tran","sequence":"additional","affiliation":[{"name":"Da Nang University, Viet Nam"}]},{"given":"Dat","family":"Ngo","sequence":"additional","affiliation":[{"name":"University of Essex, UK"}]},{"given":"Hieu","family":"Tang","sequence":"additional","affiliation":[{"name":"FPT University, Viet Nam"}]},{"given":"Son","family":"Phan","sequence":"additional","affiliation":[{"name":"Amanotes Company, Viet Nam"}]},{"given":"Alexander","family":"Schindler","sequence":"additional","affiliation":[{"name":"Austrian Institute of Technology, Austria"}]}],"member":"320","published-online":{"date-parts":[[2022,12,13]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Fran\u00e7ois Chollet et al. 2015. Keras. https:\/\/keras.io. Fran\u00e7ois Chollet et al. 2015. Keras. https:\/\/keras.io."},{"key":"e_1_3_2_1_2_1","unstructured":"D. P. W. Ellis. 2009. Gammatone-like spectrogram. http:\/\/www.ee.columbia.edu\/dpwe\/resources\/matlab\/gammatonegram. D. P. W. Ellis. 2009. Gammatone-like spectrogram. http:\/\/www.ee.columbia.edu\/dpwe\/resources\/matlab\/gammatonegram."},{"key":"e_1_3_2_1_3_1","unstructured":"Detection and Classification of Acoustic Scenes and Events. 2018. (DCASE). https:\/\/dcase.community\/challenge2018. Detection and Classification of Acoustic Scenes and Events. 2018. (DCASE). https:\/\/dcase.community\/challenge2018."},{"key":"e_1_3_2_1_4_1","unstructured":"Detection and Classification of Acoustic Scenes and Events. 2020. (DCASE Task 1A Results). https:\/\/dcase.community\/challenge2020\/task-acoustic-scene-classification-results-a. Detection and Classification of Acoustic Scenes and Events. 2020. (DCASE Task 1A Results). https:\/\/dcase.community\/challenge2020\/task-acoustic-scene-classification-results-a."},{"key":"e_1_3_2_1_6_1","volume-title":"Proc. DCASE. 56--60","author":"Heittola Toni","year":"2020","unstructured":"Toni Heittola , Annamaria Mesaros , and Tuomas Virtanen . 2020 . Acoustic scene classification in DCASE 2020 challenge: generalization across devices and low complexity solutions . In Proc. DCASE. 56--60 . Toni Heittola, Annamaria Mesaros, and Tuomas Virtanen. 2020. Acoustic scene classification in DCASE 2020 challenge: generalization across devices and low complexity solutions. In Proc. DCASE. 56--60."},{"key":"e_1_3_2_1_7_1","volume-title":"Proceedings of the 32nd International Conference on Machine Learning. 448--456","author":"Ioffe Sergey","year":"2015","unstructured":"Sergey Ioffe and Christian Szegedy . 2015 . 
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift . In Proceedings of the 32nd International Conference on Machine Learning. 448--456 . Sergey Ioffe and Christian Szegedy. 2015. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning. 448--456."},{"key":"e_1_3_2_1_9_1","unstructured":"Yong-Deok Kim Eunhyeok Park Sungjoo Yoo Taelim Choi Lu Yang and Dongjun Shin. 2016. Compression of deep convolutional neural networks for fast and low power mobile applications. In ICLR. Yong-Deok Kim Eunhyeok Park Sungjoo Yoo Taelim Choi Lu Yang and Dongjun Shin. 2016. Compression of deep convolutional neural networks for fast and low power mobile applications. In ICLR."},{"key":"e_1_3_2_1_10_1","volume-title":"Kingma and Jimmy Ba","author":"Diederik","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba . 2015 . Adam : A Method for Stochastic Optimization. CoRR abs\/1412.6980 (2015). Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. CoRR abs\/1412.6980 (2015)."},{"key":"e_1_3_2_1_11_1","volume-title":"Proc. DCASE. 86--90","author":"Koutini Khaled","year":"2020","unstructured":"Khaled Koutini , Florian Henkel , Hamid Eghbal-zadeh, and Gerhard Widmer . 2020 . Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping . In Proc. DCASE. 86--90 . Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, and Gerhard Widmer. 2020. Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping. In Proc. DCASE. 86--90."},{"key":"e_1_3_2_1_12_1","volume-title":"Proc. DCASE. 86--90","author":"Koutini Khaled","year":"2020","unstructured":"Khaled Koutini , Florian Henkel , Hamid Eghbal-zadeh, and Gerhard Widmer . 2020 . Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping . In Proc. DCASE. 86--90 . Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, and Gerhard Widmer. 2020. Low-complexity models for acoustic scene classification based on receptive field regularization and frequency damping. In Proc. DCASE. 86--90."},{"key":"e_1_3_2_1_13_1","volume-title":"On information and sufficiency. The annals of mathematical statistics 22, 1","author":"Kullback Solomon","year":"1951","unstructured":"Solomon Kullback and Richard A Leibler . 1951. On information and sufficiency. The annals of mathematical statistics 22, 1 ( 1951 ), 79--86. Solomon Kullback and Richard A Leibler. 1951. On information and sufficiency. The annals of mathematical statistics 22, 1 (1951), 79--86."},{"volume-title":"Human and machine hearing: extracting meaning from sound","author":"Lyon Richard F","key":"e_1_3_2_1_14_1","unstructured":"Richard F Lyon . 2017. Human and machine hearing: extracting meaning from sound . Cambridge University Press . Richard F Lyon. 2017. Human and machine hearing: extracting meaning from sound. Cambridge University Press."},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.25080\/Majora-7b98e3ed-003"},{"key":"e_1_3_2_1_16_1","volume-title":"International Conference on Machine Learning (ICML).","author":"Nair Vinod","year":"2010","unstructured":"Vinod Nair and Geoffrey E Hinton . 2010 . Rectified linear units improve restricted boltzmann machines . In International Conference on Machine Learning (ICML). Vinod Nair and Geoffrey E Hinton. 2010. 
Rectified linear units improve restricted boltzmann machines. In International Conference on Machine Learning (ICML)."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2021-656"},{"key":"e_1_3_2_1_18_1","volume-title":"Sound context classification basing on join learning model and multi-spectrogram features. arXiv preprint arXiv:2005.12779","author":"Ngo Dat","year":"2020","unstructured":"Dat Ngo , Hao Hoang , Anh Nguyen , Tien Ly , and Lam Pham . 2020. Sound context classification basing on join learning model and multi-spectrogram features. arXiv preprint arXiv:2005.12779 ( 2020 ). Dat Ngo, Hao Hoang, Anh Nguyen, Tien Ly, and Lam Pham. 2020. Sound context classification basing on join learning model and multi-spectrogram features. arXiv preprint arXiv:2005.12779 (2020)."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-3002"},{"key":"e_1_3_2_1_20_1","volume-title":"Proc. DCASE. 130--134","author":"Ooi Kenneth","year":"2020","unstructured":"Kenneth Ooi , Santi Peksi , and Woon-Seng Gan . 2020 . Ensemble of pruned low-complexity models for acoustic scene classification . In Proc. DCASE. 130--134 . Kenneth Ooi, Santi Peksi, and Woon-Seng Gan. 2020. Ensemble of pruned low-complexity models for acoustic scene classification. In Proc. DCASE. 130--134."},{"key":"e_1_3_2_1_21_1","volume-title":"Specaugment: A simple data augmentation method for automatic speech recognition. arXiv preprint arXiv:1904.08779","author":"Park Daniel S","year":"2019","unstructured":"Daniel S Park , William Chan , Yu Zhang , Chung-Cheng Chiu , Barret Zoph , Ekin D Cubuk , and Quoc V Le . 2019 . Specaugment: A simple data augmentation method for automatic speech recognition. arXiv preprint arXiv:1904.08779 (2019). Daniel S Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D Cubuk, and Quoc V Le. 2019. Specaugment: A simple data augmentation method for automatic speech recognition. arXiv preprint arXiv:1904.08779 (2019)."},{"key":"e_1_3_2_1_22_1","volume-title":"Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification. arXiv preprint arXiv:2107.09268","author":"Pham Lam","year":"2021","unstructured":"Lam Pham . 2021. Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification. arXiv preprint arXiv:2107.09268 ( 2021 ). Lam Pham. 2021. Robust Deep Learning Frameworks for Acoustic Scene and Respiratory Sound Classification. arXiv preprint arXiv:2107.09268 (2021)."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-1841"},{"key":"e_1_3_2_1_25_1","volume-title":"Bag-of-Features Models Based on C-DNN Network for Acoustic Scene Classification. In In Proc. AES.","author":"Pham Lam","year":"2019","unstructured":"Lam Pham , Ian McLoughlin , Huy Phan , Ramaswamy Palaniappan , and Yue Lang . 2019 . Bag-of-Features Models Based on C-DNN Network for Acoustic Scene Classification. In In Proc. AES. Lam Pham, Ian McLoughlin, Huy Phan, Ramaswamy Palaniappan, and Yue Lang. 2019. Bag-of-Features Models Based on C-DNN Network for Acoustic Scene Classification. In In Proc. AES."},{"key":"e_1_3_2_1_26_1","volume-title":"Proc. CBMI. 23--28","author":"Pham Lam","year":"2021","unstructured":"Lam Pham , Dat Ngo , Phu X Nguyen , Truong Hoang , and Alexander Schindler . 2021 . An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification . In Proc. CBMI. 23--28 . Lam Pham, Dat Ngo, Phu X Nguyen, Truong Hoang, and Alexander Schindler. 2021. 
An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification. In Proc. CBMI. 23--28."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.dsp.2020.102943"},{"key":"e_1_3_2_1_28_1","volume-title":"Oliver Y. Ch\u00e9n, Lam Pham, Philipp Koch, Ian McLoughlin, and Alfred Mertins.","author":"Phan Huy","year":"2021","unstructured":"Huy Phan , Huy Le Nguyen , Oliver Y. Ch\u00e9n, Lam Pham, Philipp Koch, Ian McLoughlin, and Alfred Mertins. 2021 . Multi-View Audio And Music Classification. In ICASSP. 611--615. Huy Phan, Huy Le Nguyen, Oliver Y. Ch\u00e9n, Lam Pham, Philipp Koch, Ian McLoughlin, and Alfred Mertins. 2021. Multi-View Audio And Music Classification. In ICASSP. 611--615."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8683288"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2733373.2806390"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2375575"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-2231"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2670313"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2019.2935128"},{"key":"e_1_3_2_1_36_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Tokozume Yuji","year":"2018","unstructured":"Yuji Tokozume , Yoshitaka Ushiku , and Tatsuya Harada . 2018 . Learning from between-class examples for deep sound recognition . In International Conference on Learning Representations (ICLR). Yuji Tokozume, Yoshitaka Ushiku, and Tatsuya Harada. 2018. Learning from between-class examples for deep sound recognition. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-00764-5_2"}],"event":{"name":"MMAsia '22: ACM Multimedia Asia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Tokyo Japan","acronym":"MMAsia '22"},"container-title":["Proceedings of the 4th ACM International Conference on Multimedia in Asia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3551626.3564962","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,13]],"date-time":"2023-12-13T11:10:38Z","timestamp":1702465838000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3551626.3564962"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,13]]},"references-count":33,"alternative-id":["10.1145\/3551626.3564962","10.1145\/3551626"],"URL":"https:\/\/doi.org\/10.1145\/3551626.3564962","relation":{},"subject":[],"published":{"date-parts":[[2022,12,13]]},"assertion":[{"value":"2022-12-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
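
The record above is a Crossref REST API work object. As a minimal sketch (illustrative only, not part of the deposited metadata), the Python snippet below shows one way to re-fetch this record from the public endpoint https://api.crossref.org/works/{DOI} and print a plain-text citation from the same fields shown above ("author", "issued", "title", "container-title"). The helper name fetch_work and the citation formatting are assumptions for this example; it also assumes network access to api.crossref.org.

# Minimal sketch: fetch the Crossref work record for the DOI above and
# format a plain-text citation from its metadata fields.
import json
import urllib.request

DOI = "10.1145/3551626.3564962"  # DOI taken from the record above

def fetch_work(doi: str) -> dict:
    """Return the 'message' object of a Crossref work record (hypothetical helper)."""
    url = f"https://api.crossref.org/works/{doi}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["message"]

if __name__ == "__main__":
    work = fetch_work(DOI)
    # Join author names in the order they appear in the record.
    authors = ", ".join(
        f"{a.get('given', '')} {a.get('family', '')}".strip()
        for a in work.get("author", [])
    )
    year = work["issued"]["date-parts"][0][0]
    title = work["title"][0]
    venue = (work.get("container-title") or [""])[0]
    print(f"{authors}. {year}. {title}. In {venue}.")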