| 1 | -- | 0 | Jakub Janský, Zbynek Koldovský, Jirí Málek, Tomás Kounovský, Jaroslav Cmejla. Auxiliary function-based algorithm for blind extraction of a moving speaker |
| 2 | -- | 0 | Siqing Qin, Longbiao Wang, Sheng Li, Jianwu Dang, Lixin Pan. Improving low-resource Tibetan end-to-end ASR by multilingual and multilevel unit modeling |
| 3 | -- | 0 | Slawomir K. Zielinski, Pawel Antoniuk, Hyunkook Lee, Dale Johnson. Automatic discrimination between front and back ensemble locations in HRTF-convolved binaural recordings of music |
| 4 | -- | 0 | Wen-Hsing Lai, Siou-Lin Wang. RPCA-DRNN technique for monaural singing voice separation |
| 5 | -- | 0 | Haitao Li, Shuguo Yang, Wenwu Wang. Improved capsule routing for weakly labeled sound event detection |
| 6 | -- | 0 | Itay Ifergan, Boaz Rafaely. On the selection of the number of beamformers in beamforming-based binaural reproduction |
| 7 | -- | 0 | Xin Guan, Haoyue Zhao, Qiang Li. Estimation of playable piano fingering by pitch-difference fingering match model |
| 8 | -- | 0 | Yanze Xu, Weiqing Wang, Huahua Cui, Mingyang Xu, Ming Li. Paralinguistic singing attribute recognition using supervised machine learning for describing the classical tenor solo singing voice in vocal pedagogy |
| 9 | -- | 0 | Pablo Gutierrez-Parera, José J. López, Javier M. Mora-Merchan, Diego Francisco Larios. Interaural time difference individualization in HRTF by scaling through anthropometric parameters |
| 10 | -- | 0 | Maximo Cobos, Jens Ahrens, Konrad Kowalczyk, Archontis Politis. An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction |
| 11 | -- | 0 | Lekshmi Chandrika Reghunath, Rajeev Rajan. Transformer-based ensemble method for multiple predominant instruments recognition in polyphonic music |
| 12 | -- | 0 | Taiyo Mineo, Hayaru Shouno. Improving sign-algorithm convergence rate using natural gradient for lossless audio compression |
| 13 | -- | 0 | Maximo Cobos, Jens Ahrens, Konrad Kowalczyk, Archontis Politis. Data-based spatial audio processing |
| 14 | -- | 0 | Honey Gupta, Mayank Sharma. Language agnostic missing subtitle detection |
| 15 | -- | 0 | Henning F. Schepker, Florian Denk, Birger Kollmeier, Simon Doclo. Robust single- and multi-loudspeaker least-squares-based equalization for hearing devices |
| 16 | -- | 0 | Alexander Bohlender, Lucas Van Severen, Jonathan Sterckx, Nilesh Madhu. DOA-guided source separation with direction-based initialization and time annotations using complex angular central Gaussian mixture models |
| 17 | -- | 0 | Minghang Ju, Yanyan Xu, Dengfeng Ke, Kaile Su. Masked multi-center angular margin loss for language recognition |
| 18 | -- | 0 | Marco Comunità, Andrea Gerino, Lorenzo Picinali. PlugSonic: a web- and mobile-based platform for dynamic and navigable binaural audio |
| 19 | -- | 0 | Vincent Roger, Jérôme Farinas, Julien Pinquier. Deep neural networks for automatic speech processing: a survey from large corpora to limited data |
| 20 | -- | 0 | Jinxing Gao, Diqun Yan, Mingyu Dong. Black-box adversarial attacks through speech distortion for speech emotion recognition |
| 21 | -- | 0 | Yun-Ning Hung, Chih-Wei Wu, Iroro Orife, Aaron Hipple, William Wolcott, Alexander Lerch. A large TV dataset for speech and music activity detection |
| 22 | -- | 0 | Yang Xiang, Liming Shi, Jesper Lisby Højvang, Morten Højfeldt Rasmussen, Mads Græsbøll Christensen. A speech enhancement algorithm based on a non-negative hidden Markov model and Kullback-Leibler divergence |
| 23 | -- | 0 | Léo Cances, Etienne Labbé, Thomas Pellegrini. Comparison of semi-supervised deep learning algorithms for audio classification |
| 24 | -- | 0 | Ali Parsayan, Seyed Mohammad Ahadi. Correction: N-dimensional N-microphone sound source localization |
| 25 | -- | 0 | Wim Boes, Hugo Van Hamme. Multi-encoder attention-based architectures for sound recognition with partial visual assistance |
| 26 | -- | 0 | Xinhao Mei, Xubo Liu, Mark D. Plumbley, Wenwu Wang. Automated audio captioning: an overview of recent progress and new challenges |
| 27 | -- | 0 | Xiao-lei Zhang, Menglong Xu. AUC optimization for deep learning-based voice activity detection |
| 28 | -- | 0 | Reemt Hinrichs, Kevin Gerkens, Alexander Lange, Jörn Ostermann. Convolutional neural networks for the classification of guitar effects and extraction of the parameter settings of single and multi-guitar effects from instrument mixes |
| 29 | -- | 0 | Chaofeng Lan, Lei Zhang, Yuanyuan Zhang, Lirong Fu, Chao Sun, Yulan Han, Meng Zhang. Attention mechanism combined with residual recurrent neural network for sound event detection and localization |
| 30 | -- | 0 | Milap Rane, Philip Coleman, Russell Mason, Søren Bech. Quantifying headphone listening experience in virtual sound environments using distraction |
| 31 | -- | 0 | Cong Jin, Fengjuan Wu, Jing Wang, Yang Liu, Zixuan Guan, Zhe Han. MetaMGC: a music generation framework for concerts in metaverse |
| 32 | -- | 0 | Xuan Cao, Maoshen Jia, Jiawei Ru, Tun-Wen Pai. Cross-corpus speech emotion recognition using subspace learning and domain adaption |
| 33 | -- | 0 | Francesc Lluís, Vasileios Chatziioannou, Alex Hofmann. Points2Sound: from mono to binaural audio using 3D point cloud scenes |