- Representation Learning for Audio Privacy Preservation Using Source Separation and Robust Adversarial Learning. Diep Luong, Minh Tran, Shayan Gharib, Konstantinos Drossos, Tuomas Virtanen. 1-5
- Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation. Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-qiu Wang, Nobutaka Ono, Yanmin Qian, Shinji Watanabe. 1-5
- SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs. Yinghao Aaron Li, Cong Han, Nima Mesgarani. 1-5
- Signal Reconstruction from Mel-Spectrogram Based on Bi-Level Consistency of Full-Band Magnitude and Phase. Yoshiki Masuyama, Natsuki Ueno, Nobutaka Ono. 1-5
- Mitigating Cross-Database Differences for Learning Unified HRTF Representation. Yutong Wen, You Zhang, Zhiyao Duan. 1-5
- Neural Audio Decorrelation Using Generative Adversarial Networks. Carlotta Anemüller, Oliver Thiergart, Emanuël A. P. Habets. 1-5
- Relative Transfer Function Vector Estimation for Acoustic Sensor Networks Exploiting Covariance Matrix Structure. Wiebke Middelberg, Henri Gode, Simon Doclo. 1-5
- Efficient Deep Acoustic Echo Suppression with Condition-Aware Training. Ernst Seidel, Pejman Mowlaee, Tim Fingscheidt. 1-5
- Learning Sub-Dimensional HRTF Representations Towards Individualization Applications - Traditional and Deep Learning Approaches. Devansh Zurale, Shlomo Dubnov. 1-5
- Covariance Blocking and Whitening Method for Successive Relative Transfer Function Vector Estimation in Multi-Speaker Scenarios. Henri Gode, Simon Doclo. 1-5
- Multi-Source Direction-of-Arrival Estimation using Group-Sparse Fitting of Steered Response Power Maps. Elisa Tengan, Thomas Dietzen, Filip Elvander, Toon van Waterschoot. 1-5
- Directional Target Speaker Extraction under Noisy Underdetermined Conditions through Conditional Variational Autoencoder with Global Style Tokens. Rui Wang, Tomoki Toda. 1-5
- Perceptual Musical Similarity Metric Learning with Graph Neural Networks. Cyrus Vahidi, Shubhr Singh, Emmanouil Benetos, Huy Phan, Dan Stowell, György Fazekas, Mathieu Lagrange. 1-5
- Yet Another Generative Model for Room Impulse Response Estimation. Sungho Lee, Hyeong-Seok Choi, Kyogu Lee. 1-5
- Neural Networks for Interference Reduction in Multi-Track Recordings. Rajesh R, Padmanabhan Rajan. 1-5
- Pretraining Respiratory Sound Representations using Metadata and Contrastive Learning. Ilyass Moummad, Nicolas Farrugia. 1-5
- SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement. Martin Strauss, Nicola Pia, Nagashree K. S. Rao, Bernd Edler. 1-5
- Towards On-Device Keyword Spotting using Low-Footprint Quaternion Neural Models. Aryan Chaudhary, Vinayak Abrol. 1-5
- Optimizing Higher-Order Directional Audio Coding with Adaptive Mixing and Energy Matching for Ambisonic Compression and Upmixing. Christoph Hold, Leo McCormack, Archontis Politis, Ville Pulkki. 1-5
- Deep Adaptation Control for Stereophonic Acoustic Echo Cancellation. Amir Ivry, Israel Cohen, Baruch Berdugo. 1-5
- Histogram Layer Time Delay Neural Networks for Passive Sonar Classification. Jarin Ritu, Ethan Barnes, Riley Martell, Alexandra Van Dine, Joshua Peeples. 1-5
- Computing Acoustic Onsets Via an Eikonal Solver. Samuel F. Potter, Monte Hoover, Dmitry N. Zotkin, Ramani Duraiswami. 1-5
- Quaternion Anti-Transfer Learning for Speech Emotion Recognition. Eric Guizzo, Tillman Weyde, Giacomo Tarroni, Danilo Comminiello. 1-5
- Time-Domain Audio Source Separation Based on Gaussian Processes with Deep Kernel Learning. Aditya Arie Nugraha, Diego Di Carlo, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii. 1-5
- LACE: A Light-Weight, Causal Model for Enhancing Coded Speech Through Adaptive Convolutions. Jan Büthe, Jean-Marc Valin, Ahmed Mustafa. 1-5
- Inverted Cardioid Topology for Multi-Radius Spherical Microphone Arrays. Mark R. P. Thomas, Jan-Hendrik Hanschke. 1-5
- Region-of-Interest Oriented Constant-Beamwidth Beamforming with Rectangular Arrays. Gal Itzhak, Israel Cohen. 1-5
- Adaptive Sparse Linear Prediction in Fixed-Filter ANC Headphone Applications for Multi-Speaker Speech Reduction. Yurii Iotov, Sidsel Marie Nørholm, Valiantsin Belyi, Mads Græsbøll Christensen. 1-5
- Analysis of XLS-R for Speech Quality Assessment. Bastiaan Tamm, Rik Vandenberghe, Hugo Van Hamme. 1-5
- Hybrid Noise Shaping for Audio Coding Using Perfectly Overlapped Window. Byeongho Jo, Seungkwon Beack. 1-5
- General Purpose Audio Effect Removal. Matthew Rice, Christian J. Steinmetz, George Fazekas, Joshua D. Reiss. 1-5
- Low Bit Rate Binaural Link for Improved Ultra Low-Latency Low-Complexity Multichannel Speech Enhancement in Hearing Aids. Nils L. Westhausen, Bernd T. Meyer. 1-5
- Temporal Noise Shaping on MDCT Subband Signals for Transform Audio Coding. Richard Füg, Bernd Edler. 1-5
- Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani. 1-5
- Differentiable Representation of Warping Based on Lie Group Theory. Atsushi Miyashita, Tomoki Toda. 1-5
- Distribution of Modal Damping in Absorptive Shoebox Rooms. Maximilian Schäfer, Karolina Prawda, Rudolf Rabenstein, Sebastian J. Schlecht. 1-5
- CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models. Hao-Wen Dong, Xiaoyu Liu, Jordi Pons, Gautam Bhattacharya, Santiago Pascual, Joan Serrà, Taylor Berg-Kirkpatrick, Julian J. McAuley. 1-5
- A Differentiable Acoustic Guitar Model for String-Specific Polyphonic Synthesis. Andrew Wiggins, Youngmoo E. Kim. 1-5
- Array Configuration Mismatch in Deep DOA Estimation: Towards Robust Training. Ayal Schwartz, Elior Hadad, Sharon Gannot, Shlomo E. Chazan. 1-5
- Unsupervised Improvement of Audio-Text Cross-Modal Representations. Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fábio Ayres, Paris Smaragdis. 1-5
- Perceptual Quality Enhancement of Sound Field Synthesis Based on Combination of Pressure and Amplitude Matching. Keisuke Kimura, Shoichi Koyama, Hiroshi Saruwatari. 1-5
- An Improved Metric of Informational Masking for Perceptual Audio Quality Measurement. Pablo M. Delgado, Jürgen Herre. 1-5
- Multichannel Subband-Fullband Gated Convolutional Recurrent Neural Network for Direction-Based Speech Enhancement with Head-Mounted Microphone Arrays. Benjamin Stahl, Alois Sontacchi. 1-5
- Single Channel Speech Presence Probability Estimation based on Hybrid Global-Local Information. Shuai Tao, Yang Xiang, Himavanth Reddy, Jesper Rindom Jensen, Mads Græsbøll Christensen. 1-5
- An Objective Evaluation of Hearing Aids and DNN-Based Binaural Speech Enhancement in Complex Acoustic Scenes. Enric Gusó, Joanna Luberadzka, Martí Baig, Umut Sayin Saraç, Xavier Serra. 1-5
- All-in-One Metrical and Functional Structure Analysis with Neighborhood Attentions on Demixed Audio. Taejun Kim, Juhan Nam. 1-5
- Compressing Audio CNNs with Graph Centrality Based Filter Pruning. James A. King, Arshdeep Singh, Mark D. Plumbley. 1-5
- Leveraging Synthetic Data for Improving Chamber Ensemble Separation. Saurjya Sarkar, Louise Thorpe, Emmanouil Benetos, Mark Sandler. 1-5
- Sound Source Distance Estimation in Diverse and Dynamic Acoustic Conditions. Saksham Singh Kushwaha, Irán R. Román, Magdalena Fuentes, Juan Pablo Bello. 1-5
- Fitting Auditory Filterbanks with Multiresolution Neural Networks. Vincent Lostanlen, Daniel Haider, Han Han, Mathieu Lagrange, Péter Balázs, Martin Ehler. 1-5
- Extending Audio Masked Autoencoders toward Audio Restoration. Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji. 1-5
- Slim-TasNet: A Slimmable Neural Network for Speech Separation. Mohamed Elminshawi, Srikanth Raj Chetupalli, Emanuël A. P. Habets. 1-5
- Convolutive Block-Matching Segmentation Algorithm with Application to Music Structure Analysis. Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot. 1-5
- Low-Complexity Higher Order Scattering Delay Networks. Leny Vinceslas, Matteo Scerbo, Hüseyin Hacihabiboglu, Zoran Cvetkovic, Enzo De Sena. 1-5
- Music De-Limiter Networks Via Sample-Wise Gain Inversion. Chang-Bin Jeon, Kyogu Lee. 1-5
- Automatic Detection of Poor Tone Quality in Classical Guitar Playing Using Deep Anomaly Detection Method. Kenta Ogawa, Shun Sawada, Kouichi Katsurada, Hidehumi Ohmura. 1-5
- Predicting Thresholds in an Auditory Overshoot Paradigm Using a Computational Subcortical Model with Efferent Feedback. Afagh Farhadi, Laurel H. Carney. 1-5
- A Novel Method to Detect Instrumental Music in a Large Scale Music Catalog. Wo Jae Lee, Emanuele Coviello. 1-5
- Blind Room Acoustic Parameters Estimation Using Mobile Audio Transformer. Shivam Saini, Jürgen Peissig. 1-5
- Wide-Area 6DOF Rendering of Multi-Point Ambisonic Recordings Based on Interpolation of Spatial Parameters. Archontis Politis, Lauros Pajunen, Jussi Leppänen, Sujeet Mate, Antti J. Eronen. 1-5
- Location as Supervision for Weakly Supervised Multi-Channel Source Separation of Machine Sounds. Ricardo Falcón Pérez, Gordon Wichern, François G. Germain, Jonathan Le Roux. 1-5
- Class Activation Mapping-Driven Data Augmentation: Masking Significant Regions for Enhanced Acoustic Scene Classification. Pil Moo Byun, Jeong Hwan Choi, Joon-Hyuk Chang. 1-5
- Flexible Multichannel Speech Enhancement for Noise-Robust Frontend. Ante Jukic, Jagadeesh Balam, Boris Ginsburg. 1-5
- The Effect of Spoken Language on Speech Enhancement Using Self-Supervised Speech Representation Loss Functions. George Close, Thomas Hain, Stefan Goetze. 1-5
- Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects. Shoichi Koyama, Masaki Nakada, Juliano G. C. Ribeiro, Hiroshi Saruwatari. 1-5
- Diffusion Posterior Sampling for Informed Single-Channel Dereverberation. Jean-Marie Lemercier, Simon Welker, Timo Gerkmann. 1-5
- Annotating Jazz Recordings Using Lead Sheet Alignment with Deep Chroma Features. Ivan Shanin, Simon Dixon. 1-5
- Hyperbolic Unsupervised Anomalous Sound Detection. François G. Germain, Gordon Wichern, Jonathan Le Roux. 1-5
- Audio Inputs for Active Speaker Detection and Localization Via Microphone Array. Davide Berghi, Philip J. B. Jackson. 1-5
- A High-Rate Extension to SoundStream. Hong-Goo Kang, Jan Skoglund, W. Bastiaan Kleijn, Andrew Storus, Hengchin Yeh. 1-5
- Design of Frequency-Invariant Beamformers with Sparse Concentric Circular Arrays. Yaakov Buchris, Israel Cohen, Alon Amar. 1-5
- Bridging High-Quality Audio and Video Via Language for Sound Effects Retrieval from Visual Queries. Julia Wilkins, Justin Salamon, Magdalena Fuentes, Juan Pablo Bello, Oriol Nieto. 1-5
- Consolidating Compression and Revisiting Expansion: an Alternative Amplification Rule for Wide Dynamic Range Compression. Alice Sokolova, Baris Aksanli, Fred Harris, Harinath Garudadri. 1-5
- Robust Audio Anti-Spoofing System Based on Low-Frequency Sub-Band Information. Menglu Li, Xiao-Ping Zhang. 1-5
- Masked Frequency Modeling for Improving Packet Loss Concealment in Speech Transmission Systems. Da-Hee Yang, Donghyun Kim, Joon-Hyuk Chang. 1-5
- Estimating the Direction of Arrival of a Spoken Wake Word Using a Single Sensor on an Elastic Panel. Tre DiPassio, Michael C. Heilemann, Benjamin Thompson, Mark F. Bocko. 1-5
- Complete and Separate: Conditional Separation with Missing Target Source Attribute Completion. Dimitrios Bralios, Efthymios Tzinis, Paris Smaragdis. 1-5
- A Differentiable Image Source Model for Room Acoustics Optimization. Bowen Zhi, Alisha Sharma, Dmitry N. Zotkin, Ramani Duraiswami. 1-5
- Single-Channel Speaker Distance Estimation in Reverberant Environments. Michael Neri, Archontis Politis, Daniel Krause, Marco Carli, Tuomas Virtanen. 1-5
- Diff-Pitcher: Diffusion-Based Singing Voice Pitch Correction. Jiarui Hai, Mounya Elhilali. 1-5
- AECSQI: Referenceless Acoustic Echo Cancellation Measures Using Speech Quality and Intelligibility Improvement. Jin Woo Lee, Hyeong-Seok Choi, Kyogu Lee. 1-5
- Correlation Based Glimpse Proportion Index. Ahmed Alghamdi, Leonard Moen, Wai-Yip Chan, Daniel Fogerty, Jesper Jensen. 1-5
- Mixed-Delay Distributed Beamforming for Own-Speech Separation in Hearing Devices with Wireless Remote Microphones. Ryan M. Corey. 1-5