ASRU (asru)
Publications
Viewing publications 1-100 of 1123
2023
Towards Developing State-of-The-Art TTS Synthesisers for 13 Indian Languages with Signal Processing Aided Alignments
Anusha Prakash 0001, Srinivasan Umesh, Hema A. Murthy. asru 2023: 1-8 [doi]
NeuralEcho: Hybrid of Full-Band and Sub-Band Recurrent Neural Network For Acoustic Echo Cancellation and Speech Enhancement
Meng Yu 0003, Yong Xu 0004, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu 0001. asru 2023: 1-8 [doi]
Multi Transcription-Style Speech Transcription Using Attention-Based Encoder-Decoder Model
Yan Huang 0028, Piyush Behre, Guoli Ye, Shawn Chang, Yifan Gong 0001. asru 2023: 1-6 [doi]
IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023, Taipei, Taiwan, December 16-20, 2023
IEEE, 2023. [doi]
Dialect Adaptation and Data Augmentation for Low-Resource ASR: TalTech Systems for the MADASR 2023 Challenge
Tanel Alumäe, Jiaming Kong, Daniil Robnikov. asru 2023: 1-7 [doi]
Wiki-En-ASR-Adapt: Large-Scale Synthetic Dataset for English ASR Customization
Alexandra Antonova. asru 2023: 1-8 [doi]
Improved Long-Form Speech Recognition By Jointly Modeling The Primary And Non-Primary Speakers
Guru Prakash Arumugam, Shuo-Yiin Chang, Tara N. Sainath, Rohit Prabhavalkar, Quan Wang, Shaan Bijwadia. asru 2023: 1-8 [doi]
Importance of Smoothness Induced by Optimizers in FL4ASR: Towards Understanding Federated Learning for End-To-End ASR
Sheikh Shams Azam, Tatiana Likhomanenko, Martin Pelikan, Jan Honza Silovsky. asru 2023: 1-8 [doi]
Enabling Noisy Label Usage for Out-of-Airspace Data in Read-Back Error Detection
Lakshmi Rajendram Bashyam, Alexander Blatt, Dietrich Klakow. asru 2023: 1-8 [doi]
Pareto Efficiency of Learning-Forgetting Trade-Off in Neural Language Model Adaptation
Jerome R. Bellegarda. asru 2023: 1-8 [doi]
Ending the Blind Flight: Analyzing the Impact of Acoustic and Lexical Factors on wav2vec 2.0 in Air-Traffic Control
Alexander Blatt, Badr M. Abdullah, Dietrich Klakow. asru 2023: 1-8 [doi]
Efficient Cascaded Streaming ASR System Via Frame Rate Reduction
Xingyu Cai, David Qiu, Shaojin Ding, Dongseong Hwang, Weiran Wang, Antoine Bruguier, Rohit Prabhavalkar, Tara N. Sainath, Yanzhang He. asru 2023: 1-8 [doi]
Prompting and Adapter Tuning For Self-Supervised Encoder-Decoder Speech Model
Kai-Wei Chang, Ming-Hsin Chen, Yun-Ping Lin, Jing Neng Hsu, Paul Kuo-Ming Huang, Chien-Yu Huang, Shang-wen Li 0001, Hung-yi Lee. asru 2023: 1-8 [doi]
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition
Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose. asru 2023: 1-8 [doi]
Few-Shot Spoken Language Understanding Via Joint Speech-Text Models
Chung-Ming Chien, Mingjiamei Zhang, Ju-Chieh Chou, Karen Livescu. asru 2023: 1-8 [doi]
Adversarial Augmentation For Adapter Learning
Jen-Tzung Chien, Wei-Yu Sun. asru 2023: 1-7 [doi]
Extending Self-Distilled Self-Supervised Learning For Semi-Supervised Speaker Verification
Jeong Hwan Choi, Jehyun Kyung, Ju-Seok Seong, Ye-Rin Jeoung, Joon-Hyuk Chang. asru 2023: 1-8 [doi]
Evaluating Self-Supervised Speech Models on a Taiwanese Hokkien Corpus
Yi-Hui Chou, Kalvin Chang, Meng-Ju Wu, Winston Ou, Alice Wen-Hsin Bi, Carol Yang, Bryan Y. Chen, Rong-Wei Pai, Po-Yen Yeh, Jo-Peng Chiang, Iu-Tshian Phoann, Winnie Chang, Chenxuan Cui, Noel Chen, Jiatong Shi. asru 2023: 1-7 [doi]
Improving Audiovisual Active Speaker Detection in Egocentric Recordings with the Data-Efficient Image Transformer
Jason Clarke, Yoshihiko Gotoh, Stefan Goetze. asru 2023: 1-8 [doi]
Can Unpaired Textual Data Replace Synthetic Speech in ASR Model Adaptation?
Pasquale D'Alterio, Christian Hensel, Bashar Awwad Shiekh Hasan. asru 2023: 1-8 [doi]
Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding
Pavel Denisov, Ngoc Thang Vu. asru 2023: 1-8 [doi]
Generalized Zero-Shot Audio-to-Intent Classification
Veera Raghavendra Elluru, Devang Kulshreshtha, Rohit Paturi, Sravan Bodapati, Srikanth Ronanki. asru 2023: 1-8 [doi]
MUST: A Multilingual Student-Teacher Learning Approach for Low-Resource Speech Recognition
Muhammad Umar Farooq, Rehan Ahmad, Thomas Hain. asru 2023: 1-6 [doi]
CAMSAT: Augmentation Mix and Self-Augmented Training Clustering for Self-Supervised Speaker Recognition
Abderrahim Fathan, Jahangir Alam. asru 2023: 1-8 [doi]
No Pitch Left Behind: Addressing Gender Unbalance In Automatic Speech Recognition Through Pitch Manipulation
Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli. asru 2023: 1-8 [doi]
LV-CTC: Non-Autoregressive ASR With CTC and Latent Variable Models
Yuya Fujita, Shinji Watanabe 0001, Xuankai Chang, Takashi Maekaku. asru 2023: 1-6 [doi]
Robust End-to-End Diarization with Domain Adaptive Training and Multi-Task Learning
Ivan Fung, Lahiru Samarakoon, Samuel J. Broughton. asru 2023: 1-7 [doi]
GPU-Accelerated WFST Beam Search Decoder for CTC-Based Speech Recognition
Daniel Galvez, Tim Kaldewey. asru 2023: 1-7 [doi]
Robust Recognition of Speaker Emotion With Difference Feature Extraction Using a Few Enrollment Utterances
Daichi Hayakawa, Takehiko Kagoshima, Kenji Iwata, Norbert Braunschweiler, Rama Doddipatla. asru 2023: 1-7 [doi]
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization
Wei-Ping Huang, Sung-Feng Huang, Hung-yi Lee. asru 2023: 1-8 [doi]
Simulation of Teacher-Learner Interaction in English Language Pronunciation Learning
Elaf Islam, Thomas Hain, Protima Nomo Sudro. asru 2023: 1-6 [doi]
Model-Based Fairness Metric for Speaker Verification
Maliha Jahan, Laureano Moro-Velázquez, Thomas Thebaud, Najim Dehak, Jesús Villalba 0001. asru 2023: 1-7 [doi]
Summarize While Translating: Universal Model With Parallel Decoding for Summarization and Translation
Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Kohei Matsuura, Takanori Ashihara, William Chen, Shinji Watanabe 0001. asru 2023: 1-8 [doi]
A Token-Wise Beam Search Algorithm for RNN-T
Gil Keren. asru 2023: 1-8 [doi]
Pseudo-Label Based Supervised Contrastive Loss for Robust Speech Representations
Varun Krishna, Sriram Ganapathy. asru 2023: 1-8 [doi]
Towards General-Purpose Text-Instruction-Guided Voice Conversion
Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-Yiin Chang, Hung-yi Lee. asru 2023: 1-8 [doi]
Audio-Visual Neural Syntax Acquisition
Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David D. Cox, David Harwath, Yang Zhang 0001, Karen Livescu, James R. Glass. asru 2023: 1-8 [doi]
Cross-Modal Learning for CTC-Based ASR: Leveraging CTC-BERTScore and Sequence-Level Training
Mun-Hak Lee, Sang-Eon Lee, Ji-Eun Choi, Joon-Hyuk Chang. asru 2023: 1-8 [doi]
AWMC: Online Test-Time Adaptation Without Mode Collapse for Continual Adaptation
Jae Hong Lee, Do-Hee Kim, Joon-Hyuk Chang. asru 2023: 1-8 [doi]
LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models
Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao 0001. asru 2023: 1-8 [doi]
Exploring the Viability of Synthetic Audio Data for Audio-Based Dialogue State Tracking
Jihyun Lee, Yejin Jeon, Wonjun Lee, Yunsu Kim 0001, Gary Geunbae Lee. asru 2023: 1-8 [doi]
Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition And Phoneme To Grapheme Translation
Wonjun Lee, Gary Geunbae Lee, Yunsu Kim 0001. asru 2023: 1-8 [doi]
YODAS: YouTube-Oriented Dataset for Audio and Speech
Xinjian Li, Shinnosuke Takamichi, Takaaki Saeki, William Chen, Sayaka Shiota, Shinji Watanabe 0001. asru 2023: 1-8 [doi]
AFTER: Active Learning Based Fine-Tuning Framework for Speech Emotion Recognition
Dongyuan Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura. asru 2023: 1-8 [doi]
Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection
Jiachen Lian, Carly Feng, Naasir Farooqi, Steve Li, Anshul Kashyap, Cheol Jun Cho, Peter Wu, Robbie Netzorg, Tingle Li, Gopala Krishna Anumanchipalli. asru 2023: 1-8 [doi]
AV-data2vec: Self-Supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations
Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli. asru 2023: 1-8 [doi]
Zero-Shot Domain-Sensitive Speech Recognition with Prompt-Conditioning Fine-Tuning
Feng-Ting Liao, Yung-Chieh Chan, Yi-Chang Chen, Chan-Jan Hsu, Da-shan Shiu. asru 2023: 1-8 [doi]
MelHuBERT: A Simplified HuBERT on Mel Spectrograms
Tzu-Quan Lin, Hung-yi Lee, Hao Tang 0002. asru 2023: 1-8 [doi]
Reducing the Cost of Spoof Detection Labeling using Mixed-Strategy Active Learning and Pretrained Models
Mark Lindsey, Nathaniel R. Robinson, Francis Kubala, Richard M. Stern. asru 2023: 1-7 [doi]
Improved Multi-Modal Emotion Recognition Using Squeeze-and-Excitation Block in Cross-Modal Attention
Junchen Liu, Jesin James, Karan Nathwani. asru 2023: 1-8 [doi]
Cross-Modal Alignment With Optimal Transport For CTC-Based ASR
Xugang Lu, Peng Shen, Yu Tsao 0001, Hisashi Kawai. asru 2023: 1-7 [doi]
End-To-End Training of a Neural HMM with Label and Transition Probabilities
Daniel Mann, Tina Raissi, Wilfried Michel, Ralf Schlüter, Hermann Ney. asru 2023: 1-8 [doi]
Deriving Translational Acoustic Sub-Word Embeddings
Amit Meghanani, Thomas Hain. asru 2023: 1-8 [doi]
LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of End-to-End ASR Models
Aleksandr Meister, Matvei Novikov, Nikolay Karpov, Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg. asru 2023: 1-7 [doi]
Identifying People with Mild Cognitive Impairment at Risk of Developing Dementia using Speech Analysis
Bahman Mirheidari, Ronan O'Malley, Daniel Blackburn, Heidi Christensen. asru 2023: 1-6 [doi]
Knowledge Distillation From Offline to Streaming Transducer: Towards Accurate and Fast Streaming Model by Matching Alignments
Ji-Hwan Mo, Jae-Jin Jeon, Mun-Hak Lee, Joon-Hyuk Chang. asru 2023: 1-7 [doi]
Combining Relative and Absolute Learning Formulations to Predict Emotional Attributes From Speech
Abinay Reddy Naini, Shruthi Subramanium, Seong-Gyun Leem, Carlos Busso. asru 2023: 1-8 [doi]
PerMod: Perceptually Grounded Voice Modification With Latent Diffusion Models
Robin Netzorg, Ajil Jalal, Luna McNulty, Gopala Krishna Anumanchipalli. asru 2023: 1-8 [doi]
Audio-AdapterFusion: A Task-Id-Free Approach for Efficient and Non-Destructive Multi-Task Speech Recognition
Hillary Ngai, Rohan Agrawal, Neeraj Gaur, W. Ronny Huang, Parisa Haghani, Pedro Moreno Mengibar. asru 2023: 1-8 [doi]
Multitask Learning Model with Text and Speech Representation for Fine-Grained Speech Scoring
Seongjin Park, Rutuja Ubale. asru 2023: 1-7 [doi]
The Role of Feature Correlation on Quantized Neural Networks
David Qiu, Shaojin Ding, Yanzhang He. asru 2023: 1-7 [doi]
Can We Use Speaker Embeddings On Spontaneous Speech Obtained From Medical Conversations To Predict Intelligibility?
Sebastião Quintas, Mathieu Balaguer, Julie Mauclair, Virginie Woisard, Julien Pinquier. asru 2023: 1-7 [doi]
Robust Logarithmic Champernowne Algorithm for Feedback Cancellation in Hearing Aids
Vanitha Devi R, Vasundhara. asru 2023: 1-5 [doi]
MASR: Multi-Label Aware Speech Representation
Anjali Raj, Shikhar Bharadwaj, Sriram Ganapathy, Min Ma, Shikhar Vashishth. asru 2023: 1-8 [doi]
Paraconsistent Feature Analysis for the Competency Evaluation of Voice Impersonation
Rajeev Rajan, Noumida Abdul Kareem, Sreelakshmi S. asru 2023: 1-7 [doi]
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
William Ravenscroft, Stefan Goetze, Thomas Hain. asru 2023: 1-7 [doi]
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition
Nick Rossenbach, Benedikt Hilmes, Ralf Schlüter. asru 2023: 1-8 [doi]
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, And Extraction
Kohei Saijo, Wangyou Zhang, Zhong-qiu Wang, Shinji Watanabe 0001, Tetsunori Kobayashi, Tetsuji Ogawa. asru 2023: 1-6 [doi]
Leveraging the Multilingual Indonesian Ethnic Languages Dataset In Self-Supervised Models for Low-Resource ASR Task
Sakriani Sakti, Benita Angela Titalim. asru 2023: 1-8 [doi]
Transformer Attractors for Robust and Efficient End-To-End Neural Diarization
Lahiru Samarakoon, Samuel J. Broughton, Marc Härkönen, Ivan Fung. asru 2023: 1-8 [doi]
Invert-Classify: Recovering Discrete Prosody Inputs for Text-To-Speech
Nicholas Sanders, Korin Richmond. asru 2023: 1-7 [doi]
HEVAL: A New Hybrid Evaluation Metric for Automatic Speech Recognition Tasks
Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar, Supreeth Rao. asru 2023: 1-7 [doi]
ESPnet-SUMM: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems
Roshan S. Sharma, William Chen, Takatomo Kano, Ruchira Sharma, Siddhant Arora, Shinji Watanabe 0001, Atsunori Ogawa, Marc Delcroix, Rita Singh, Bhiksha Raj. asru 2023: 1-8 [doi]
Generative Linguistic Representation for Spoken Language Identification
Peng Shen, Xuguang Lu, Hisashi Kawai. asru 2023: 1-8 [doi]
Findings of the 2023 ML-SUPERB Challenge: Pre-Training And Evaluation Over More Languages And Beyond
Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-wen Li 0001, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe 0001. asru 2023: 1-8 [doi]
Domain Adaptation by Data Distribution Matching Via Submodularity For Speech Recognition
Yusuke Shinohara, Shinji Watanabe 0001. asru 2023: 1-7 [doi]
Discriminative Speech Recognition Rescoring With Pre-Trained Language Models
Prashanth Gurunath Shivakumar, Jari Kolehmainen, Yile Gu, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko. asru 2023: 1-7 [doi]
Detecting Speech Abnormalities With a Perceiver-Based Sequence Classifier that Leverages a Universal Speech Model
Hagen Soltau, Izhak Shafran, Alex Ottenwess, Joseph R. Duffy, Rene L. Utianski, Leland R. Barnard, John L. Stricker, Daniela A. Wiepert, David T. Jones, Hugo Botha. asru 2023: 1-7 [doi]
Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference
Masao Someki, Nicholas Eng, Yosuke Higuchi, Shinji Watanabe 0001. asru 2023: 1-8 [doi]
Contextual Spelling Correction with Large Language Models
Gan Song, Zelin Wu, Golan Pundak, Angad Chandorkar, Kandarp Joshi, Xavier Velez, Diamantino Caseiro, Ben Haynor, Weiran Wang, Nikhil Siddhartha, Pat Rondon, Khe Chai Sim. asru 2023: 1-8 [doi]
Joint Energy-Based Model for Robust Speech Classification System Against Dirty-Label Backdoor Poisoning Attacks
Martin Sustek, Sonal Joshi, Henry Li, Thomas Thebaud, Jesús Villalba 0001, Sanjeev Khudanpur, Najim Dehak. asru 2023: 1-8 [doi]
Thai-Dialect: Low Resource Thai Dialectal Speech to Text Corpora
Artit Suwanbandit, Jaturong Chitiyaphol, Sutthinan Chuenchom, Kanyarat Kwiecien, Husen Sawal, Ruslan Uthai, Orathai Sangpetch, Ekapol Chuangsuwanich. asru 2023: 1-8 [doi]
Clustering Unsupervised Representations as Defense Against Poisoning Attacks on Speech Commands Classification System
Thomas Thebaud, Sonal Joshi, Henry Li, Martin Sustek, Jesús Villalba 0001, Sanjeev Khudanpur, Najim Dehak. asru 2023: 1-8 [doi]
ECAPA2: A Hybrid Neural Network Architecture and Training Strategy for Robust Speaker Embeddings
Jenthe Thienpondt, Kris Demuynck. asru 2023: 1-8 [doi]
Hierarchical Attention-Based Contextual Biasing For Personalized Speech Recognition Using Neural Transducers
Sibo Tong, Philip Harding, Simon Wiesler. asru 2023: 1-8 [doi]
Gated Multi Encoders and Multitask Objectives for Dialectal Speech Recognition in Indian Languages
Sathvik Udupa, Jesuraja Bandekar, Deekshitha G, Saurabh Kumar, Prasanta Kumar Ghosh, Sandhya Badiger, Abhayjeet Singh, Savitha Murthy, Priyanka Pai, Srinivasa Raghavan K. M., Raoul Nanavati. asru 2023: 1-8 [doi]
Parameter-Efficient Tuning with Adaptive Bottlenecks for Automatic Speech Recognition
Geoffroy Vanderreydt, Amrutha Prasad, Driss Khalil, Srikanth R. Madikeri, Kris Demuynck, Petr Motlícek. asru 2023: 1-7 [doi]
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang. asru 2023: 1-6 [doi]
Speech Emotion Diarization: Which Emotion Appears When?
Yingzhi Wang, Mirco Ravanelli, Alya Yacoubi. asru 2023: 1-7 [doi]
Zero-Shot Singing Voice Synthesis from Musical Score
Jun-You Wang, Hung-yi Lee, Jyh-Shing Roger Jang, Li Su. asru 2023: 1-8 [doi]
Adapting Pretrained Speech Model for Mandarin Lyrics Transcription and Alignment
Jun-You Wang, Chon-In Leong, Yu-Chen Lin, Li Su, Jyh-Shing Roger Jang. asru 2023: 1-8 [doi]
COCO-NUT: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control
Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Wataru Nakata, Detai Xin, Hiroshi Saruwatari. asru 2023: 1-8 [doi]
Not All Errors Are Created Equal: Evaluating The Impact of Model and Speaker Factors on ASR Outcomes in Clinical Populations
Daniela A. Wiepert, Rene L. Utianski, Joseph R. Duffy, John L. Stricker, Leland Barnard, Keith A. Josephs, Jennifer L. Whitwell, David T. Jones, Hugo Botha. asru 2023: 1-6 [doi]
Variational Gaussian Process Data Uncertainty
Jeremy Heng Meng Wong, Huayun Zhang, Nancy F. Chen. asru 2023: 1-8 [doi]
Towards Matching Phones and Speech Representations
Gene-Ping Yang, Hao Tang 0002. asru 2023: 1-8 [doi]
Towards Robust Packet Loss Concealment System With ASR-Guided Representations
Da-Hee Yang, Joon-Hyuk Chang. asru 2023: 1-8 [doi]
Generative Speech Recognition Error Correction With Large Language Models and Task-Activating Prompting
Chao-Han Huck Yang, Yile Gu, Yi-Chieh Liu, Shalini Ghosh, Ivan Bulyko, Andreas Stolcke. asru 2023: 1-8 [doi]
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen 0001. asru 2023: 1-7 [doi]
Investigating The Effect of Language Models in Sequence Discriminative Training For Neural Transducers
Zijian Yang, Wei Zhou 0043, Ralf Schlüter, Hermann Ney. asru 2023: 1-8 [doi]
Consistency Based Unsupervised Self-Training for ASR Personalisation
Jisi Zhang, Vandana Rajan, Haaris Mehmood, David Tuckey, Pablo Peso Parada, Md Asif Jalal, Karthikeyan Saravanan, Gil Ho Lee, Jungin Lee, Seokyeong Jung. asru 2023: 1-8 [doi]