- Message from the Program Chair. Yu Tsao, Chi-Chun Lee.
- The Message of the O-COCOSDA Convenor. Sakriani Sakti.
- Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments With Advanced Post-Processing. Wenze Ren, Kuo-Hsuan Hung, Rong Chao, You-Jin Li, Hsin-Min Wang, Yu Tsao. 1-6
- Development of an English Oral Assessment System with the GEPT Dataset. Hao-Chien Lu, Chung-Chun Wang, Jhen-Ke Lin, Berlin Chen. 1-6
- A Neural Machine Translation System for the Low-Resource Sixian Hakka Language. Yi-Hsiang Hung, Yi-Chin Huang. 1-6
- Improving Real-Time Music Accompaniment Separation with MMDenseNet. Chun-Hsiang Wang, Chung-Che Wang, Jun-You Wang, Jyh-Shing Roger Jang, Yen-Hsun Chu. 1-6
- Developing a Thai Name Pronunciation Dictionary from Road Signs and Naming Websites. Ausdang Thangthai. 1-6
- Cao Robot for Taiwanese/English Knowledge Graph Application. Chang-Shing Lee, Mei-Hui Wang, Guan-Ying Tseng, Chao-Cyuan Yue, Hao-Chun Hsieh, Marek Z. Reformat. 1-6
- Using Automatic Speech Recognition for Speech Comprehension Evaluation in the Cochlear Implant. Hsin-Li Chang, Enoch Hsin-Ho Huang, Yi-Ching Wang, Yu Tsao. 1-5
- Gated Adapters with Balanced Activation for Effective Contextual Speech Recognition. Yu-Chun Liu, Yi-Cheng Wang, Li-Ting Pai, Jia-liang Lu, Berlin Chen. 1-6
- Chunk Size Scheduling for Optimizing the Quality-Latency Trade-off in Simultaneous Speech Translation. Iqbal Pahlevi Amin, Haotian Tan, Kurniawati Azizah, Sakriani Sakti. 1-6
- Unified Spoken Language Proficiency Assessment System. Sunil Kumar Kopparapu, Ashish Panda. 1-6
- Updated Activities on Resources Development for Vietnamese Speech and NLP. Do Van Hai, Luong Chi Mai. 1-6
- Annotation of Addressing Behavior in Multi-Party Conversation. Keisuke Kadota, Seima Oyama, Yasuharu Den. 1-6
- CL-Child Corpus: The Phonological Development of Putonghua in Children from Dialect-Speaking Regions. Jiewen Zheng, Tianxin Zheng, Mengxue Cao. 1-6
- A Preliminary Study on Taiwanese POS Taggers: Leveraging Chinese in the Absence of Taiwanese POS Annotation Datasets. Chao-Yang Chang, Yan-Ming Lin, Chih-Chung Kuo, Yen-Chun Lai, Chao-Shih Huang, Yuan-Fu Liao, Tsun-guan Thiann. 1-6
- Multi-Resolution Singing Voice Separation. Yih-Liang Shen, Ya-Ching Lai, Tai-Shih Chi. 1-6
- Age-Related and Gender-Related Differences in Cantonese Vowels. Wai-Sum Lee. 1-6
- VoxHakka: A Dialectally Diverse Multi-Speaker Text-to-Speech System for Taiwanese Hakka. Li-Wei Chen, Hung-Shin Lee, Chen-Chi Chang. 1-6
- A Deep Learning Based Approach with Data Augmentation for Infant Cry Sound Verification. Namita Gokavi, Padala Sri Ramulu, Kandregula Nanda Kishore, Sunil Saumya, Deepak K. T. 1-6
- Acoustic Realization of /S/ Across Accents of Urdu. Iram Fatima, Sahar Rauf. 1-6
- An Evaluation of Neural Vocoder-Based Voice Cloning System for Dysphonia Speech Disorder. Dhiya Dewangga, Dessi Puji Lestari, Ayu Purwarianti, Dipta Tanaya, Kurniawati Azizah, Sakriani Sakti. 1-7
- IIITSaint-EmoMDB: Carefully Curated Malayalam Speech Corpus with Emotion and Self-Reported Depression Ratings. Christa Thomas, Guneesh Vats, Aravind Johnson, Ashin George, Talit Sara George, Reni K. Cherian, Priyanka Srivastava, Chiranjeevi Yarra. 1-6
- IIIT-Speech Twins 1.0: An English-Hindi Parallel Speech Corpora for Speech-to-Speech Machine Translation and Automatic Dubbing. Anindita Mondal, Anil Kumar Vuppala, Chiranjeevi Yarra. 1-6
- Analysis of Pathological Features for Spoof Detection. Myat Aye Aye Aung, Hay Mar Soe Naing, Aye Mya Hlaing, Win Pa Pa, Kasorn Galajit, Candy Olivia Mawalim. 1-8
- 2024 Country Report Timor Leste. Satoshi Tamura, Aristidis de Jesus Ornai. 1-6
- InStant-EMDB: A Multi Model Spontaneous English and Malayalam Speech Corpora for Depression Detection. Anjali Mathew, Raniya, Harsha Sanjan, Amjith S. B, Reni K. Cherian, Starlet Ben Alex, Priyanka Srivastava, Chiranjeevi Yarra. 1-6
- Infant Cry Verification with Multi-View Self-Attention Vision Transformers. Kartik Jagtap, Namita Gokavi, Sunil Saumya. 1-6
- Continual Gated Adapter for Bilingual Codec Text-to-Speech. Li-Jen Yang, Jen-Tzung Chien. 1-6
- A Feedback-Driven Self-Improvement Strategy and Emotion-Aware Vocoder for Emotional Voice Conversion. Zhanhang Zhang, Sakriani Sakti. 1-6
- 2024 Philippine Country Report. Nathaniel Oco, Kenichiro Kurusu. 1-6
- UCSYSpoof: A Myanmar Language Dataset for Voice Spoofing Detection. Hay Mar Soe Naing, Win Pa Pa, Aye Mya Hlaing, Myat Aye Aye Aung, Kasorn Galajit, Candy Olivia Mawalim. 1-5
- Uncertainty-Based Ensemble Learning for Speech Classification. Bagus Tris Atmaja, Akira Sasou, Felix Burkhardt. 1-6
- Exploring Impact of Prioritizing Intra-Singer Acoustic Variations on Singer Embedding Extractor Construction for Singer Verification. Sayaka Toma, Tomoki Ariga, Yosuke Higuchi, Ichiju Hayasaka, Rie Shigyo, Tetsuji Ogawa. 1-6
- Depression Classification Using Log-Mel Spectrograms: A Comparative Analysis of Window Size-Based Data Augmentation and Deep Learning Models. Lokesh Kumar, Kumar Kaustubh, Shashaank Aswatha Mattur, S. R. Mahadeva Prasanna. 1-6
- Check Your Audio Data: Nkululeko for Bias Detection. Felix Burkhardt, Bagus Tris Atmaja, Anna Derington, Florian Eyben, Björn W. Schuller. 1-6
- Modeling Response Relevance using Dialog Act and Utterance-Design Features: A Corpus-Based Analysis. Mika Enomoto, Yuichi Ishimoto, Yasuharu Den. 1-6
- Learning Contrastive Emotional Nuances in Speech Synthesis. Bryan Gautama Ngo, Mahdin Rohmatillah, Jen-Tzung Chien. 1-6
- The Effectiveness of Audio-Visual Feedback for L2 Chinese Sentence Stress Perception and Production. Xingzi Gao, Yujie Gao, Sichang Gao. 1-6
- An N-Best List Selection Framework for ASR N-Best Rescoring. Chen-Han Wu, Kuan-Yu Chen. 1-6
- Research on the Temporal Effect of Focus on Trisyllabic Sequences in Leizhou Min. Maolin Wang, Ying Liu, Han Yu, Ziyu Xiong, Qiguang Lin. 1-6
- A Parameter-Efficient Multi-Step Fine-Tuning of Multilingual and Multi-Task Learning Model for Japanese Dialect Speech Recognition. Yuta Kamiya, Shogo Miwa, Atsuhiko Kai. 1-6
- Enhancing Indonesian Automatic Speech Recognition: Evaluating Multilingual Models with Diverse Speech Variabilities. Aulia Adila, Dessi Puji Lestari, Ayu Purwarianti, Dipta Tanaya, Kurniawati Azizah, Sakriani Sakti. 1-6
- Exploring Branchformer-Based End-to-End Speaker Diarization with Speaker-Wise VAD Loss. Pei-Ying Lee, HauYun Guo, Tien-Hong Lo, Berlin Chen. 1-6
- Comparative Study on the Phonetic Characteristics of Chinese Vowels Between Kyrgyz and Kirgiz Learners. Yuan Jia, Mingshuai Yin. 1-6
- Developing a Robust Mispronunciation Detection by Data Augmentation Based on Automatic Phone Annotation. Jong-In Kim, SunHee Kim, Minhwa Chung. 1-5
- Agent-Driven Large Language Models for Mandarin Lyric Generation. Hong-Hsiang Liu, Yi-Wen Liu. 1-6
- Analysis and Detection of Differences in Spoken User Behaviors Between Autonomous and Wizard-of-Oz Systems. Mikey Elmers, Koji Inoue, Divesh Lala, Keiko Ochi, Tatsuya Kawahara. 1-6
- Exemplar-Based Methods for Mandarin Electrolaryngeal Speech Voice Conversion. Hsin-Te Hwang, Chia-Hua Wu, Ming-Chi Yen, Yu Tsao, Hsin-Min Wang. 1-6
- Benchmarking Clickbait Detection from News Headlines. Ying-Lung Lin, Shao-Ying Lu, Ling-Chih Yu. 1-5
- WikiTND24: A Chinese Text Normalization Database. Wu-Hao Li, Chen-Yu Chiang. 1-5
- Exploration of Mongolian Word Stress Research Methods Based on Intonation Synthesis Technology. Aomin, Dahu Baiyila, Aijun Li. 1-7
- A Preliminary Study On End-to-End Multimodal Subtitle Recognition for Taiwanese TV Programs. Pei-Chung Su, Cheng-Hsiu Cho, Chih-Chung Kuo, Yen-Chun Lai, Yan-Ming Lin, Chao-Shih Huang, Yuan-Fu Liao. 1-6
- Clapping Hands to Word Stress Improves Children's L2 English Pronunciation Accuracy in a Word Imitation Task: Evidence from a Classroom Study. Meiyun Chen. 1-6
- Enhancing Phoneme Recognition in the Bengali Language Through Fine-Tuning of Multilingual Model. Akash Deep, Puja Bharati, Sabyasachi Chandra, Debolina Pramanik, Korra Siva Naik, Shayamal Kumar Das Mandal. 1-5
- The Development of LOTUS-TRD: A Thai Regional Dialect Speech Corpus. Sumonmas Thatphithakkul, Kwanchiva Thangthai, Vataya Chunwijitra. 1-6
- Indonesian-English Code-Switching Speech Synthesizer Utilizing Multilingual STEN-TTS and BERT LID. Ahmad Alfani Handoyo, Chung Tran, Dessi Puji Lestari, Sakriani Sakti. 1-6
- Construction of Large Language Models for Taigi and Hakka Using Transfer Learning. Yen-Chun Lai, Yi-Jun Zheng, Wen-Han Hsu, Yan-Ming Lin, Cheng-Hsiu Cho, Chih-Chung Kuo, Chao-Shih Huang, Yuan-Fu Liao. 1-6
- Improving Speech Recognition by Enhancing Accent Discrimination. Hao-Tian Zheng, Berlin Chen. 1-6
- Chinese Psychological Counseling Corpus Construction for Valence-Arousal Sentiment Intensity Prediction. Hsiu-Min Shih, Tzu-Mi Lin, Yu-Wen Tzeng, Jung-Ying Chang, Kuo-Kai Shyu, Lung-Hao Lee. 1-5
- Singer Separation for Karaoke Content Generation. Hsuan-Yu Lin, Xuanjun Chen, Jyh-Shing Roger Jang. 1-5
- An Investigation of Chinese Speech Under Alcohol Influence: Database Construction and Phonetic Analysis. Peppina Po-lun Lee, Mosi He, Bin Li. 1-5
- Overcoming The Impact of Different Materials on Optical Microphones For Speech Capture Using Deep Learning. Yi-Hao Jiang, Jia-Hui Li, Jia-Wei Chen, Yi-Chang Wu, Ying-Hui Lai. 1-5
- Computer-Assisted Pronunciation Training System for Atayal, an Indigenous Language in Taiwan. Yu-Lan Chuang, Hsiu-Ray Hsu, Di Tam Luu, Yi-Wen Liu, Ching-Ting Hsin. 1-6
- Analysis and Discussion of Feature Extraction Technology for Musical Genre Classification. Shu-Hua Chen, Wei-Ting Huang, Cheng-Hao Lai, Yu-Lun Lin, Ming-Hsiang Su. 1-4
- Continual Learning in Machine Speech Chain Using Gradient Episodic Memory. Geoffrey Tyndall, Kurniawati Azizah, Dipta Tanaya, Ayu Purwarianti, Dessi Puji Lestari, Sakriani Sakti. 1-6
- Comprehensive Benchmarking and Analysis of Open Pretrained Thai Speech Recognition Models. Pattara Tipakasorn, Oatsada Chatthong, Ren Yonehana, Kwanchiva Thangthai. 1-7
- Proposal of Protocols for Speech Materials Acquisition and Presentation Assisted By Tools Based on Structured Test Signals. Hideki Kawahara, Ken-Ichi Sakakibara, Mitsunori Mizumachi, Kohei Yatabe. 1-6
- Effects of Multiple Japanese Datasets for Training Voice Activity Projection Models. Yuki Sato, Yuya Chiba, Ryuichiro Higashinaka. 1-6
- Speech Watermarking for Tampering Detection Using Singular Spectrum Analysis With Quantization Index Modulation and Psychoacoustic Model. Pantarat Vichathai, Puchit Bunpleng, Patharapol Laolakkana, Sasiporn Usanavasin, Phondanai Khanti, Kasorn Galajit, Jessada Karnjana. 1-6
- ConvCounsel: A Conversational Dataset for Student Counseling. Po-Chuan Chen, Mahdin Rohmatillah, You-Teng Lin, Jen-Tzung Chien. 1-6
- Multilingual Speech Translator for Medical Consultation. Zhe-Jia Xu, Yeou-Jiunn Chen, Qian-Bei Hong. 1-5
- Right-Prominent Trisyllabic Tone Sandhi in Taifeng Chinese. Xiaoyan Zhang, Aijun Li, Zhiqiang Li. 1-5
- Fusion of Multiple Audio Descriptors for the Recognition of Dysarthric Speech. Komal Bharti, Pradip K. Das. 1-6
- Benchmarking Cognitive Domains for LLMs: Insights from Taiwanese Hakka Culture. Chen-Chi Chang, Ching-Yuan Chen, Hung-Shin Lee, Chih-Cheng Lee. 1-6
- A Study on the Acquisition of Triphthong Vowels by Altaic Chinese Learners Under the 'Belt and Road' Initiative. Yuan Jia, Linjiao Pan. 1-6
- Oriental COCOSDA - Country Report 2024 Language Resources Developed in Taiwan. Yuan-Fu Liao, Hsin-Min Wang. 1-6