- Preserving Language Heritage Through Speech Technology: The Case of Upper Sorbian. Ivan Kraljevski, Frank Duckhorn, Daniel Sobe, Constanze Tschöpe, Matthias Wolff. 3-22
- Retrospective and Perspectives of TTS & STT Technology Development and Implementation for South Slavic Under-Resourced Languages. Milan Secujski, Branislav M. Popovic, Darko Pekar, Niksa Jakovljevic, Edvin Pakoci, Sinisa Suzic, Tijana V. Nosek, Nikola Simic, Vuk Stanojev, Vlado Delic. 23-42
- Comparison of Well and Lower-Resourced Self-training in ASR. Yue Luo, Péter Mihajlik. 45-56
- Towards a Livvi-Karelian End-to-End ASR System. Irina S. Kipyatkova, Ildar Kagirov, Mikhail Dolgushin, Alexandra Rodionova. 57-68
- Advances in OpenASR21 Evaluation with Increased Temporal Resolution for Speech Self-supervised Learning Models. Vishwa Gupta. 69-81
- Benchmarking Whisper Under Diverse Audio Transformations and Real-Time Constraints. Sergei Katkov, Antonio Liotta, Alessandro Vietti. 82-91
- AutoMode-ASR: Learning to Select ASR Systems for Better Quality and Cost. Ahmet Gunduz, Yunsu Kim 0005, Kamer Ali Yuksel, Mohamed Al-Badrashiny, Thiago Castro Ferreira, Hassan Sawaf. 92-103
- Pre-training and Adverse Audio Samples for Data-Efficient Wake Word Detection. Manuel Torralbo, Ariane Méndez, Maia Agirre, Arantza del Pozo. 104-118
- Cross-Lingual Summarization of Speech-to-Speech Translation: A Baseline. Pranav Karande, Balaram Sarkar, Chandresh Kumar Maurya. 119-133
- The ParlaSpeech Collection of Automatically Generated Speech and Text Datasets from Parliamentary Proceedings. Nikola Ljubesic, Peter Rupnik, Danijel Korzinek. 137-150
- ESC Corpus of Spoken Russian: Everyday Student Conversations Captured Through Continuous Speech Recording in Natural Communicative Environments. Tatiana Y. Sherstinova, Irina Petrova 0001. 151-162
- OpenAV: Bilingual Dataset for Audio-Visual Voice Control of a Computer for Hand Disabled People. Denis Ivanko, Dmitry Ryumin, Alexandr Axyonov, Alexey M. Kashevnik, Alexey Karpov 0001. 163-173
- Bulgarian Speech Resources in the CHILDES System. Velka Popova, Dimitar Popov. 174-186
- Multiword Units in Russian Everyday Speech: Empirical Classification and Corpus-Based Studies. Natalia Bogdanova-Beglarian, Olga Blinova, Maria Khokhlova, Tatiana Y. Sherstinova, Tatiana I. Popova. 187-200
- Neurophysiological Correlates of Textual Modulation in Visual Stimuli: An Experimental Study of Russian and English Memes. Rodmonga Potapova, Vsevolod Potapov, Ekaterina Karimova, Leonid Motovskikh, Nikolay Bobrov. 201-215
- End-to-End Speech Synthesis for the Serbian Language Based on Tacotron. Tijana V. Nosek, Sinisa Suzic, Milan Secujski, Vuk Stanojev, Darko Pekar, Vlado Delic. 219-229
- ChildTinyTalks (CTT): A Benchmark Dataset and Baseline for Expressive Child Speech Synthesis. Shaimaa Alwaisi, Mohammed Salah Al-Radhi, Géza Németh. 230-240
- Multidimensional Rhythm: Comparing Rhythmic Properties of Australian and New Zealand Monologues. Anna Borzykh, Tatiana Shevchenko. 241-250
- Influence of Linguistic and Sociolinguistic Factors on Speech Rate Perception. Anastasia Ananeva, Uliana E. Kochetkova. 251-264
- Human and Machine Keyphrase Perception in Russian Text and Speech. Daria Guseva, Olga Mitrofanova, Mikhail Dolgushin. 265-280
- Assessment of Children's Ability to Manifest Emotions in Facial Expressions, Voice and Speech by Humans, Automatic, and on a Likert Scale. Elena E. Lyakso, Olga V. Frolova, Anton Matveev, Aleksandr Nikolaev, Ruban Nersisson. 281-294
- Investigating the Utility of wav2vec 2.0 Hidden Layers for Detecting Multiple Sclerosis. Gábor Gosztolya, László Tóth 0001, Veronika Svindt, Judit Bóna, Ildikó Hoffmann. 297-308
- Cross-Cultural Automatic Depression Detection Based on Audio Signals. Danila Mamontov, Sebastian Zepf, Alexey Karpov 0001, Wolfgang Minker. 309-323
- Depression Classification Using Token Merging-Based Speech Spectrotemporal Transformer. Lokesh Kumar, Kumar Kaustubh, S. R. Mahadeva Prasanna. 324-335
- Detecting Depression from Audio Data. Mary Idamkina, Andrea Corradini. 336-351
- Binary and Multiclass Classification of Dysphonia Using Whisper Encoder and One-Dimensional Convolutional Neural Network. Dosti Aziz, Dávid Sztahó. 352-366
- Approach to Assessing the Quality of Syllable Pronunciation by Patients in the Process of Speech Rehabilitation Based on Comparison with Healthy Speakers. German Egle, Dariya Novokhrestova, Svetlana Tomilina, Evgeny Kostyuchenko. 367-376
- A Comparative Study for Contextualized Spoken Answer Classification in German Medical Questionnaires. Philipp L. Harnisch, Daniel Schuhmann, Stefan Hillmann. 377-391