Speech and Computer - 25th International Conference, SPECOM 2023, Dharwad, India, November 29 - December 2, 2023, Proceedings, Part I

Alexey Karpov 0001, K. Samudravijaya, K. T. Deepak, Rajesh M. Hegde, Shyam S. Agrawal, S. R. Mahadeva Prasanna, editors, Speech and Computer - 25th International Conference, SPECOM 2023, Dharwad, India, November 29 - December 2, 2023, Proceedings, Part I. Volume 14338 of Lecture Notes in Computer Science, Springer, 2023.

Conference: specom2023

Extreme Learning Layer: A Boost for Spoken Digit Recognition with Spiking Neural Networks. Ivan Peralta, Nanci Odetti, Hugo Leonardo Rufiner. 3-17

EMO-AVSR: Two-Level Approach for Audio-Visual Emotional Speech Recognition. Denis Ivanko, Elena Ryumina, Dmitry Ryumin, Alexandr Axyonov, Alexey M. Kashevnik, Alexey Karpov 0001. 18-31

Significance of Audio Quality in Speech-to-Text Translation Systems. Tonmoy Rajkhowa, Amartya Chowdhury, S. R. Mahadeva Prasanna. 32-42

Everyday Conversations: A Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level. Tatiana Y. Sherstinova, Rostislav Kolobov, Nikolay Mikhaylovskiy. 43-56

Improving Automatic Speech Recognition with Dialect-Specific Language Models. Raj Gothi, Preeti Rao. 57-67

Emotional Speech Recognition of Holocaust Survivors with Deep Neural Network Models for Russian Language. Liudmila Bukreeva, Daria Guseva, Mikhail Dolgushin, Vera Evdokimova, Vasilisa Obotnina. 68-76

Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks. Mercedes Vetráb, Gábor Gosztolya. 79-93

Rhythm Formant Analysis for Automatic Depression Classification. Kumar Kaustubh, Parismita Gogoi, S. R. M. Prasanna. 94-106

Determining Alcohol Intoxication Based on Speech and Neural Networks. Pavel Laptev, Sergey Litovkin, Evgeny Kostyuchenko. 107-115

Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition. Baveet Singh Hora, S. Uthiraa, Hemant A. Patil. 116-129

Enhancing Stutter Detection in Speech Using Zero Time Windowing Cepstral Coefficients and Phase Information. Narasinga Vamshi Raghu Simha, Mirishkar Sai Ganesh, Anil Kumar Vuppala. 130-141

Source and System-Based Modulation Approach for Fake Speech Detection. Rishith Sadashiv T. N., Devesh Kumar, Ayush Agarwal, Moakala Tzudir, Jagabandhu Mishra, S. R. Mahadeva Prasanna. 142-155

Investigation of Different Calibration Methods for Deep Speaker Embedding Based Verification Systems. Sergey Novoselov, Galina Lavrentyeva, Vladimir Volokhov, Marina Volkova, Nikita Khmelev, Artem Akulov. 159-168

Learning to Predict Speech Intelligibility from Speech Distortions. Punnoose Kuriakose. 169-176

Sparse Representation Frameworks for Acoustic Scene Classification. Akansha Tyagi, Padmanabhan Rajan. 177-188

Driver Speech Detection in Real Driving Scenario. Mrinmoy Bhattacharjee, Shikha Baghel, S. R. Mahadeva Prasanna. 189-199

Regularization Based Incremental Learning in TCNN for Robust Speech Enhancement Targeting Effective Human Machine Interaction. Kamini Sabu, Mukesh Sharma, Nitya Tiwari, M. Ali Basha Shaik. 200-209

Candidate Speech Extraction from Multi-speaker Single-Channel Audio Interviews. Meghna Pandharipande, Sunil Kumar Kopparapu. 210-221

Post-processing of Translated Speech by Pole Modification and Residual Enhancement to Improve Perceptual Quality. Lalaram Arya, S. R. Mahadeva Prasanna. 222-232

Region Normalized Capsule Network Based Generative Adversarial Network for Non-parallel Voice Conversion. MD. Tousin Akhter, Padmanabha Banerjee, Sandipan Dhar, Subhayu Ghosh, Nanda Dulal Jana. 233-244

Speech Enhancement Using LinkNet Architecture. Anuj Patel, G. Satya Prasad, Sabyasachi Chandra, Puja Bharati, Shyamal Kumar Das Mandal. 245-257

ATT: Adversarial Trained Transformer for Speech Enhancement. Aniket Aitawade, Puja Bharati, Sabyasachi Chandra, G. Satya Prasad, Debolina Pramanik, Parth Sanjay Khadse, Shyamal Kumar Das Mandal. 258-270

Human Identification by Dynamics of Changes in Brain Frequencies Using Artificial Neural Networks. Daniyar Wolf, Yaroslav Turovsky, Roman V. Meshcheryakov, Anastasia Iskhakova. 271-284

Analysis of Formant Trajectories of a Speech Signal for the Purpose of Forensic Identification of a Foreign Speaker. Rodmonga Potapova, Vsevolod Potapov, Irina Kuryanova. 287-300

Gestures vs. Prosodic Structure in Laboratory Ironic Speech. Polina Vasileva, Uliana Kochetkova, Pavel A. Skrelin. 301-313

Sounds of Silence: Acoustics of Inhalation in Read Speech. Priyankoo Sarmah, Wendy Lalhminghlui, Neeraj Kumar Sharma 0001. 314-321

Prolongations as Hesitation Phenomena in Spoken Speech in First and Second Language. Natalia Bogdanova-Beglarian, Kristina Zaides, Daria Stoika, Xiaoli Sun. 322-338

Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation. Priyanshi Pal, Shelly Jain, Chiranjeevi Yarra, Prasanta Kumar Ghosh, Anil Kumar Vupalla. 339-349

Multimodal Collaboration in Expository Discourse: Verbal and Nonverbal Moves Alignment. Olga Iriskhanova, Maria Kiose, Anna Leonteva, Olga Agafonova, Andrey Petrov. 350-363

Association of Time Domain Features with Oral Cavity Configuration During Vowel Production and Its Application in Vowel Recognition. Arup Saha, Tulika Basu, Bhaskar Gupta. 364-379

Prosodic Interaction Models in a Conversation. Anastasia Gorbyleva. 380-388

Development and Research of Dialogue Agents with Long-Term Memory and Web Search. Kirill Apanasovich, Olesia Makhnytkina, Yuri Matveev. 391-401

Pre- and Post-Textual Contexts in Assessment of a Message as Offensive or Defensive Aggression Verbalization. Liliya Komalova. 402-414

Boosting Rule-Based Grapheme-to-Phoneme Conversion with Morphological Segmentation and Syllabification in Bengali. Krishnendu Ghosh, Sandipan Mandal, Nilay Roy. 415-429

Revisiting Assessment of Text Complexity: Lexical and Syntactic Parameters Fluctuations. Alexandra Vahrusheva, Valery D. Solovyev, Marina Solnyshkina, Elzara Gafiyatova, Svetlana Akhtyamova. 430-441

Analysis of Natural Language Understanding Systems with L2 Learner Specific Synthetic Grammatical Errors Based on Parts-of-Speech. Snehal Ranjan, Sai Kalyan Nanduri, Prakul Virdi, Chiranjeevi Yarra. 442-454

On the Most Frequent Sequences of Words in Russian Spoken Everyday Language (Bigrams and Trigrams): An Experience of Classification. Maria Khokhlova, Olga Blinova, Natalia Bogdanova-Beglarian, Tatiana Y. Sherstinova. 455-466

Recognition of the Emotional State of Children by Video and Audio Modalities by Indian and Russian Experts. Elena E. Lyakso, Olga V. Frolova, Aleksandr Nikolaev, Egor Kleshnev, Platon Grave, Abylay Ilyas, Olesia Makhnytkina, Ruban Nersisson, A. Mary Mekala, M. Varalakshmi. 469-482

Effect of Linear Prediction Order to Modify Formant Locations for Children Speech Recognition. Udara Laxman Kumar, Mikko Kurimo, Hemant Kumar Kathania. 483-493

Gammatone-Filterbank Based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR. Syed Shahnawazuddin, Ankita, Avinash Kumar, Hemant Kumar Kathania. 494-505

System Assisted Vocal Response Analysis and Assessment of Autism in Children: A Machine Learning Based Approach. Soma Khan, Tulika Basu, Joyanta Basu, Madhab Pal, Rajib Roy. 506-519

Addressing Effects of Formant Dispersion and Pitch Sensitivity for the Development of Children's KWS System. Jayant Kumar Rout, Gayadhar Pradhan. 520-534

Emotional State of Children with ASD and Intellectual Disabilities: Perceptual Experiment and Automatic Recognition by Video, Audio and Text Modalities. Elena E. Lyakso, Olga V. Frolova, Aleksandr Nikolaev, Severin Grechanyi, Anton Matveev, Yuri Matveev, Olesia Makhnytkina, Ruban Nersisson. 535-549

Linear Frequency Residual Features for Infant Cry Classification. S. Uthiraa, Aastha Kachhi, Hemant A. Patil. 550-561

Identification of Voice Disorders: A Comparative Study of Machine Learning Algorithms. Sharal Coelho, Hosahalli Lakshmaiah Shashirekha. 565-578

Transfer Learning Using Whisper for Dysarthric Automatic Speech Recognition. Siddharth Rathod, Monil Charola, Hemant A. Patil. 579-589

Significance of Duration Modification in Reducing Listening Effort of Slurred Speech from Patients with Traumatic Brain Injury. Oindrila Banerjee, D. Govind 0001, Suryakanth V. Gangashetty, Akhilesh Kumar Dubey, Rajeev Aravindakshan, Sasikumar Panicker, K. Reshma. 590-600

Speech Signal Segmentation into Silence, Unvoiced and Vocalized Sections in Speech Rehabilitation. Dariya Novokhrestova, Evgeny Kostyuchenko, Ilya Krivoshein, Lidiya N. Balatskaya. 601-610

Respiratory Sickness Detection from Audio Recordings Using CLIP Models. Chandra Mohan Bhuma. 611-625

Investigating the Effect of Data Impurity on the Detection Performances of Mental Disorders Through Spoken Dialogues. Rohan Kumar Gupta, Rohit Sinha 0003. 626-637