Speech and Computer - 24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022, Proceedings

S. R. Mahadeva Prasanna, Alexey Karpov 0001, K. Samudravijaya, Shyam S. Agrawal, editors, Speech and Computer - 24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022, Proceedings. Volume 13721 of Lecture Notes in Computer Science, Springer, 2022.

Conference: specom2022

Thematic Diversity of Everyday Russian Discourse: A Case Study Based on the ORD Corpus. Eleonora Akinshina, Tatiana Y. Sherstinova. 1-9

Neural Embedding Extractors for Text-Independent Speaker Verification. Jahangir Alam, Woo Hyun Kang, Abderrahim Fathan. 10-23

Deep Speaker Embeddings Based Online Diarization. Anastasia Avdeeva, Sergey Novoselov. 24-32

Overlapped Speech Detection Using AM-FM Based Time-Frequency Representations. Shikha Baghel, S. R. M. Prasanna, Prithwijit Guha. 33-43

Significance of Dimensionality Reduction in CNN-Based Vowel Classification from Imagined Speech Using Electroencephalogram Signals. Oindrila Banerjee, D. Govind, Akhilesh Kumar Dubey, Suryakanth V. Gangashetty. 44-55

Study of Speech Recognition System Based on Transformer and Connectionist Temporal Classification Models for Low Resource Language. Shweta Bansal, Shambhu Sharan, Shyam S. Agrawal. 56-63

An Initial Study on Birdsong Re-synthesis Using Neural Vocoders. Rhythm Bhatia, Tomi H. Kinnunen. 64-74

Speech Music Overlap Detection Using Spectral Peak Evolutions. Mrinmoy Bhattacharjee, S. R. Mahadeva Prasanna, Prithwijit Guha. 75-86

Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English. Joyshree Chakraborty, Rohit Sinha 0003, Priyankoo Sarmah. 87-98

ClusterVote: Automatic Summarization Dataset Construction with Document Clusters. Daniil Chernyshev, Boris V. Dobrov. 99-113

Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples. Shanatip Choosaksakunwiboon, Karla Pizzi, Ching-yu Kao. 114-127

Celtic English Continuum in Pitch Patterns of Spontaneous Talk: Evidence of Long-Term Contacts. Maria Chubarova, Tatiana Shevchenko. 128-138

Coherence Based Automatic Essay Scoring Using Sentence Embedding and Recurrent Neural Networks. Dadi Ramesh, Suresh Kumar Sanampudi. 139-154

Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore vs BLEU Score. Goutam Datta, Nisheeth Joshi, Kusum Gupta. 155-162

DyCoDa: A Multi-modal Data Collection of Multi-user Remote Survival Game Recordings. Denis Dresvyanskiy, Yamini Sinha, Matthias Busch, Ingo Siegert, Alexey Karpov 0001, Wolfgang Minker. 163-177

On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection. José Vicente Egas López, Róbert Busa-Fekete, Gábor Gosztolya. 178-187

Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection. Abderrahim Fathan, Jahangir Alam, Woo Hyun Kang. 188-200

Automatic Rhythm and Speech Rate Analysis of Mising Spontaneous Speech. Parismita Gogoi, Priyankoo Sarmah, S. R. M. Prasanna. 201-213

An Electroglottographic Method for Assessing the Emotional State of the Speaker. Aleksey Grigorev, Anna V. Kurazhova, Egor Kleshnev, Aleksandr Nikolaev, Olga V. Frolova, Elena E. Lyakso. 214-225

Significance of Distance on Pop Noise for Voice Liveness Detection. Priyanka Gupta, Hemant A. Patil. 226-237

CRIM's Speech Recognition System for OpenASR21 Evaluation with Conformer and Voice Activity Detector Embeddings. Vishwa Gupta, Gilles Boulianne. 238-251

Joint Changes in First and Second Formants of /a/, /i/, /u/ Vowels in Babble Noise - a New Statistical Approach. Alisa P. Gvozdeva, Alexander M. Lunichkin, Larisa G. Zaytseva, Elena A. Ogorodnikova, Irina G. Andreeva. 252-264

Comparing NLP Solutions for the Disambiguation of French Heterophonic Homographs for End-to-End TTS Systems. Maria-Loulou Hajj, Martin Lenglet, Olivier Perrotin, Gérard Bailly. 265-278

Detection of Speech Related Disorders by Pre-trained Embedding Models Extracted Biomarkers. Attila Zoltán Jenei, Gábor Kiss, Dávid Sztahó. 279-289

Multi-label Dysfluency Classification. Melanie Jouaiti, Kerstin Dautenhahn. 290-301

Harnessing Uncertainty - Multi-label Dysfluency Classification with Uncertain Labels. Melanie Jouaiti, Kerstin Dautenhahn. 302-311

Continuous Wavelet Transform for Severity-Level Classification of Dysarthria. Aastha Kachhi, Anand Therattil, Priyanka Gupta, Hemant A. Patil. 312-324

Significance of Energy Features for Severity Classification of Dysarthria. Aastha Kachhi, Anand Therattil, Ankur T. Patil, Hardik B. Sailor, Hemant A. Patil. 325-337

An Analytic Study on Clustering-Based Pseudo-labels for Self-supervised Deep Speaker Verification. Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan. 338-348

Investigation of Transfer Learning for End-to-End Russian Speech Recognition. Irina S. Kipyatkova. 349-357

Prosodic Features of Verbal Irony in Russian and French: Universal vs. Language-Specific. Uliana Kochetkova, Pavel A. Skrelin, Rada German, Daria Novoselova. 358-371

Categorization of Threatening Speech Acts. Liliya Komalova, Lyubov Kalyuzhnaya. 372-381

Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem. Evgeny Kostyuchenko, Ivan Rakhmanenko, Lidiya N. Balatskaya. 382-390

Multi-level Fusion of Fisher Vector Encoded BERT and Wav2vec 2.0 Embeddings for Native Language Identification. Dani Krebbers, Heysem Kaya, Alexey Karpov 0001. 391-403

Fake Speech Detection Using OpenSMILE Features. Devesh Kumar, Pavan Kumar V. Patil, Ayush Agarwal, S. R. Mahadeva Prasanna. 404-415

Nonverbal Constituents of Argumentative Discourse: Gesture and Prosody Interaction. Anna Leonteva, Tatiana Sokoreva. 416-425

Classifying Mahout and Social Interactions of Asian Elephants Based on Trumpet Calls. Seema Lokhandwala, Priyankoo Sarmah, Rohit Sinha 0003. 426-437

Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. Elena E. Lyakso, Olga V. Frolova, Anton Matveev, Yuri Matveev, Aleksey Grigorev, Olesia Makhnytkina, Nersisson Ruban. 438-450

Fake Speech Detection Using Modulation Spectrogram. Raghav Magazine, Ayush Agarwal, Anand Hedge, S. R. Mahadeva Prasanna. 451-463

Self-Configuring Genetic Programming Feature Generation in Affect Recognition Tasks. Danila Mamontov, Wolfgang Minker, Alexey Karpov 0001. 464-476

A Multi-modal Approach to Mining Intent from Code-Mixed Hindi-English Calls in the Hyperlocal-Delivery Domain. Jose Mathew 0003, Pranjal Sahu, Bhavuk Singhal, Aniket Joshi, Krishna Reddy Medikonda, Jairaj Sathyanarayana. 477-493

Importance of Supra-Segmental Information and Self-Supervised Framework for Spoken Language Diarization Task. Jagabandhu Mishra, S. R. Mahadeva Prasanna. 494-507

Low-Resource Emotional Speech Synthesis: Transfer Learning and Data Requirements. Anton Nesterenko, Ruslan Akhmerov, Yulia Matveeva, Anna Goremykina, Dmitry Astankov, Evgeniy Shuranov, Alexandra Shirshova. 508-521

Fuzzy Classifier for Speech Assessment in Speech Rehabilitation. Dariya Novokhrestova, Ilya A. Hodashinsky, Evgeny Kostyuchenko, Konstantin S. Sarin, Marina Bardamova. 522-532

Analysis-By-Synthesis Modeling of Bengali Intonation. Moumita Pakrashi, Shakuntala Mahanta. 533-544

Neural Network Based Curve Fitting to Enhance the Intelligibility of Dysarthric Speech. K. S. Pavithra, H. M. Chandrashekar, Veena Karjigi. 545-553

Personalizing Retrieval-Based Dialogue Agents. Pavel Posokhov, Anastasia Matveeva, Olesia Makhnytkina, Anton Matveev, Yuri Matveev. 554-566

Forensic Identification of Foreign-Language Speakers by the Method of Structural-Melodic Analysis of Phonograms. Rodmonga Potapova, Vsevolod Potapov, Irina Kuryanova. 567-578

Logistics Translator. Concept Vision on Future Interlanguage Computer Assisted Translation. Rodmonga Potapova, Vsevolod Potapov, Oleg Kuzmin. 579-589

Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification. Aditya Pusuluri, Aastha Kachhi, Hemant A. Patil. 590-603

Should We Believe Our Eyes or Our Ears? Processing Incongruent Audiovisual Stimuli by Russian Listeners. Elena I. Riekhakaynen, Elena Zatevalova. 604-615

Emotional Speech Recognition Based on Lip-Reading. Elena Ryumina, Denis Ivanko. 616-625

Exploring the Use of Machine Learning for Resume Recommendations. Anna Shestakova, Andrea Corradini 0002. 626-640

The Role of Pause in Interaction: A Case of Polylogue. Tatiana Sokoreva, Tatiana Shevchenko. 641-650

Dictionary with the Evaluation of Positivity/Negativity Degree of the Russian Words. Valery D. Solovyev, Musa Islamov, Venera Bayrasheva. 651-664

Effects of Depth of Field on Focus Using a Virtual Reality Escape Room. Nikolaos Tsiftsis, Konstantinos Moustakas, Nikolaos D. Fakotakis. 665-675

Dynamics of Frequency Characteristics of Visually Evoked Potentials of Electroencephalography During the Work with Brain-Computer Interfaces. Yaroslav Turovsky, Daniyar Wolf, Roman V. Meshcheryakov, Anastasia Iskhakova. 676-687

Device Robust Acoustic Scene Classification Using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network. Spoorthy Venkatesh, Shashidhar G. Koolagudi. 688-699

Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology. Zhandos Yessenbayev, Zhanibek Kozhirbayev. 700-711

Low-Cost Training of Speech Recognition System for Hindi ASR Challenge 2022. Alexander Zatvornitskiy. 712-718
