Journal: IEEE Transactions on Audio, Speech & Language Processing

Volume 23, Issue 9

1389 -- 1420Lin-Shan Lee, James R. Glass, Hung-yi Lee, Chun-an Chan. Spoken Content Retrieval - Beyond Cascading Speech Recognition with Text Retrieval
1421 -- 1430Yishan Jiao, Visar Berisha, Ming Tu, Julie Liss. Convex Weighting Criteria for Speaking Rate Estimation
1431 -- 1444Jianjun He, Woon-Seng Gan, Ee-Leng Tan. Primary-Ambient Extraction Using Ambient Spectrum Estimation for Immersive Spatial Audio Reproduction
1445 -- 1456Qing Shen, Wei Liu, Wei Cui, Siliang Wu, Yimin D. Zhang, Moeness G. Amin. Low-Complexity Direction-of-Arrival Estimation Based on Wideband Co-Prime Arrays
1457 -- 1468Yu-Ren Chien, Hsin-Min Wang, Shyh-Kang Jeng. An Acoustic-Phonetic Model of F0 Likelihood for Vocal Melody Extraction
1469 -- 1477Xiaodong Cui, Vaibhava Goel, Brian Kingsbury. Data Augmentation for Deep Neural Network Acoustic Modeling
1478 -- 1492Enzo De Sena, Hüseyin Hacihabiboglu, Zoran Cvetkovic, Julius O. Smith III. Efficient Synthesis of Room Acoustics via Scattering Delay Networks
1493 -- 1508Lin Wang, Timo Gerkmann, Simon Doclo. Noise Power Spectral Density Estimation Using MaxNSR Blocking Matrix
1509 -- 1520Ante Jukic, Toon van Waterschoot, Timo Gerkmann, Simon Doclo. Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors
1521 -- 1532Pejman Mowlaee, Josef Kulmer. Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information

Volume 23, Issue 8

1249 -- 1258Hajar Momeni, Hamid Reza Abutalebi, AliAkbar Tadaion. Joint Detection and Estimation of Speech Spectral Amplitude Using Noncontinuous Gain Functions
1259 -- 1272Jen-Tzung Chien. Hierarchical Pitman-Yor-Dirichlet Language Model
1273 -- 1282Mehdi Fallahpour, David Megías. Audio Watermarking Based on Fibonacci Numbers
1283 -- 1294Pejman Mowlaee, Josef Kulmer. Phase Estimation in Single-Channel Speech Enhancement: Limits-Potential
1295 -- 1308Mohamed Morchid, Mohamed Bouallegue, Richard Dufour, Georges Linarès, Driss Matrouf, Renato de Mori. Compact Multiview Representation of Documents Based on the Total Variability Space
1309 -- 1321Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya. Optimal Coding of Generalized-Gaussian-Distributed Frequency Spectra for Low-Delay Audio Coder With Powered All-Pole Spectrum Estimation
1322 -- 1334Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Ea-Ee Jan, Wen-Lian Hsu, Hsin-Hsi Chen. Extractive Broadcast News Summarization Leveraging Recurrent Neural Network Language Modeling Techniques
1335 -- 1347Zbynek Koldovský, Jirí Málek, Sharon Gannot. Spatial Source Subtraction Based on Incomplete Measurements of Relative Transfer Function
1348 -- 1357Dimitrios Dimitriadis, Enrico Bocchieri. Use of Micro-Modulation Features in Large Vocabulary Continuous Speech Recognition Tasks
1358 -- 1367Xun Wang, Yasuhisa Yoshida, Tsutomu Hirao, Katsuhito Sudoh, Masaaki Nagata. Summarization Based on Task-Oriented Discourse Parsing
1368 -- 1380Carlos Spa, Anton Rey, Erwin Hernández. A GPU Implementation of an Explicit Compact FDTD Algorithm with a Digital Impedance Filter for Room Acoustics Applications

Volume 23, Issue 7

1105 -- 1117Matt McVicar, Satoru Fukayama, Masataka Goto. AutoGuitarTab: Computer-Aided Composition of Rhythm and Lead Guitar Parts in the Tablature Space
1118 -- 1129Maarten Van Segbroeck, Ruchir Travadi, Shrikanth S. Narayanan. Rapid Language Identification
1130 -- 1143Damián Marelli, Robert Baumgartner, Piotr Majdak. Efficient Approximation of Head-Related Transfer Functions in Subbands for Accurate Sound Localization
1144 -- 1159Ching-feng Yeh, Lin-Shan Lee. An Improved Framework for Recognizing Highly Imbalanced Bilingual Code-Switched Lectures with Cross-Language Acoustic Modeling and Frame-Level Language Identification
1160 -- 1171Dogac Basaran, Ali Taylan Cemgil, Emin Anarim. A Probabilistic Model-Based Approach for Aligning Multiple Audio Sequences
1172 -- 1183Dongpeng Chen, Brian Kan-Wing Mak. Multitask Learning of Deep Neural Networks for Low-Resource Speech Recognition
1184 -- 1197Thomas Meyer, Najeh Hajlaoui, Andrei Popescu-Belis. Disambiguating Discourse Connectives for Statistical Machine Translation
1198 -- 1208Ulpu Remes, Ana Ramirez Lopez, Kalle J. Palomäki, Mikko Kurimo. Bounded Conditional Mean Imputation with Observation Uncertainties and Acoustic Model Adaptation
1209 -- 1220Rui Wang, Hai Zhao, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita. Bilingual Continuous-Space Language Model Growing for Statistical Machine Translation
1221 -- 1232Tze Yuang Chong, Rafael E. Banchs, Engsiong Chng, Haizhou Li. Decoupling Word-Pair Distance and Co-occurrence Information for Effective Long History Context Language Modeling
1233 -- 1242Meng Sun, Yinan Li, Jort F. Gemmeke, Xiongwei Zhang. Speech Enhancement Under Low SNR Conditions Via Noise Estimation Using Sparse and Low-Rank NMF with Kullback-Leibler Divergence

Volume 23, Issue 6

957 -- 969Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu. Combining Relevance Language Modeling and Clarity Measure for Extractive Speech Summarization
970 -- 981Maciej Niedzwiecki, Marcin Ciolek, Krzysztof Cisowski. Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering
982 -- 992Kun Han, Yuxuan Wang, DeLiang Wang, William S. Woods, Ivo Merks, Tao Zhang. Learning Spectral Mapping for Speech Dereverberation and Denoising
993 -- 1005Peter Foster, Simon Dixon, Anssi Klapuri. Identifying Cover Songs Using Information-Theoretic Measures of Similarity
1006 -- 1018Andreas Schwarz, Walter Kellermann. Coherent-to-Diffuse Power Ratio Estimation for Dereverberation
1019 -- 1030Milos Cernak, Philip N. Garner, Alexandros Lazaridis, Petr Motlícek, Xingyu Na. Incremental Syllable-Context Phonetic Vocoding
1031 -- 1041Mickael Rouvier, Stanislas Oger, Georges Linarès, Driss Matrouf, Bernard Mérialdo, Yingbo Li. Audio-Based Video Genre Identification
1042 -- 1053Hirokazu Kameoka, Kota Yoshizato, Tatsuma Ishihara, Kento Kadowaki, Yasunori Ohishi, Kunio Kashino. Generative Modeling of Voice Fundamental Frequency Contours
1054 -- 1067Dejan Markovic, Fabio Antonacci, Augusto Sarti, Stefano Tubaro. Multiview Soundfield Imaging in the Projective Ray Space
1068 -- 1081Alice P. Bates, Zubair Khalid, Rodney A. Kennedy. Novel Sampling Scheme on the Sphere for Head-Related Transfer Function Measurements
1082 -- 1095Mao-shen Jia, Ziyu Yang, Changchun Bao, Xiguang Zheng, Christian Ritz. Encoding Multiple Audio Objects Using Intra-Object Sparsity

Volume 23, Issue 5

817 -- 827Florian Krebs, Andre Holzapfel, Ali Taylan Cemgil, Gerhard Widmer. Inferring Metrical Structure in Music Using Particle Filters
828 -- 839Janghoon Cho, Chang D. Yoo. Underdetermined Convolutive BSS: Bayes Risk Minimization Based on a Mixture of Super-Gaussian Posterior Approximation
840 -- 850Hao Mu, Woon-Seng Gan, Ee-Leng Tan. An Objective Analysis Method for Perceptual Quality of a Virtual Bass System
851 -- 862Richard C. Hendriks, Joao B. Crespo, Jesper Jensen, Cees H. Taal. Optimal Near-End Speech Intelligibility Improvement Incorporating Additive Noise and Late Reverberation Under an Approximation of the Short-Time SII
863 -- 876Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa. Learning Dynamic Stream Weights For Coupled-HMM-Based Audio-Visual Speech Recognition
877 -- 886Reuven Berkun, Israel Cohen, Jacob Benesty. Combined Beamformers for Robust Broadband Regularized Superdirective Beamforming
887 -- 897Jeroen Breebaart. Evaluation of Statistical Inference Tests Applied to Subjective Audio Quality Data With Small Sample Size
898 -- 908Miroslav Zivanovic. Harmonic Bandwidth Companding for Separation of Overlapping Harmonics in Pitched Signals
909 -- 922Jen-Tzung Chien. Laplace Group Sensing for Acoustic Models
923 -- 931Ying Wei, Yinfeng Wang. Design of Low Complexity Adjustable Filter Bank for Personalized Hearing Aid Solutions
932 -- 940Alfonso Perez Carrillo, Marcelo M. Wanderley. Indirect Acquisition of Violin Instrumental Controls from Audio Signal with Hidden Markov Models
941 -- 950André Mansikkaniemi, Mikko Kurimo. Adaptation of Morph-Based Speech Recognition for Foreign Names and Acronyms

Volume 23, Issue 4

605 -- 618Langzhou Chen, Norbert Braunschweiler, Mark J. F. Gales. Speaker and Expression Factorization for Audiobook Data: Expressiveness and Transplantation
619 -- 630Xinjie Zhou, Xiaojun Wan, Jianguo Xiao. CLOpinionMiner: Opinion Target Extraction in a Cross-Language Scenario
631 -- 642Pan Zhou, Hui Jiang 0001, Li-Rong Dai, Yu Hu, Qingfeng Liu. State-Clustering Based Multiple Deep Neural Networks Modeling Approach for Speech Recognition
643 -- 653Ying Hu, Guizhong Liu. Separation of Singing Voice Using Nonnegative Matrix Partial Co-Factorization for Singer Identification
654 -- 669Daichi Kitamura, Hiroshi Saruwatari, Hirokazu Kameoka, Yu Takahashi, Kazunobu Kondo, Satoshi Nakamura. Multichannel Signal Separation Combining Directional Clustering and Nonnegative Matrix Factorization with Spectrogram Restoration
670 -- 682Van-Khanh Mai, Dominique Pastor, Abdeldjalil Aïssa-El-Bey, Raphaël Le Bidan. Robust Estimation of Non-Stationary Noise Power Spectrum for Speech Enhancement
683 -- 693Eduardo Blanco 0002, Dan I. Moldovan. A Semantic Logic-Based Approach to Determine Textual Similarity
694 -- 704Myung Jong Kim, Younggwan Kim, Hoirin Kim. Automatic Intelligibility Assessment of Dysarthric Speech Using Phonologically-Structured Sparse Linear Model
705 -- 717G. Aneeja, B. Yegnanarayana. Single Frequency Filtering Approach for Discriminating Speech and Nonspeech
718 -- 731Antoine Deleforge, Radu Horaud, Yoav Y. Schechner, Laurent Girin. Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression
732 -- 745David Dov, Ronen Talmon, Israel Cohen. Audio-Visual Voice Activity Detection Using Diffusion Maps
746 -- 759Maryam Habibi, Andrei Popescu-Belis. Keyword Extraction and Clustering for Document Recommendation in Conversations
760 -- 773Nursadul Mamun, Wissam A. Jassim, Muhammad S. A. Zilany. Prediction of Speech Intelligibility Using a Neurogram Orthogonal Polynomial Measure (NOPM)
774 -- 786Enzo De Sena, Niccolo Antonello, Marc Moonen, Toon van Waterschoot. On the Modeling of Rectangular Geometries in Room Acoustic Simulations
787 -- 797Hao Huang, Haihua Xu, Xianhui Wang, Wushour Silamu. Maximum F1-Score Discriminative Training Criterion for Automatic Mispronunciation Detection
798 -- 806Chung-Che Wang, Jyh-Shing Roger Jang. Improving Query-by-Singing/Humming by Combining Melody and Lyric Information

Volume 23, Issue 3

427 -- 430Haizhou Li, Marcello Federico, Xiaodong He, Helen M. Meng, Isabel Trancoso. Introduction to the Special Section on Continuous Space and Related Methods in Natural Language Processing
431 -- 440Heike Adel, Ngoc Thang Vu, Katrin Kirchhoff, Dominic Telaar, Tanja Schultz. Syntactic and Semantic Features For Code-Switching Factored Language Models
441 -- 450Xiaodong Zeng, Derek F. Wong, Lidia S. Chao, Isabel Trancoso. Graph-Based Lexicon Regularization for PCFG With Latent Annotations
451 -- 460Wenliang Chen, Min Zhang 0005, Yue Zhang. Distributed Feature Representations for Dependency Parsing
461 -- 471Ruiji Fu, Jiang Guo, Bing Qin, Wanxiang Che, Haifeng Wang, Ting Liu. Learning Semantic Hierarchies: A Continuous Vector Space Approach
472 -- 482Rafael E. Banchs, Luis F. D'Haro, Haizhou Li. Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework
483 -- 493Deyi Xiong, Min Zhang, Xing Wang. Topic-Based Coherence Modeling for Statistical Machine Translation
494 -- 504Brian Hutchinson, Mari Ostendorf, Maryam Fazel. A Sparse Plus Low-Rank Exponential Language Model for Limited Resource Scenarios
505 -- 516Mohsen Rashwan, Ahmad A. Al Sallab, Hazem M. Raafat, Ahmed Rafea. Deep Learning Framework with Confused Sub-Set Resolution Architecture for Automatic Arabic Diacritization
517 -- 529Martin Sundermeyer, Hermann Ney, Ralf Schlüter. From Feedforward to Recurrent LSTM Neural Networks for Language Modeling
530 -- 539Grégoire Mesnil, Yann Dauphin, Kaisheng Yao, Yoshua Bengio, Li Deng, Dilek Z. Hakkani-Tür, Xiaodong He, Larry Heck, Gökhan Tür, Dong Yu, Geoffrey Zweig. Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding
540 -- 552Ian Vince McLoughlin, Haomin Zhang, Zhipeng Xie, Yan Song, Wei Xiao. Robust Sound Event Classification Using Deep Neural Networks
553 -- 563Dusan Zahoransky, Ivan Polásek. Text Search of Surnames in Some Slavic and Other Morphologically Rich Languages Using Rule Based Phonetic Algorithms
564 -- 579Yow-Bang Wang, Lin-Shan Lee. Supervised Detection and Unsupervised Discovery of Pronunciation Error Patterns for Computer-Assisted Language Learning
580 -- 587Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki. Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines
588 -- 599Nicolas Obin, Pierre Lanchantin. Symbolic Modeling of Prosody: From Linguistics to Statistics

Volume 23, Issue 2

227 -- 239Guang Hua, Jonathan Goh, Vrizlynn L. L. Thing. Time-Spread Echo-Based Audio Watermarking With Optimized Imperceptibility and Robustness
240 -- 251Ofer Schwartz, Sharon Gannot, Emanuel A. P. Habets. Multi-Microphone Speech Dereverberation and Noise Reduction Using Relative Early Transfer Functions
252 -- 263Emilio Molina, Lorenzo J. Tardón, Ana M. Barbancho, Isabel Barbancho. SiPTH: Singing Transcription Based on Hysteresis Defined on the Pitch-Time Curve
264 -- 277Haipeng Wang, Tan Lee, Cheung Chi Leung, Bin Ma, Haizhou Li. Acoustic Segment Modeling with Spectral Clustering Methods
278 -- 287Vipul Arora, Laxmidhar Behera. Multiple F0 Estimation and Source Clustering of Polyphonic Music Audio Using PLCA and HMRFs
288 -- 299Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya. Resolution Warped Spectral Representation for Low-Delay and Low-Bit-Rate Audio Coder
300 -- 312Chao Weng, Biing-Hwang Fred Juang. Discriminative Training Using Non-Uniform Criteria for Keyword Spotting on Spontaneous Speech
313 -- 326Yoichi Matsuyama, Akihiro Saito, Shinya Fujie, Tetsunori Kobayashi. Automatic Expressive Opinion Sentence Generation for Enjoyable Conversational Systems
327 -- 338Petko N. Petkov, W. Bastiaan Kleijn. Spectral Dynamics Recovery for Enhanced Speech Intelligibility in Noise
339 -- 350Ergun Biçici, Deniz Yuret. Optimizing Instance Selection for Statistical Machine Translation with Feature Decay Algorithms
351 -- 360Mengqiu Zhang, Rodney A. Kennedy, Thushara D. Abhayapala. Empirical Determination of Frequency Representation in Spherical Harmonics-Based HRTF Functional Modeling
361 -- 372Zu-ren Feng, Qing Zhou, Jun Zhang, Ping Jiang, Xue-Wen Yang. A Target Guided Subband Filter for Acoustic Event Detection in Noisy Environments Using Wavelet Packets
373 -- 382Naoki Hirayama, Koichiro Yoshino, Katsutoshi Itoyama, Shinsuke Mori, Hiroshi G. Okuno. Automatic Speech Recognition for Mixed Dialect Utterances by Mixing Dialect Language Models
383 -- 393Alexander Schasse, Timo Gerkmann, Rainer Martin, Wolfgang Sörgel, Thomas Pilgrim, Henning Puder. Two-Stage Filter-Bank System for Improved Single-Channel Noise Reduction in Hearing Aids
394 -- 406Boaz Schwartz, Sharon Gannot, Emanuel A. P. Habets. Online Speech Dereverberation Using Kalman Filter and EM Algorithm
407 -- 419Branislav Gerazov, Zoran A. Ivanovski. Kernel Power Flow Orientation Coefficients for Noise-Robust Speech Recognition

Volume 23, Issue 12

2111 -- 2124Wanxiang Che, Yanyan Zhao, Honglei Guo, Zhong Su, Ting Liu. Sentence Compression for Aspect-Based Sentiment Analysis
2125 -- 2135Jonathan Sheaffer, Maarten van Walstijn, Boaz Rafaely, Konrad Kowalczyk. Binaural Reproduction of Finite Difference Simulations Using Spherical Array Processing
2136 -- 2147Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis. Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation
2148 -- 2161Aaron Heidel, Hsiang-Hung Lu, Lin-Shan Lee. Finding Complex Features for Guest Language Fragment Recovery in Resource-Limited Code-Mixed Speech Recognition
2162 -- 2176Daniel Marquardt, Volker Hohmann, Simon Doclo. Interaural Coherence Preservation in Multi-Channel Wiener Filtering-Based Noise Reduction for Binaural Hearing Aids
2177 -- 2188Kai Yu, Kai Sun, Lu Chen, Su Zhu. Constrained Markov Bayesian Polynomial for Efficient Dialogue State Tracking
2189 -- 2197Craig A. Anderson, Paul D. Teal, Mark A. Poletti. Spatially Robust Far-field Beamforming Using the von Mises(-Fisher) Distribution
2198 -- 2208Jens Schröder, Stefan Goetze, Jörn Anemüller. Spectro-Temporal Gabor Filterbank Features for Acoustic Event Detection
2209 -- 2216Inseok Heo, William A. Sethares. Classification Based on Speech Rhythm via a Temporal Alignment of Spoken Sentences
2217 -- 2227Prasanga N. Samarasinghe, Thushara D. Abhayapala, Mark A. Poletti, Terence Betlehem. An Efficient Parameterization of the Room Transfer Function
2228 -- 2237Yong Xiang, Iynkaran Natgunanathan, Yue Rong, Song Guo. Spread Spectrum-Based High Embedding Capacity Watermarking Method for Audio Signals
2238 -- 2245In-Chul Yoo, Hyeontaek Lim, Dongsuk Yook. Formant-Based Robust Voice Activity Detection
2246 -- 2259Thomas Hueber, Laurent Girin, Xavier Alameda-Pineda, Gérard Bailly. Speaker-Adaptive Acoustic-Articulatory Inversion Using Cascaded Gaussian Mixture Regression
2260 -- 2271Hequn Bai, Gaël Richard, Laurent Daudet. Late Reverberation Synthesis: From Radiance Transfer to Feedback Delay Networks
2272 -- 2285Ilker Bayram. A Multichannel Audio Denoising Formulation Based on Spectral Sparsity
2286 -- 2297Héctor Delgado, Xavier Anguera, Corinne Fredouille, Javier Serrano. Fast Single- and Cross-Show Speaker Diarization Using Binary Key Speaker Modeling
2298 -- 2310Winston S. Percybrooks, Elliot Moore. A New HMM-Based Voice Conversion Methodology Evaluated on Monolingual and Cross-Lingual Conversion Tasks
2311 -- 2321Marwa Graja, Maher Jaoua, Lamia Hadrich Belguith. Statistical Framework with Knowledge Base Integration for Robust Speech Understanding of the Tunisian Dialect
2322 -- 2333Falco Strasser, Henning Puder. Adaptive Feedback Cancellation for Realistic Hearing Aid Applications
2334 -- 2342Yu Ting Yeung, Tan Lee, Cheung Chi Leung. Supervised Single-Microphone Multi-Talker Speech Separation with Conditional Random Fields
2343 -- 2355Wenyu Jin, W. Bastiaan Kleijn. Theory and Design of Multizone Soundfield Reproduction Using Sparse Methods
2356 -- 2370Xionghu Zhong, James R. Hopgood. A Time-Frequency Masking Based Random Finite Set Particle Filtering Method for Multiple Acoustic Source Detection and Tracking
2371 -- 2383Karthika Vijayan, K. Sri Rama Murty. Analysis of Phase Spectrum of Speech Signals Using Allpass Modeling
2384 -- 2397Daniel Marquardt, Elior Hadad, Sharon Gannot, Simon Doclo. Theoretical Analysis of Linearly Constrained Multi-Channel Wiener Filtering Algorithms for Combined Noise Reduction and Binaural Cue Preservation in Binaural Hearing Aids
2398 -- 2409Matthias Zöhrer, Robert Peharz, Franz Pernkopf. Representation Learning for Single-Channel Source Separation and Bandwidth Extension
2410 -- 2421Hao Fang, Mari Ostendorf, Peter Baumann, Janet B. Pierrehumbert. Exponential Language Modeling Using Morphological Features and Multi-Task Learning
2422 -- 2433Michael A. Carlin, Mounya Elhilali. A Framework for Speech Activity Detection Using Adaptive Auditory Receptive Fields
2434 -- 2448Shinya Saito, Kunio Oishi, Toshihiro Furukawa. Convolutive Blind Source Separation Using an Iterative Least-Squares Algorithm for Non-Orthogonal Approximate Joint Diagonalization
2449 -- 2464Elior Hadad, Daniel Marquardt, Simon Doclo, Sharon Gannot. Theoretical Analysis of Binaural Transfer Function MVDR Beamformers with Interference Cue Preservation Constraints
2465 -- 2473Guang Yang, Richard F. Lyon, Emmanuel M. Drakakis. Psychophysical Evaluation of An Ultra-Low Power, Analog Biomimetic Cochlear Implant Processor Filterbank Architecture With Across Channels AGC

Volume 23, Issue 11

1713 -- 1726Auxiliadora Sarmiento, Iván Durán-Díaz, Andrzej Cichocki, Sergio Cruces. A Contrast Function Based on Generalized Divergences for Solving the Permutation Problem in Convolved Speech Mixtures
1727 -- 1736Xiaojia Zhao, Yuxuan Wang, DeLiang Wang. Cochannel Speaker Identification in Anechoic and Reverberant Conditions
1737 -- 1749Liang-Yu Chen, Jyh-Shing Roger Jang. Automatic Pronunciation Scoring with Score Combination by Learning to Rank and Class-Normalized DP-Based Quantization
1750 -- 1761Duyu Tang, Bing Qin, Furu Wei, Li Dong, Ting Liu, Ming Zhou. A Joint Segmentation and Classification Framework for Sentence Level Sentiment Classification
1762 -- 1774Falk-Martin Hoffmann, Filippo Maria Fazi. Theoretical Study of Acoustic Circular Arrays With Tangential Pressure Gradient Sensors
1775 -- 1787Nathan Souviraà-Labastie, Anaik Olivero, Emmanuel Vincent, Frédéric Bimbot. Multi-Channel Audio Source Separation Using Multiple Deformed References
1788 -- 1799Deepak Baby, Tuomas Virtanen, Jort F. Gemmeke, Hugo Van Hamme. Coupled Dictionaries for Exemplar-Based Speech Enhancement and Automatic Speech Recognition
1800 -- 1811Md Tauhidul Islam, Celia Shahnaz, Wei-Ping Zhu, M. Omair Ahmad. Speech Enhancement Based on Student t Modeling of Teager Energy Operated Perceptual Wavelet Packet Coefficients and a Custom Thresholding Function
1812 -- 1823Quynh Ngoc Thi Do, Steven Bethard, Marie-Francine Moens. Domain Adaptation in Semantic Role Labeling Using a Neural Language Model and Linguistic Resources
1824 -- 1834Haricharan Aragonda, Chandra Sekhar Seelamantula. Demodulation of Narrowband Speech Spectrograms Using the Riesz Transform
1835 -- 1846Dung T. Tran, Emmanuel Vincent, Denis Jouvet. Nonparametric Uncertainty Estimation and Propagation for Noise Robust ASR
1847 -- 1857Mei Tu, Yu Zhou, Chengqing Zong. Exploring Diverse Features for Statistical Machine Translation Model Pruning
1858 -- 1868Greg Okopal, Scott Wisdom, Les Atlas. Speech Analysis With the Strong Uncorrelating Transform
1869 -- 1878Marcos F. Simón Gálvez, Stephen J. Elliott, Jordan Cheer. Time Domain Optimization of Filters Used in a Loudspeaker Array for Personal Audio
1879 -- 1891Mohammad Hadi Bokaei, Hossein Sameti, Yang Liu. Linear Discourse Segmentation of Multi-Party Meetings Based on Local and Global Information
1892 -- 1903Chung-Hsien Wu, Han-Ping Shen, Chun-Shan Hsu. Code-Switching Event Detection by Using a Latent Language Space Model and the Delta-Bayesian Information Criterion
1904 -- 1916Zhangli Chen, Volker Hohmann. Online Monaural Speech Enhancement Based on Periodicity Analysis and A Priori SNR Estimation
1917 -- 1925Saeed Sarreshtedari, Mohammad Ali Akhaee, Aliazam Abbasfar. A Watermarking Method for Digital Speech Self-Recovery
1926 -- 1937Niko Moritz, Jörn Anemüller, Birger Kollmeier. An Auditory Inspired Amplitude Modulation Filter Bank for Robust Feature Extraction in Automatic Speech Recognition
1938 -- 1949Yajie Miao, Hao Zhang, Florian Metze. Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors
1950 -- 1962Veronica Morfi, Gilles Degottex, Athanasios Mouchtaris. Speech Analysis and Synthesis with a Computationally Efficient Adaptive Harmonic Model
1963 -- 1972Jonathan William Dennis, Tran Huy Dat, Haizhou Li. Generalized Hough Transform for Speech Pattern Classification
1973 -- 1987Feng Deng, Changchun Bao, W. Bastiaan Kleijn. Sparse Hidden Markov Models for Speech Enhancement in Non-Stationary Noise Environments
1988 -- 2002Rishabh Ranjan, Woon-Seng Gan. Natural Listening over Headphones in Augmented Reality Using Adaptive Filtering Techniques
2003 -- 2014Ling-Hui Chen, Tuomo Raitio, Cassia Valentini-Botinhao, Zhen-Hua Ling, Junichi Yamagishi. A Deep Generative Architecture for Postfiltering in Statistical Parametric Speech Synthesis
2015 -- 2025Ho Seon Shin, Tim Fingscheidt, Hong-Goo Kang. A Priori SNR Estimation Using Air- and Bone-Conduction Microphones
2026 -- 2035Ji Wu, Miao Li, Chin-Hui Lee. A Probabilistic Framework for Representing Dialog Systems and Entropy-Based Dialog Management Through Dynamic Stochastic State Evolution
2036 -- 2045Sandro Cumani. Fast Scoring of Full Posterior PLDA Models
2046 -- 2058Vladimir Tourbabin, Boaz Rafaely. Direction of Arrival Estimation Using Microphone Array Processing for Moving Humanoid Robots
2059 -- 2069Y. J. Chu, S. C. Chan. A New Local Polynomial Modeling-Based Variable Forgetting Factor RLS Algorithm and Its Acoustic Applications
2070 -- 2080Fernando de-la-Calle-Silos, Francisco J. Valverde-Albacete, Ascensión Gallardo-Antolín, Carmen Peláez-Moreno. Morphologically Filtered Power-Normalized Cochleograms as Robust, Biologically Inspired Features for ASR
2081 -- 2092Tsutomu Hirao, Masaaki Nishino, Yasuhisa Yoshida, Jun Suzuki, Norihito Yasuda, Masaaki Nagata. Summarizing a Document by Trimming the Discourse Tree
2093 -- 2105Chao Pan, Jingdong Chen, Jacob Benesty. Theoretical Analysis of Differential Microphone Array Beamforming and an Improved Solution

Volume 23, Issue 10

1539 -- 1551Sakari Tervo, Archontis Politis. Direction of Arrival Estimation of Reflections from Room Impulse Responses Using a Spherical Microphone Array
1552 -- 1562Jia-Ching Wang, Yu-Hao Chin, Bo-Wei Chen, Chang Hong Lin, Chung-Hsien Wu. Speech Emotion Verification Using Emotion Variance Modeling and Discriminant Scale-Frequency Maps
1563 -- 1575Antonio Canclini, Paolo Bestagini, Fabio Antonacci, Marco Compagnoni, Augusto Sarti, Stefano Tubaro. A Robust and Low-Complexity Source Localization Algorithm for Asynchronous Distributed Microphone Networks
1576 -- 1588Jianjun He, Woon-Seng Gan, Ee-Leng Tan. Time-Shifting Based Primary-Ambient Extraction for Spatial Audio Reproduction
1589 -- 1599Pratik Shah, Ian Lewis, Steven L. Grant, Sylvain Angrignon. Nonlinear Acoustic Echo Cancellation Using Voltage and Current Feedback
1600 -- 1612Li Su, Yi-Hsuan Yang. Combining Spectral and Temporal Representations for Multipitch Estimation of Polyphonic Music
1613 -- 1622Toyota Fujioka, Yoshifumi Nagata, Masato Abe. High-Precision Harmonic Distortion Level Measurement of a Loudspeaker Using Adaptive Filters in a Noisy Environment
1623 -- 1636Tsz-Kin Hon, Lin Wang, Joshua D. Reiss, Andrea Cavallaro. Audio Fingerprinting for Multi-Device Self-Localization
1637 -- 1647Ye Tian, Zhe Chen, Fuliang Yin. Distributed IMM-Unscented Kalman Filter for Speaker Tracking in Microphone Array Networks
1648 -- 1659Na Li, Man-Wai Mak. SNR-Invariant PLDA Modeling in Nonparametric Subspace for Robust Speaker Verification
1660 -- 1669Juha Vilkamo, Symeon Delikaris-Manias. Perceptual Reproduction of Spatial Sound Using Loudspeaker-Signal-Domain Parametrization
1670 -- 1679Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo. Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition
1680 -- 1691Marco Ruhland, Jörg Bitzer, Matthias Brandt, Stefan Goetze. Reduction of Gaussian, Supergaussian, and Impulsive Noise by Interpolation of the Binary Mask Residual
1692 -- 1703Yuval Dorfan, Sharon Gannot. Tree-Based Recursive Expectation-Maximization Algorithm for Localization of Acoustic Sources

Volume 23, Issue 1

7 -- 19Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee. A Regression Approach to Speech Enhancement Based on Deep Neural Networks
20 -- 31Huy Phan, Marco Maaß, Radoslaw Mazur, Alfred Mertins. Random Regression Forests for Acoustic Event Detection and Classification
32 -- 45Yuntao Wu, Amir Leshem, Jesper Rindom Jensen, Guisheng Liao. Joint Pitch and DOA Estimation Using the ESPRIT Method
46 -- 56Remi Decorsiere, Peter L. Sondergaard, Ewen N. MacDonald, Torsten Dau. Inversion of Auditory Spectrograms, Traditional Spectrograms, and Other Envelope Representations
57 -- 68Johann Poignant, Laurent Besacier, Georges Quénot. Unsupervised Speaker Identification in TV Broadcast Based on Written Names
69 -- 79Renjie Tong, Yingyue Zhou, Long Zhang, Guangzhao Bao, Zhongfu Ye. A Robust Time-Frequency Decomposition Model for Suppression of Mixed Gaussian-Impulse Noise in Audio Signals
80 -- 91Soodeh Ahani, Shahrokh Ghaemmaghami, Z. Jane Wang. A Sparse Representation-Based Wavelet Domain Speech Steganography Method
92 -- 101Arun Narayanan, DeLiang Wang. Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training
102 -- 114Rongfeng Su, Xunying Liu, Lan Wang. Automatic Complexity Control of Generalized Variable Parameter HMMs for Noise Robust Speech Recognition
115 -- 126Zixing Zhang, Eduardo Coutinho, Jun Deng, Björn Schuller. Cooperative Learning and its Application to Emotion Recognition from Speech
127 -- 141Pei-hao Su, Chuan-Hsun Wu, Lin-Shan Lee. A Recursive Dialogue Game for Personalized Computer-Aided Pronunciation Training
142 -- 153Alain Rakotomamonjy, Gilles Gasso. Histogram of Gradients of Time-Frequency Representations for Audio Scene Classification
154 -- 161Soudeh A. Khoubrouy, Issa M. S. Panahi, John H. L. Hansen. Howling Detection in Hearing Aids Based on Generalized Teager-Kaiser Operator
162 -- 173Jens Brehm Bagger Nielsen, Jakob Nielsen, Jan Larsen. Perception-Based Personalization of Hearing Aids Using Gaussian Processes and Active Learning
174 -- 185Jesper Rindom Jensen, Mads Græsbøll Christensen, Jacob Benesty, Søren Holdt Jensen. Joint Spatio-Temporal Filtering Methods for DOA and Fundamental Frequency Estimation
186 -- 197Jesper Jensen, Zheng-Hua Tan. Minimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features - A Theoretically Consistent Approach
198 -- 211Carlos D. Martínez-Hinarejos, José-Miguel Benedí, Vicent Tamarit. Unsegmented Dialogue Act Annotation and Decoding With N-Gram Transducers
212 -- 221Lin Wang, Zhe Chen, Fuliang Yin. A Novel Hierarchical Decomposition Vector Quantization Method for High-Order LPC Parameters