Improving Singing Voice Separation using Attribute-Aware Deep Network

Swaminathan, Rupak Vignesh, Alexander Lerch. Improving Singing Voice Separation using Attribute-Aware Deep Network. In Proceedings of the International Workshop on Multilayer Music Representation and Processing (MMRP). Milan, Italy, 2019.

Abstract

Singing Voice Separation (SVS) attempts to separate the predominant singing voice from a polyphonic musical mixture. In this paper, we investigate the effect of introducing attribute-specific information, namely frame-level vocal activity information, as an augmented feature input to a Deep Neural Network performing the separation. Our study considers two types of inputs, i.e., a ground-truth-based "oracle" input and labels extracted by a state-of-the-art model for singing voice activity detection in polyphonic music. We show that the separation network informed of vocal activity learns to differentiate between vocal and non-vocal regions. Such a network thus reduces interference and artifacts better than a network agnostic to this side information. Results on the MIR1K dataset show that informing the separation network of vocal activity improves the separation results consistently across all the measures used to evaluate the separation quality.
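
The abstract does not specify how the vocal activity information is combined with the network input. One plausible reading of "augmented feature input" is sketched below, assuming the frame-level label is simply appended as one extra feature to each spectrogram frame. The function name `augment_with_vocal_activity`, the array shapes, and the use of a magnitude spectrogram are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def augment_with_vocal_activity(mag_spec, vocal_activity):
    """Append a per-frame vocal-activity value to each spectrogram frame.

    Hypothetical helper, not the authors' implementation.
    mag_spec:       (n_frames, n_bins) magnitude spectrogram of the mixture.
    vocal_activity: (n_frames,) values in [0, 1]; 1 means singing is present.
                    These could be oracle labels or detector outputs.
    Returns an array of shape (n_frames, n_bins + 1).
    """
    mag_spec = np.asarray(mag_spec, dtype=np.float32)
    vocal_activity = np.asarray(vocal_activity, dtype=np.float32)
    if mag_spec.shape[0] != vocal_activity.shape[0]:
        raise ValueError("one activity value is needed per frame")
    # Turn the label vector into a column and concatenate along the feature axis.
    return np.concatenate([mag_spec, vocal_activity[:, None]], axis=1)


# Toy example: 4 frames with 513 frequency bins (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
mix = rng.random((4, 513)).astype(np.float32)
labels = np.array([0, 1, 1, 0])
features = augment_with_vocal_activity(mix, labels)
print(features.shape)  # (4, 514)
```

Under this assumption, the separation network receives the same input in the oracle and the detector-based settings, and only the source of `vocal_activity` changes.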