Singing Voice Extraction from Stereophonic Recordings

Stratis Sofianos

Research output: Thesis › Doctoral Thesis


Singing voice separation (SVS) can be defined as the process of extracting the
vocal element from a given song recording. The impetus for research in this
area is mainly that of facilitating certain important applications of music
information retrieval (MIR) such as lyrics recognition, singer identification,
and melody extraction.
To date, the research in the field of SVS has been relatively limited, and mainly
focused on the extraction of vocals from monophonic sources. The general
approach in this scenario has been one of considering SVS as a blind source
separation (BSS) problem. Given the inherent diversity of music, such an
approach is motivated by the quest for a generic solution. However, it does not
allow the exploitation of prior information regarding the way in which
commercial music is produced.
To this end, investigations are conducted into effective methods for
unsupervised separation of singing voice from stereophonic studio recordings.
The work involves an extensive literature review of existing methods that relate
to SVS, as well as commercial approaches. Following the identification of
shortcomings of the conventional methods, two novel approaches are
developed for the purpose of SVS. These approaches, termed SEMANICS and
SEMANTICS, draw their motivation from statistical as well as spectral
properties of the target signal and focus on the separation of voice in the
frequency domain. In addition, a third method, named Hybrid SEMANTICS, is
introduced that addresses time‐, as well as frequency‐domain separation.
As there is a lack of a concrete standardised music database that includes a large
number of songs, a dataset is created using conventional stereophonic mixing
methods. Using this database, and based on widely adopted objective metrics,
the effectiveness of the proposed methods has been evaluated through
thorough experimental investigations.
Original language: English
Awarding Institution
  • University of Hertfordshire
Publication status: Published - 25 Sept 2012

