Singing Voice Conversion Challenge 2023

We invite everyone to participate in the first Singing Voice Conversion Challenge (SVCC)!

How to Participate?

There is no fee for registration. Please register your team at the following page if you want to participate in the challenge.

Voice conversion (VC) refers to the digital cloning of a person's voice; it can be used to modify an audio waveform so that it appears to have been spoken by someone (the target) other than the original speaker (the source). The Voice Conversion Challenge (VCC) series aims to advance and compare different approaches to core VC technology using a common dataset, common metrics, and baseline systems provided by the organizers. Thanks to rapid progress in the essential modules of a VC system (including acoustic modeling and waveform synthesis), the top system in the latest VCC showed impressive performance, generating speech samples very close to the human voice in terms of naturalness and similarity. We feel it is time to move our focus from fundamental technologies to more sophisticated applications.

Therefore, we are pleased to announce the first Singing Voice Conversion Challenge (SVCC). Singing voice conversion (SVC), which extends the definition of ordinary VC, aims to convert the singing voice of a source singer to that of a target singer without changing the contents. The main applications of SVC lie in entertainment: new tools for virtual YouTubers, singing voice beautification in karaoke, or even singing aids for the disabled. SVC is considered more challenging than VC, as the singing voice is generally harder to model than speech, and data collection is more difficult. Moreover, while the music score is considered part of the contents that must not be changed during conversion, certain singing styles such as vibrato can be considered singer-dependent. Each of these prosody-related factors needs to be modeled properly. From the community's point of view, SVC lies at the intersection of speech processing and music processing. We hope to attract attention from researchers in both communities to facilitate interdisciplinary research.

The previous VCCs can be accessed below:

Tasks of this Challenge

The objective is singer conversion. We plan to prepare two tasks:

We focus on 24 kHz singing voice and signal-to-signal conversion strategies. No transcriptions will be provided for the test set, and the use of manual annotations is NOT allowed. Please note that, to facilitate reproducible research, any additional data used for training in this challenge must be publicly available. Please use only the datasets included in the curated list maintained by the organizers.
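Since the tasks target 24 kHz audio, it may be useful to confirm the sampling rate of training or converted files before using them. The following is a minimal sketch, assuming the audio is stored as PCM WAV files (the file format is an assumption, not something stated by the organizers); the function names and the `TARGET_RATE_HZ` constant are hypothetical and use only Python's standard `wave` module.

```python
import wave

# Sampling rate named in the task description (24 kHz).
TARGET_RATE_HZ = 24_000


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def find_wrong_rate(paths, target=TARGET_RATE_HZ):
    """Return (path, rate) pairs for files whose rate differs from the target."""
    mismatches = []
    for path in paths:
        rate = wav_sample_rate(path)
        if rate != target:
            mismatches.append((path, rate))
    return mismatches
```

Calling `find_wrong_rate` on a list of file paths returns an empty list when every file is at 24 kHz; any file it reports would need to be resampled with a tool of your choice.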

Please check the rules section for more detailed information.


The tentative schedule is as follows:

We are planning to hold a workshop, but the venue has not yet been decided. The tentative date is around August to September. Please stay tuned!

Baseline Systems

We provide baseline systems. Participants who are new to the singing voice conversion field are welcome to use the open-source starter kit for this challenge. We have prepared a few sets of converted samples generated using these baselines to help participants develop their systems.


Following previous VCCs, the main evaluation campaign will be a large-scale subjective evaluation in which recruited human listeners assess the quality of all submitted systems. We will evaluate the naturalness and similarity of the converted samples.
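For participants who want to run a small listening test of their own during development, the sketch below shows one common way to summarize listener ratings per system: a mean score with an approximate 95% confidence interval. It assumes numeric ratings (for example on a 1 to 5 scale) and a normal approximation; the rating scale, the function name `summarize_ratings`, and the system IDs in the usage note are all hypothetical and do not describe the organizers' actual scoring protocol.

```python
import math
import statistics
from collections import defaultdict


def summarize_ratings(ratings):
    """Map each system ID to (mean score, 95% CI half-width).

    `ratings` is an iterable of (system_id, score) pairs.
    """
    by_system = defaultdict(list)
    for system_id, score in ratings:
        by_system[system_id].append(score)

    summary = {}
    for system_id, scores in by_system.items():
        mean = statistics.fmean(scores)
        if len(scores) > 1:
            # Normal approximation: 1.96 standard errors on either side.
            half_width = 1.96 * statistics.stdev(scores) / math.sqrt(len(scores))
        else:
            # A single rating gives no estimate of spread.
            half_width = float("nan")
        summary[system_id] = (mean, half_width)
    return summary
```

For example, `summarize_ratings([("B01", 4), ("B01", 5), ("B01", 3)])` gives system `B01` a mean of 4.0. The same summary can be computed separately for naturalness ratings and for similarity ratings.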


Contact information: