
Voice conversion

Definition

Transforming the timbre and perceived identity of one speaker's voice so that it sounds like another speaker, while preserving the linguistic content (the words spoken). Used in voice-cloning attacks to impersonate a target using only a short reference recording of the target's voice.
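
As a rough illustration of "changing the speaker while keeping the content", the sketch below applies a classical pitch-range mapping: the source speaker's pitch contour is normalized in the log domain and rescaled to the target speaker's mean and spread. This is only an assumption-laden toy. It touches pitch alone, not timbre, and modern voice-conversion systems typically rely on neural models instead. The function names and the pitch values are hypothetical and chosen for illustration.

```python
import math
import statistics


def log_f0_stats(f0_hz):
    """Return (mean, std) of log-F0 over voiced frames (F0 > 0)."""
    logs = [math.log(f) for f in f0_hz if f > 0]
    return statistics.fmean(logs), statistics.pstdev(logs)


def convert_f0(source_f0, source_stats, target_stats):
    """Map a source pitch contour onto the target speaker's pitch range.

    The shape of the contour (the intonation, which carries part of the
    linguistic content) is kept; only its mean and spread change.
    Unvoiced frames (F0 == 0) are passed through as 0.0.
    """
    src_mean, src_std = source_stats
    tgt_mean, tgt_std = target_stats
    converted = []
    for f in source_f0:
        if f <= 0:
            converted.append(0.0)
            continue
        z = (math.log(f) - src_mean) / src_std  # normalize in log domain
        converted.append(math.exp(z * tgt_std + tgt_mean))
    return converted


# Hypothetical pitch contours in Hz (0.0 marks an unvoiced frame).
source = [110.0, 120.0, 0.0, 130.0, 115.0]
target_ref = [200.0, 220.0, 240.0, 0.0, 210.0]

converted = convert_f0(source, log_f0_stats(source), log_f0_stats(target_ref))
```

After conversion, the contour rises and falls in the same places as the source but sits in the target speaker's pitch range, which is the same content-versus-identity split the definition describes.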

Related terms

Diffusion model
A generative neural network architecture (Ho et al., 2020; Stable Diffusion, Rombach et al., 2022) that learns to reverse a noise-addition process...
Encoder-decoder
A neural architecture where an encoder compresses an input into a compact latent representation and a decoder reconstructs an output image from...
GAN
Generative Adversarial Network. A framework with two neural networks, a generator that creates synthetic data and a discriminator that tries to distinguish...
Latent space
The compressed, lower-dimensional representation of data learned by a neural network's internal layers. Generative models sample from or navigate this space to...
NeRF (Neural Radiance Field)
A neural representation that encodes a 3-D scene as a continuous volumetric function, allowing novel viewpoints to be rendered. In talking-head systems,...

