MelGAN Audio Samples

Audio samples from MelGAN vocoder

Disclaimer: This is a third-party implementation.

Implementation GitHub repo: https://github.com/seungwonpark/melgan

The repository contains a pretrained model, trained on LJSpeech-1.1, that is compatible with NVIDIA/tacotron2.

Audio samples from original authors: https://melgan-neurips.github.io/
Official implementation: https://github.com/descriptinc/melgan-neurips

In summary, MelGAN can convert mel-spectrograms into raw audio in real time on a CPU, and it generalizes to unseen speakers while using significantly fewer parameters than WaveGlow, the previous state of the art.
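To make "real time" concrete, one common measure is the real-time factor: synthesis time divided by the duration of the generated audio, where a value below 1.0 means the vocoder keeps up with playback. The helper below is only an illustrative sketch; the function name and the example numbers are assumptions, not measurements from this implementation. The 22050 Hz value is the LJSpeech sampling rate.

```python
def real_time_factor(num_samples: int, sample_rate: int, synthesis_seconds: float) -> float:
    """Return synthesis time divided by audio duration (lower is faster)."""
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


# Hypothetical numbers: 2 seconds of 22050 Hz audio generated in 1 second.
rtf = real_time_factor(num_samples=44100, sample_rate=22050, synthesis_seconds=1.0)
print(f"real-time factor: {rtf:.2f}")
```

In practice, synthesis_seconds would come from timing the generator's forward pass on the target CPU.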

LJSpeech-1.1 (Updated 2019.12.02)

None of the audio clips below were seen during training. We split LJSpeech-1.1 9:1 into training and validation sets. (Files matching "*5.wav" are used for validation.)
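The split rule above can be sketched in a few lines. This is a minimal illustration of the stated "*5.wav" convention, not the repository's actual preprocessing code; the function names and the short file list are hypothetical.

```python
def is_validation_file(filename: str) -> bool:
    """A file belongs to the validation set if its name ends in '5.wav'."""
    return filename.endswith("5.wav")


def split_filenames(filenames):
    """Partition filenames into (train, validation) lists, preserving order."""
    train = [f for f in filenames if not is_validation_file(f)]
    valid = [f for f in filenames if is_validation_file(f)]
    return train, valid


# Hypothetical subset of LJSpeech-1.1 file names, for illustration only.
names = [
    "LJ001-0005.wav",
    "LJ001-0006.wav",
    "LJ001-0015.wav",
    "LJ014-0285.wav",
    "LJ014-0286.wav",
]
train, valid = split_filenames(names)
print("train:", train)
print("validation:", valid)
```

Because roughly one in ten sequentially numbered files ends in the digit 5, this rule gives approximately the 9:1 ratio, and all three sample files listed on this page fall in the validation set.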

Epochs	LJ001-0005.wav	LJ001-0015.wav	LJ014-0285.wav
Original
Epoch 400
Epoch 800
Epoch 1600
Epoch 3200
Epoch 6400

Full details are in the GitHub repository's README. Thank you!

Implementation authors: Seungwon Park, Myunchul Joe @ MINDsLab | Rishikesh @ DeepSync Technologies Pvt Ltd.