mrfakename
posted an update 3 days ago
Trained a model for emotion-controllable TTS, based on MiMo Audio, on LAION's voice acting dataset.

It's still very early in the training run and the model does have an issue with hallucinating, but results seem pretty good so far.

Will probably kick off a new run later with some settings tweaked.

Put up a demo here: mrfakename/EmoAct-MiMo

(Turn 🔊 on to hear audio samples)

wait how did you do that 🤯

Fine-tuned MiMo Audio to accept text/emotion captions (e.g. "intense fury, rage, hate") as input by training a LoRA for 1k steps on LAION's voice acting dataset.

Thanks to HF for the GPUs to train 🤗
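
For anyone wondering what "trained a LoRA" means mechanically, here is a rough sketch of the general idea in plain Python. It is not the training code for this model: the matrix sizes, the rank, and the `alpha` value are made up for illustration, and the helper names (`lora_effective_weight`, `lora_param_count`) are hypothetical. The base weight `W` stays frozen and only two small matrices `A` and `B` are trained, so the layer effectively uses `W + (alpha / r) * B @ A`.

```python
# Minimal LoRA illustration in plain Python (no ML libraries).
# All sizes and values are made up for illustration; they are NOT the
# settings used for the model in this post.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank.

    w: frozen base weight, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[wv + scale * dv for wv, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def lora_param_count(d_out, d_in, r):
    """Number of trainable parameters LoRA adds for one weight matrix."""
    return r * (d_in + d_out)

w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # frozen base weight (2 x 3)
a = [[0.5, -0.5, 1.0]]                   # rank r = 1
b_init = [[0.0], [0.0]]                  # B is commonly initialised to zero

# With B at zero, the adapted layer starts out identical to the base layer.
assert lora_effective_weight(w, a, b_init, alpha=2.0) == w
```

The appeal is the parameter count: for a hypothetical 4096 x 4096 weight matrix, `lora_param_count(4096, 4096, 8)` gives 65,536 trainable values versus 16,777,216 for the full matrix, which is why a short run like 1k steps can be practical.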

Whaaaaa damn that's really good!