XTTS v2 GGUF - Memory-Efficient TTS for Mobile

πŸš€ EXPERIMENTAL: GGUF format XTTS v2 with C++ inference engine for ultra-low memory usage on mobile devices.

⚠️ NOTE: This is a proof-of-concept. GGUF files require the included C++ inference engine to run.

🎯 Key Features

  • Memory-Mapped Loading: Only loads needed parts into RAM
  • Multiple Quantizations: Q4 (290MB), Q8 (580MB), F16 (1.16GB)
  • Low RAM Usage: 90-350MB vs 1.5-2.5GB for PyTorch
  • Fast Loading: <1 second vs 15-20 seconds
  • React Native Ready: Full mobile integration

πŸ“Š Model Variants

Variant Size RAM (mmap) Quality Best For
q4_k 290MB ~90MB Good Low-end devices
q8 580MB ~180MB Very Good Mid-range devices
f16 1.16GB ~350MB Excellent High-end devices

πŸš€ Quick Start

React Native

import XTTS from '@genmedlabs/xtts-gguf';

// Initialize (downloads model automatically)
await XTTS.initialize();

// Generate speech
const audio = await XTTS.speak("Hello world!", {
  language: 'en'
});

C++

#include "xtts_inference.h"

auto model = std::make_unique<xtts::XTTSInference>();
model->load_model("xtts_v2_q4_k.gguf", true);
auto audio = model->generate("Hello world!", xtts::LANG_EN);

πŸ“¦ Repository Structure

gguf/
β”œβ”€β”€ xtts_v2_q4_k.gguf   # 4-bit quantized model
β”œβ”€β”€ xtts_v2_q8.gguf     # 8-bit quantized model
β”œβ”€β”€ xtts_v2_f16.gguf    # 16-bit half precision
└── manifest.json       # Model metadata

cpp/
β”œβ”€β”€ xtts_inference.h    # C++ header
β”œβ”€β”€ xtts_inference.cpp  # Implementation
└── CMakeLists.txt      # Build configuration

react-native/
β”œβ”€β”€ XTTSModule.cpp      # Native module
└── XTTSModule.ts       # TypeScript interface

πŸ”§ Implementation Status

Completed βœ…

  • GGUF format export
  • C++ engine structure
  • React Native bridge
  • Memory-mapped loading

In Progress 🚧

  • Full transformer implementation
  • Hardware acceleration
  • Voice cloning support

TODO πŸ“‹

  • Production optimizations
  • Comprehensive testing
  • WebAssembly support

πŸ“„ License

Apache 2.0

πŸ™ Credits

Based on XTTS v2 by Coqui AI. Uses GGML library for efficient inference.


See full documentation in the repository for detailed usage and build instructions.

Downloads last month
182
GGUF
Hardware compatibility
Log In to view the estimation

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support