# XTTS v2 GGUF - Memory-Efficient TTS for Mobile
**EXPERIMENTAL:** GGUF-format XTTS v2 with a C++ inference engine for ultra-low memory usage on mobile devices.

> ⚠️ **NOTE:** This is a proof of concept. The GGUF files require the included C++ inference engine to run.
## Key Features
- Memory-Mapped Loading: only the parts of the model that are actually touched get paged into RAM
- Multiple Quantizations: Q4 (290MB), Q8 (580MB), F16 (1.16GB)
- Low RAM Usage: 90-350MB resident, vs 1.5-2.5GB for the PyTorch checkpoint
- Fast Loading: under 1 second, vs 15-20 seconds for PyTorch
- React Native Ready: full mobile integration
## Model Variants
| Variant | Size | RAM (mmap) | Quality | Best For | 
|---|---|---|---|---|
| q4_k | 290MB | ~90MB | Good | Low-end devices | 
| q8 | 580MB | ~180MB | Very Good | Mid-range devices | 
| f16 | 1.16GB | ~350MB | Excellent | High-end devices | 
## Quick Start
### React Native

```typescript
import XTTS from '@genmedlabs/xtts-gguf';

// Initialize (downloads model automatically)
await XTTS.initialize();

// Generate speech
const audio = await XTTS.speak("Hello world!", {
  language: 'en'
});
```
### C++

```cpp
#include "xtts_inference.h"

auto model = std::make_unique<xtts::XTTSInference>();
model->load_model("xtts_v2_q4_k.gguf", true);
auto audio = model->generate("Hello world!", xtts::LANG_EN);
```
## Repository Structure
```
gguf/
├── xtts_v2_q4_k.gguf   # 4-bit quantized model
├── xtts_v2_q8.gguf     # 8-bit quantized model
├── xtts_v2_f16.gguf    # 16-bit half precision
└── manifest.json       # Model metadata
cpp/
├── xtts_inference.h    # C++ header
├── xtts_inference.cpp  # Implementation
└── CMakeLists.txt      # Build configuration
react-native/
├── XTTSModule.cpp      # Native module
└── XTTSModule.ts       # TypeScript interface
```
## Implementation Status

### Completed
- GGUF format export
- C++ engine structure
- React Native bridge
- Memory-mapped loading
### In Progress
- Full transformer implementation
- Hardware acceleration
- Voice cloning support
### TODO
- Production optimizations
- Comprehensive testing
- WebAssembly support
## License
Apache 2.0
## Credits

Based on XTTS v2 by Coqui AI. Uses the GGML library for efficient inference.
See full documentation in the repository for detailed usage and build instructions.