Update README for proper name versioning
README.md
CHANGED
@@ -17,8 +17,8 @@ pipeline_tag: text-to-audio
 
 [](https://ko-fi.com/lyte)
 [](https://huggingface.co/datasets/KandirResearch/Speech2Speech)
-[](https://huggingface.co/KandirResearch/CiSiMi)
-[](https://huggingface.co/spaces/KandirResearch/CiSiMi-At-Home)
+[](https://huggingface.co/KandirResearch/CiSiMi-v0.1)
+[](https://huggingface.co/spaces/KandirResearch/CiSiMi-At-Home)
 
 ## Overview
 
@@ -36,7 +36,7 @@ This project demonstrates the power of open-source tools to create accessible sp
 - **Pipeline**: Text-to-audio
 - **Parameters**: 500M
 - **Training Dataset Size**: ~15k samples
-- **Future Goals**: Scale to 200k-500k dataset with multi-turn conversation using a 1B parameter model
+- **Future Goals**: Scale to a 200k-500k sample dataset with multi-turn conversation, using both 500M and 1B parameter model variants, plus streaming for realtime use
 
 ### Training Methodology
 
@@ -77,20 +77,20 @@ from outetts.version.playback import ModelOutput
 
 # Download the model
 model_path = hf_hub_download(
-    repo_id="
+    repo_id="KandirResearch/CiSiMi-v0.1",
     filename="unsloth.Q8_0.gguf",
 )
 
 # Configure the model
 model_config = outetts.GGUFModelConfig_v2(
     model_path=model_path,
-    tokenizer_path="
+    tokenizer_path="KandirResearch/CiSiMi-v0.1",
 )
 
 # Initialize components
 interface = outetts.InterfaceGGUF(model_version="0.3", cfg=model_config)
 audio_codec = AudioCodec()
-prompt_processor = PromptProcessor("
+prompt_processor = PromptProcessor("KandirResearch/CiSiMi-v0.1")
 
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 gguf_model = interface.get_model()
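After this rename, the README's `hf_hub_download` call fetches `unsloth.Q8_0.gguf` from the `KandirResearch/CiSiMi-v0.1` repo. As a rough sketch of what that call resolves to (assuming the standard Hugging Face Hub `resolve` URL layout, which is not stated in this diff), the file lives at a predictable direct URL:

```python
# Sketch: build the direct Hub URL that hf_hub_download ultimately fetches,
# assuming the standard https://huggingface.co/{repo}/resolve/{rev}/{file} layout.
def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

url = hub_file_url("KandirResearch/CiSiMi-v0.1", "unsloth.Q8_0.gguf")
print(url)
# https://huggingface.co/KandirResearch/CiSiMi-v0.1/resolve/main/unsloth.Q8_0.gguf
```

This is also why the rename matters: downloads pinned to the old un-versioned repo path would point at a different URL than the versioned `CiSiMi-v0.1` repo.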