# nanokimi-mini

This repository contains the nanoKimi model pre-trained on the Shakespeare dataset. An upgraded version of nanoKimi trained on OpenWebText will be published on HuggingFace in a few days.
## Model Details
- Architecture: 12 layers, 12 heads, 768 embedding dimension
- Training Data: Shakespeare dataset
- Features: Mixture of Experts (8 experts), Latent Attention
- Model Type: Kimi-K2
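The Mixture-of-Experts feature means each token is processed by only a few of the 8 experts, selected by a learned router. This card does not document the exact routing scheme, so the following is a minimal sketch of top-k expert routing in PyTorch; the layer sizes, the top-k value, and the expert MLP shape are illustrative assumptions, not the repository's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k Mixture-of-Experts sketch (illustrative, not this repo's code)."""
    def __init__(self, dim=768, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)  # router: one score per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                 # x: (tokens, dim)
        scores = self.gate(x)                             # (tokens, num_experts)
        topk_val, topk_idx = scores.topk(self.k, dim=-1)  # keep k experts per token
        weights = F.softmax(topk_val, dim=-1)             # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = topk_idx[:, slot] == e             # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

moe = TopKMoE()
y = moe(torch.randn(4, 768))
print(y.shape)  # one output vector per token, same width as the input
```

Only the selected experts run on each token, which is what lets MoE models grow total parameter count without a proportional increase in per-token compute.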
## Files

- `pytorch_model.bin` - Model weights
- `config.json` - Model configuration
- `src/` - Source code for the model architecture
- `modeling_kimik2.py` - HuggingFace wrapper
## Usage

```python
import torch
import json
from huggingface_hub import hf_hub_download

# Download files
config_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="config.json")
weights_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="pytorch_model.bin")

# Load config and weights
with open(config_path) as f:
    config = json.load(f)
weights = torch.load(weights_path, map_location="cpu")
print("Model downloaded successfully!")
```
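The loaded `weights` object is an ordinary PyTorch state dict (a mapping from parameter names to tensors), so it can be inspected before building the model. A small sketch, using a dummy state dict with made-up parameter names and shapes in place of the downloaded one:

```python
import torch

# Dummy state dict standing in for the downloaded `weights`
# (the names and shapes here are illustrative, not the real checkpoint's)
weights = {
    "tok_emb.weight": torch.zeros(50304, 768),
    "lm_head.weight": torch.zeros(50304, 768),
}

# Total parameter count and per-tensor shapes
num_params = sum(t.numel() for t in weights.values())
print(f"parameters: {num_params:,}")
for name, t in weights.items():
    print(name, tuple(t.shape))
```

This kind of check is a quick way to confirm the download is intact and that tensor shapes match the dimensions in `config.json` before loading them into a model.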
## License

MIT License

## Contact

Raise an issue in the Files and versions tab, or reach out to me directly, for any feedback or enquiries.