nanokimi-mini

This repository contains the nanoKimi model pre-trained on the Shakespeare dataset. An upgraded version of nanoKimi trained on OpenWebText will be available on Hugging Face in a few days.

Model Details

  • Architecture: 12 layers, 12 heads, 768 embedding dimension
  • Training Data: Shakespeare dataset
  • Features: Mixture of Experts (8 experts), Latent Attention
  • Model Type: Kimi-K2
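The dimensions above give a rough sense of model size. As a back-of-envelope sketch (assuming a GPT-style block with an FFN hidden size of 4×768 per expert, and ignoring embeddings, biases, and layer norms — none of these shapes are stated in this card, so treat the result as an estimate only):

```python
# Rough parameter estimate from the architecture numbers above.
# Assumptions: FFN hidden size = 4*d per expert; embeddings/biases/norms ignored.
n_layers, d, n_experts = 12, 768, 8
d_ff = 4 * d

attn = 4 * d * d                  # q, k, v, and output projections
ffn_per_expert = 2 * d * d_ff     # up- and down-projection matrices
per_layer = attn + n_experts * ffn_per_expert

total = n_layers * per_layer
print(f"~{total / 1e6:.0f}M parameters in the transformer blocks")  # → ~481M
```

Under these assumptions the 8-expert MoE layers dominate the count; the true total is determined by the shapes in config.json.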

Files

  • pytorch_model.bin - Model weights
  • config.json - Model configuration
  • src/ - Source code for model architecture
  • modeling_kimik2.py - Hugging Face wrapper

Usage

import torch
import json
from huggingface_hub import hf_hub_download

# Download files
config_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="config.json")
weights_path = hf_hub_download(repo_id="sohv/nanokimi-mini", filename="pytorch_model.bin")

# Load config and weights
with open(config_path) as f:
    config = json.load(f)

# The weights file holds a plain state dict of tensors, not a full model object
weights = torch.load(weights_path, map_location="cpu")
print("Model downloaded successfully!")
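The snippet above only loads the raw tensors, so a quick sanity check is to count the parameters in the state dict. The sketch below uses a tiny stand-in dict so it runs on its own; with the real file, pass the object returned by torch.load(weights_path) instead (the key names shown are illustrative, not the model's actual layer names):

```python
import torch

def count_parameters(state_dict):
    """Sum the number of scalars across every tensor in a state dict."""
    return sum(t.numel() for t in state_dict.values())

# Tiny stand-in for the real state dict returned by torch.load(weights_path)
demo = {
    "tok_emb.weight": torch.zeros(100, 768),
    "h.0.attn.q_proj.weight": torch.zeros(768, 768),
}
print(count_parameters(demo))  # 100*768 + 768*768 = 666624
```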

License

MIT License

Contact

Raise an issue in the Files and versions tab, or reach out to me here with any feedback or enquiries.
