llama8b-rust-qlora-phase1 (checkpoint 9000 / 12000)
This card describes checkpoint 9000 of the Phase 1 Rust QLoRA run.
For the full training plan, governance details, and final recommended checkpoints, see the root model card in the repository.
Important: This repository distributes LoRA adapter weights only, not the full meta-llama/Meta-Llama-3.1-8B-Instruct model.
To use these adapters, you must separately obtain access to the base model from Meta under the Llama 3.1 Community License and comply with Meta's license and acceptable-use policy. The adapters alone are not useful without the base model.
Model Description
This is a LoRA adapter fine-tuned with QLoRA on top of meta-llama/Meta-Llama-3.1-8B-Instruct, trained specifically on Rust code. During training the base model is quantized to 4-bit, with LoRA (Low-Rank Adaptation) adapters providing the trainable parameters, which keeps both training and inference efficient.
The primary modality is Rust code with English comments and explanations.
This checkpoint is part of the SigilDERG ecosystem and is intended as a building block for Rust-focused evaluation and governance tooling, not as a general-purpose all-domain assistant.
Training Details
Training Configuration
- Base Model: meta-llama/Meta-Llama-3.1-8B-Instruct
- Checkpoint: Phase 1, step 9,000 / 12,000
- Effective Batch Size: 16 × 4 (64 sequences per optimizer step)
- Sequence Length: 4096
- Optimizer: paged_adamw_8bit
- LR Scheduler: cosine
- Learning Rate: ~1.52e-5 at this checkpoint (decaying on the cosine schedule)
- Warmup Steps: 250
- Weight Decay: 0.0
- Gradient Checkpointing: True
- BF16: True
- Quantization During Training: 4-bit QLoRA (NF4) with LoRA adapters
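For reference, here is a minimal sketch of how the hyperparameters above might map onto transformers TrainingArguments. Which of the 16 × 4 factors is the per-device batch size is an assumption, and the actual training script may structure this differently:

from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters listed above; assumes 16 is the
# per-device batch size and 4 the accumulation factor (not confirmed here).
args = TrainingArguments(
    output_dir="llama8b-rust-qlora-phase1",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    max_steps=12_000,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_steps=250,
    weight_decay=0.0,
    gradient_checkpointing=True,
    bf16=True,
    # learning_rate: the peak value is not stated in this card
)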
LoRA Configuration
- Rank (r): 16
- Alpha: 16
- Dropout: 0.05
- Target Modules: q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj
These adapters are intended to be loaded on top of the unmodified base weights.
Quantization
- Method: 4-bit NF4 (BitsAndBytes)
- Compute Dtype: bfloat16
- Double Quantization: True
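The LoRA and quantization settings above correspond roughly to the following peft / transformers configuration objects; this is a sketch for orientation, not the exact training code:

import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and bf16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter configuration matching this checkpoint
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "up_proj", "down_proj", "gate_proj"],
    task_type="CAUSAL_LM",
)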
Datasets
Phase 1 was trained on:
ammarnasr/the-stack-rust-clean
Dataset configuration for this phase:
- Min Length: 64
- Max Length: 200000
- Exclude Tests: True
- Exclude Examples: False
- Exclude Benches: True
- Prefer Idiomatic: False
- Prefer Documented: False
Phase 1 is a broad "inhale" pass over cleaned Rust from The Stack, deliberately unselective. Later phases are designed to be more selective and to incorporate explicit evaluation feedback. A sketch of what the filtering could look like is shown below.
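The following sketch applies the Phase 1 filters with the datasets library. The column names ("content", "path") and the character-based length measure are assumptions about the dataset schema, not confirmed details:

from datasets import load_dataset

ds = load_dataset("ammarnasr/the-stack-rust-clean", split="train")

def keep(example):
    # "content" and "path" are assumed column names; adjust to the real schema
    code = example["content"]
    path = example.get("path", "")
    if not (64 <= len(code) <= 200_000):                # Min / Max Length
        return False
    if "/tests/" in path or path.endswith("_test.rs"):  # Exclude Tests
        return False
    if "/benches/" in path:                             # Exclude Benches
        return False
    return True                                         # examples are kept in Phase 1

filtered = ds.filter(keep)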
Training Metrics (around checkpoint 9000)
Latest logged training metrics in the vicinity of this checkpoint:
- loss: 0.645700
- grad_norm: 0.173596
- learning_rate: 1.5249989438168771e-05
- entropy: 0.669227
- num_tokens: 1,594,699,366
- mean_token_accuracy: 0.842414
- epoch: 2.777838
- log_step: 9,000
- checkpoint_step: 9,000
- step: 9,000
Note: Logging occurs every few steps, so log_step reflects the nearest logged step to the checkpoint.
Evaluation Results
All evaluation here is based on automatic Rust-focused checks (compile, clippy, idiomatic heuristics, doc comments, prompt adherence) over a small but structured evaluation set.
Aggregate Metrics (checkpoint 9000, 50 samples)
- Compilation Rate: 54.00%
- Average Clippy Warnings: 0.00
- Idiomatic Score: 0.1625
- Documentation Rate: 0.00%
- Test Rate: 0.00%
- Prompt Match Score: 0.168
Functionality Coverage (approximate averages)
- Average Functions: 3.64
- Average Structs: 0.28
- Average Traits: 0.02
- Average Impls: 0.26
Evaluation Artifacts
- Full metrics (metrics.jsonl): per-sample compilation success, clippy warnings, idiomatic scores, doc detection, and structural stats
- Error logs (errors.jsonl): rustc diagnostics, clippy output, and runtime error messages
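To recompute the aggregates from the per-sample artifacts, something like the following works; the field names ("compiled", "clippy_warnings") are guesses at the JSONL schema, not a documented contract:

import json

with open("metrics.jsonl") as f:
    records = [json.loads(line) for line in f]

# Field names are assumptions about the metrics.jsonl schema
compile_rate = sum(bool(r["compiled"]) for r in records) / len(records)
avg_clippy = sum(r["clippy_warnings"] for r in records) / len(records)
print(f"compilation rate: {compile_rate:.2%}, avg clippy warnings: {avg_clippy:.2f}")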
Evaluation completed: 2025-11-20T05:03:45.292778
Governance and Intended Use
This checkpoint is part of the SigilDERG ecosystem and follows Rule Zero principles.
Primary Intended Use
- Rust code generation (functions, modules, small programs)
- Rust code explanation, refactoring, and review
- Tooling experiments for automated code evaluation, scoring, and self-improvement loops
Not Intended For
- Medical, legal, financial, or other high-stakes decision-making
- Safety-critical or life-critical systems without extensive human review
- Domains outside software engineering where the model hasn't been evaluated
Users remain responsible for:
- Reviewing and testing all generated code before use in production.
- Ensuring that their use of the combined base model + adapters complies with:
- Meta's Llama 3.1 Community License and acceptable-use policy.
- Any additional organizational or regulatory requirements.
This work is not affiliated with or endorsed by Meta.
Usage
Loading the Model (LoRA adapters on base)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model (requires access from Meta under the Llama 3.1 Community License)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load LoRA adapter (this checkpoint)
model = PeftModel.from_pretrained(
    base_model,
    "Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora",  # or your local path
    subfolder="checkpoint-9000",  # adjust if the checkpoint lives at the repo root
)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
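Since the adapters were trained against a 4-bit base, you can also load the base model quantized for lower-memory inference. A sketch, assuming the bitsandbytes package and a CUDA GPU:

from transformers import BitsAndBytesConfig

# Load the base model in 4-bit NF4 (matches the training-time quantization)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",
    quantization_config=bnb_config,
)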
Generation Example
# Format prompt for the instruct model
messages = [
    {"role": "system", "content": "You are a helpful Rust programming assistant."},
    {"role": "user", "content": "Write a function that calculates Fibonacci numbers."},
]

# Apply chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Generate (sampling must be enabled for temperature/top_p to take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Note: You must load the Meta base model first and then apply this LoRA checkpoint.
The base weights are not redistributed in this repository.
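If you want to deploy without a peft dependency at inference time, the adapter deltas can be merged into the base weights. Note that merging requires the base model to be loaded unquantized, and the merged weights are a derivative of the Llama base model, so they remain governed by Meta's license:

# Merge the LoRA deltas into the base weights and drop the PEFT wrapper
merged_model = model.merge_and_unload()
merged_model.save_pretrained("llama31-8b-rust-qlora-merged")
tokenizer.save_pretrained("llama31-8b-rust-qlora-merged")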
Limitations
This adapter is tuned specifically for Rust code; performance on other programming languages or general natural language tasks may be degraded relative to the base model.
The model inherits any limitations, biases, and failure modes from:
- The base meta-llama/Meta-Llama-3.1-8B-Instruct model.
- The training data used for Rust fine-tuning (ammarnasr/the-stack-rust-clean).
Evaluation so far is focused on:
- Compilation success.
- Static analysis (clippy).
- Simple idiomatic and documentation heuristics.
- A small prompt suite.
It should not be treated as a fully benchmarked or certified Rust expert.
Generated code should always be reviewed, tested, and security-audited (where relevant) before use.
Citation
If you use this model or its training pipeline, please cite:
@software{sigilderg_finetuner,
title = {SigilDERG Rust Code Fine-tuned Model},
author = {Dave Tofflemire (Superuser666-Sigil)},
year = {2025},
url = {https://github.com/Superuser666-Sigil/SigilDERG-Finetuner}
}
You should also follow any citation or attribution requirements specified in the Llama 3.1 Community License when referencing the base model.
License
This repository combines several components with different licenses:
Base Model (not included here)
- meta-llama/Meta-Llama-3.1-8B-Instruct, licensed under the Llama 3.1 Community License by Meta.
- See: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE
LoRA Adapter Weights (this checkpoint)
The adapter weights in this repository are my original contribution and are provided under the MIT License, only to the extent compatible with the Llama 3.1 Community License.
You may not use the combined base model + adapters in ways that violate Meta's license or acceptable-use policy, even though the adapter deltas themselves are MIT.
Training & Evaluation Code (SigilDERG-Finetuner, configs, scripts)
- All original code in the SigilDERG ecosystem is released under the MIT License, unless otherwise noted in the specific repository.
Practical summary:
To actually run this model, you must:
- Have legitimate access to meta-llama/Meta-Llama-3.1-8B-Instruct under Meta's terms.
- Load these LoRA adapters on top of that base model.
- Your use of the combined system (base + adapters) is governed primarily by Meta's Llama 3.1 Community License.
The MIT terms apply to the adapters and the SigilDERG code, but do not override or relax Meta's license.
This project is independent and not affiliated with or endorsed by Meta.