---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: transformers
license: other # Overall usage governed by Meta's Llama 3.1 Community License
adapter_license: mit
base_model_license: Llama 3.1 Community License
base_model_license_url: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE
tags:
- rust
- rust-programming
- code-generation
- qlora
- lora
- peft
- llama
- meta-llama-3.1
- instruction-tuned
- text-generation
- sigilderg
- lora-adapter
- base-required
datasets:
- ammarnasr/the-stack-rust-clean
language:
- en
pipeline_tag: text-generation
model-index:
- name: llama8b-rust-qlora-phase1-step-9000
  results:
  - task:
      type: text-generation
    dataset:
      name: rust-code-evaluation
      type: code-generation
    metrics:
    - name: Compilation Rate
      type: compilation_rate
      value: 0.54
    - name: Clippy Warnings (avg)
      type: clippy_warnings
      value: 0.0
    - name: Idiomatic Score
      type: idiomatic_score
      value: 0.1625
    - name: Documentation Rate
      type: doc_comment_rate
      value: 0.0
    - name: Avg Functions
      type: avg_functions
      value: 3.64
    - name: Avg Structs
      type: avg_structs
      value: 0.28
    - name: Avg Traits
      type: avg_traits
      value: 0.02
    - name: Test Rate
      type: test_rate
      value: 0.0
    - name: Prompt Match Score
      type: prompt_match
      value: 0.1675
    source:
      name: SigilDERG Evaluation
      url: https://github.com/Superuser666-Sigil/SigilDERG-Finetuner
---

# llama8b-rust-qlora-phase1 (checkpoint 9000 / 12000)

> This card describes **checkpoint 9000** of the Phase 1 Rust QLoRA run.
> For the full training plan, governance details, and final recommended checkpoints, see the **root model card** in the repository.

> **Important:** This repository distributes **LoRA adapter weights only**, **not** the full `meta-llama/Meta-Llama-3.1-8B-Instruct` model.
> To use these adapters, you must separately obtain access to the base model from Meta under the **Llama 3.1 Community License** and comply with Meta's license and acceptable-use policy. The adapters alone are not useful without the base model.

## Model Description

This is a **LoRA adapter for** `meta-llama/Meta-Llama-3.1-8B-Instruct`, fine-tuned with QLoRA specifically on Rust code. The model uses 4-bit quantization with LoRA (Low-Rank Adaptation) adapters for efficient training and inference. The primary modality is **Rust code with English comments and explanations**.

This checkpoint is part of the **SigilDERG** ecosystem and is intended as a building block for Rust-focused evaluation and governance tooling, not as a general-purpose all-domain assistant.

## Training Details

### Training Configuration

- **Base Model**: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- **Checkpoint**: Phase 1, step 9,000 / 12,000
- **Effective Batch Size**: 64 sequences per optimizer step (16 × 4 gradient accumulation)
- **Sequence Length**: 4096
- **Optimizer**: `paged_adamw_8bit`
- **LR Scheduler**: cosine
- **Learning Rate (near this checkpoint)**: ~1.52e-5
- **Warmup Steps**: 250
- **Weight Decay**: 0.0
- **Gradient Checkpointing**: True
- **BF16**: True
- **Quantization During Training**: 4-bit QLoRA (NF4) with LoRA adapters

### LoRA Configuration

- **Rank (r)**: 16
- **Alpha**: 16
- **Dropout**: 0.05
- **Target Modules**: `q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj`

These adapters are intended to be loaded on top of the unmodified base weights.
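For reference, the settings above correspond to a PEFT `LoraConfig` along these lines. This is a sketch only: `bias` and `task_type` are assumed defaults, and the authoritative training configuration lives in the SigilDERG-Finetuner repository.

```python
from peft import LoraConfig

# Sketch of a LoraConfig matching the hyperparameters listed above.
# bias and task_type are assumed; see the SigilDERG-Finetuner repo
# for the authoritative training configuration.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)
```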
### Quantization

- **Method**: 4-bit NF4 (BitsAndBytes)
- **Compute Dtype**: bfloat16
- **Double Quantization**: True

### Datasets

Phase 1 was trained on:

- `ammarnasr/the-stack-rust-clean`

**Dataset configuration for this phase:**

- **Min Length**: 64
- **Max Length**: 200000
- **Exclude Tests**: True
- **Exclude Examples**: False
- **Exclude Benches**: True
- **Prefer Idiomatic**: False
- **Prefer Documented**: False

Phase 1 is a broad-inhale pass over cleaned Rust from The Stack. Later phases are designed to be more selective and to incorporate explicit evaluation feedback.

## Training Metrics (around checkpoint 9000)

Latest logged training metrics in the vicinity of this checkpoint:

- **loss**: 0.645700
- **grad_norm**: 0.173596
- **learning_rate**: 1.5249989438168771e-05
- **entropy**: 0.669227
- **num_tokens**: 1,594,699,366
- **mean_token_accuracy**: 0.842414
- **epoch**: 2.777838
- **log_step**: 9,000
- **checkpoint_step**: 9,000
- **step**: 9,000

> Note: Logging occurs every few steps, so `log_step` reflects the nearest logged step to the checkpoint.

## Evaluation Results

All evaluation here is based on **automatic Rust-focused checks** (compile, `clippy`, idiomatic heuristics, doc comments, prompt adherence) over a small but structured evaluation set.

### Aggregate Metrics (checkpoint 9000, 50 samples)

- **Compilation Rate**: 54.00%
- **Average Clippy Warnings**: 0.00
- **Idiomatic Score**: 0.1625
- **Documentation Rate**: 0.00%
- **Test Rate**: 0.00%
- **Prompt Match Score**: 0.1675

### Functionality Coverage (approximate averages)

- **Average Functions**: 3.64
- **Average Structs**: 0.28
- **Average Traits**: 0.02
- **Average Impls**: 0.26

### Evaluation Artifacts

- **Full metrics (JSONL)** – per-sample evaluation:
  - `metrics.jsonl` – compilation success, clippy warnings, idiomatic scores, doc detection, and structural stats
- **Error logs (JSONL)** – compiler and runtime errors:
  - `errors.jsonl` – rustc diagnostics, clippy output, and runtime error messages

- [Metrics (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-9000/metrics.jsonl)
- [Error Logs (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-9000/errors.jsonl)

*Evaluation completed: 2025-11-20T05:03:45.292778*
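As an illustration, the per-sample records in `metrics.jsonl` can be aggregated with a few lines of Python. This is a minimal sketch: the field names (`compiled`, `clippy_warnings`) are assumptions based on the description above, so inspect the artifact for the exact schema.

```python
import json

# Minimal sketch: aggregate per-sample evaluation results from metrics.jsonl.
# Field names ("compiled", "clippy_warnings") are illustrative assumptions;
# check the file for the actual schema.
with open("metrics.jsonl") as f:
    records = [json.loads(line) for line in f if line.strip()]

n = len(records)
compile_rate = sum(1 for r in records if r.get("compiled")) / n
avg_clippy = sum(r.get("clippy_warnings", 0) for r in records) / n
print(f"samples={n} compile_rate={compile_rate:.2%} avg_clippy={avg_clippy:.2f}")
```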
## Governance and Intended Use

This checkpoint is part of the **SigilDERG** ecosystem and follows **Rule Zero** principles:

- **Primary Intended Use**
  - Rust code generation (functions, modules, small programs)
  - Rust code explanation, refactoring, and review
  - Tooling experiments for automated code evaluation, scoring, and self-improvement loops
- **Not Intended For**
  - Medical, legal, financial, or other high-stakes decision-making
  - Safety-critical or life-critical systems without extensive human review
  - Domains outside software engineering where the model hasn't been evaluated

Users remain responsible for:

- Reviewing and testing all generated code before use in production.
- Ensuring that their use of the **combined base model + adapters** complies with:
  - Meta's **Llama 3.1 Community License** and acceptable-use policy.
  - Any additional organizational or regulatory requirements.

This work is not affiliated with or endorsed by Meta.

## Usage

### Loading the Model (LoRA adapters on base)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model (requires access from Meta under the Llama 3.1 Community License)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load LoRA adapter (this checkpoint)
model = PeftModel.from_pretrained(
    base_model,
    "Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora",
    subfolder="checkpoint-9000",  # or pass a local path to the checkpoint directory
)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```
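The variant below instead loads the base model in 4-bit NF4, mirroring the training-time quantization described under Quantization above. A minimal sketch, assuming `bitsandbytes` is installed; bf16 loading as shown earlier works equally well.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization config matching the training-time setup described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter as before
model = PeftModel.from_pretrained(
    base_model,
    "Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora",
    subfolder="checkpoint-9000",
)
```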
### Generation Example

```python
# Format prompt for the instruct model
messages = [
    {"role": "system", "content": "You are a helpful Rust programming assistant."},
    {"role": "user", "content": "Write a function that calculates Fibonacci numbers."}
]

# Apply chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Generate (do_sample=True so temperature/top_p take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

> Note: You must load the Meta base model first and then apply this LoRA checkpoint.
> The base weights are not redistributed in this repository.

## Limitations

This adapter is tuned specifically for Rust code; performance on other programming languages or on general natural-language tasks may be degraded relative to the base model.

The model inherits any limitations, biases, and failure modes from:

- The base `meta-llama/Meta-Llama-3.1-8B-Instruct` model.
- The training data used for Rust fine-tuning (`ammarnasr/the-stack-rust-clean`).

Evaluation so far is focused on:

- Compilation success.
- Static analysis (clippy).
- Simple idiomatic and documentation heuristics.
- A small prompt suite.

It should not be treated as a fully benchmarked or certified Rust expert. Generated code should always be reviewed, tested, and security-audited (where relevant) before use.

## Citation

If you use this model or its training pipeline, please cite:

```bibtex
@software{sigilderg_finetuner,
  title  = {SigilDERG Rust Code Fine-tuned Model},
  author = {Dave Tofflemire (Superuser666-Sigil)},
  year   = {2025},
  url    = {https://github.com/Superuser666-Sigil/SigilDERG-Finetuner}
}
```

You should also follow any citation or attribution requirements specified in the Llama 3.1 Community License when referencing the base model.

## License

This repository combines several components with different licenses:

**Base Model (not included here)**

- `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Licensed under the Llama 3.1 Community License by Meta.
- See: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE

**LoRA Adapter Weights (this checkpoint)**

- The adapter weights in this repository are my original contribution and are provided under the MIT License, only to the extent compatible with the Llama 3.1 Community License.
- You may not use the combined base model + adapters in ways that violate Meta's license or acceptable-use policy, even though the adapter deltas themselves are MIT.

**Training & Evaluation Code (SigilDERG-Finetuner, configs, scripts)**

- All original code in the SigilDERG ecosystem is released under the MIT License, unless otherwise noted in the specific repository.

**Practical summary:** To actually run this model, you must:

1. Have legitimate access to `meta-llama/Meta-Llama-3.1-8B-Instruct` under Meta's terms.
2. Load these LoRA adapters on top of that base model.
3. Treat Meta's Llama 3.1 Community License as the primary license governing the combined system (base + adapters).

The MIT terms apply to the adapters and the SigilDERG code, but do not override or relax Meta's license. This project is independent and not affiliated with or endorsed by Meta.