---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: transformers
license: other # Overall usage governed by Meta's Llama 3.1 Community License
adapter_license: mit
base_model_license: Llama 3.1 Community License
base_model_license_url: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE
tags:
- rust
- rust-programming
- code-generation
- qlora
- lora
- peft
- llama
- meta-llama-3.1
- instruction-tuned
- text-generation
- sigilderg
- lora-adapter
- base-required
datasets:
- ammarnasr/the-stack-rust-clean
language:
- en
pipeline_tag: text-generation
model-index:
- name: llama8b-rust-qlora-phase1-step-9000
  results:
  - task:
      type: text-generation
    dataset:
      name: rust-code-evaluation
      type: code-generation
    metrics:
    - name: Compilation Rate
      type: compilation_rate
      value: 0.54
    - name: Clippy Warnings (avg)
      type: clippy_warnings
      value: 0.0
    - name: Idiomatic Score
      type: idiomatic_score
      value: 0.1625
    - name: Documentation Rate
      type: doc_comment_rate
      value: 0.0
    - name: Avg Functions
      type: avg_functions
      value: 3.64
    - name: Avg Structs
      type: avg_structs
      value: 0.28
    - name: Avg Traits
      type: avg_traits
      value: 0.02
    - name: Test Rate
      type: test_rate
      value: 0.0
    - name: Prompt Match Score
      type: prompt_match
      value: 0.1675
    source:
      name: SigilDERG Evaluation
      url: https://github.com/Superuser666-Sigil/SigilDERG-Finetuner
---

# llama8b-rust-qlora-phase1 (checkpoint 9000 / 12000)

> This card describes **checkpoint 9000** of the Phase 1 Rust QLoRA run.
> For the full training plan, governance details, and final recommended checkpoints, see the **root model card** in the repository.

> **Important:** This repository distributes **LoRA adapter weights only**, **not** the full `meta-llama/Meta-Llama-3.1-8B-Instruct` model.
> To use these adapters, you must separately obtain access to the base model from Meta under the **Llama 3.1 Community License** and comply with Meta's license and acceptable-use policy. The adapters alone are not useful without the base model.

## Model Description

This is a **LoRA adapter for** `meta-llama/Meta-Llama-3.1-8B-Instruct`, fine-tuned with QLoRA specifically on Rust code. The model uses 4-bit quantization with LoRA (Low-Rank Adaptation) adapters for efficient training and inference. The primary modality is **Rust code with English comments and explanations**.

This checkpoint is part of the **SigilDERG** ecosystem and is intended as a building block for Rust-focused evaluation and governance tooling, not as a general-purpose all-domain assistant.

## Training Details

### Training Configuration

- **Base Model**: `meta-llama/Meta-Llama-3.1-8B-Instruct`
- **Checkpoint**: Phase 1, step 9,000 / 12,000
- **Effective Batch Size**: 64 sequences per optimizer step (16 × 4 gradient accumulation)
- **Sequence Length**: 4096
- **Optimizer**: `paged_adamw_8bit`
- **LR Scheduler**: cosine
- **Learning Rate (near this checkpoint)**: ~1.52e-5
- **Warmup Steps**: 250
- **Weight Decay**: 0.0
- **Gradient Checkpointing**: True
- **BF16**: True
- **Quantization During Training**: 4-bit QLoRA (NF4) with LoRA adapters

### LoRA Configuration

- **Rank (r)**: 16
- **Alpha**: 16
- **Dropout**: 0.05
- **Target Modules**: `q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj`

These adapters are intended to be loaded on top of the unmodified base weights.
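For reference, the settings above correspond to a PEFT `LoraConfig` along these lines. This is a sketch only: `bias` and `task_type` are assumed defaults, and the authoritative training configuration lives in the SigilDERG-Finetuner repository.

```python
from peft import LoraConfig

# Sketch of a LoraConfig matching the hyperparameters listed above.
# bias and task_type are assumed; see the SigilDERG-Finetuner repo
# for the authoritative training configuration.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)
```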
### Quantization

- **Method**: 4-bit NF4 (BitsAndBytes)
- **Compute Dtype**: bfloat16
- **Double Quantization**: True

### Datasets

Phase 1 was trained on:

- `ammarnasr/the-stack-rust-clean`

**Dataset configuration for this phase:**

- **Min Length**: 64
- **Max Length**: 200000
- **Exclude Tests**: True
- **Exclude Examples**: False
- **Exclude Benches**: True
- **Prefer Idiomatic**: False
- **Prefer Documented**: False

Phase 1 is a broad-inhale pass over cleaned Rust from The Stack. Later phases are designed to be more selective and to incorporate explicit evaluation feedback.

## Training Metrics (around checkpoint 9000)

Latest logged training metrics in the vicinity of this checkpoint:

- **loss**: 0.645700
- **grad_norm**: 0.173596
- **learning_rate**: 1.5249989438168771e-05
- **entropy**: 0.669227
- **num_tokens**: 1,594,699,366
- **mean_token_accuracy**: 0.842414
- **epoch**: 2.777838
- **log_step**: 9,000
- **checkpoint_step**: 9,000
- **step**: 9,000

> Note: Logging occurs every few steps, so `log_step` reflects the nearest logged step to the checkpoint.

## Evaluation Results

All evaluation here is based on **automatic Rust-focused checks** (compile, `clippy`, idiomatic heuristics, doc comments, prompt adherence) over a small but structured evaluation set.

### Aggregate Metrics (checkpoint 9000, 50 samples)

- **Compilation Rate**: 54.00%
- **Average Clippy Warnings**: 0.00
- **Idiomatic Score**: 0.1625
- **Documentation Rate**: 0.00%
- **Test Rate**: 0.00%
- **Prompt Match Score**: 0.1675

### Functionality Coverage (approximate averages)

- **Average Functions**: 3.64
- **Average Structs**: 0.28
- **Average Traits**: 0.02
- **Average Impls**: 0.26

### Evaluation Artifacts

- **Full metrics (JSONL)** – per-sample evaluation:
  - `metrics.jsonl` – compilation success, clippy warnings, idiomatic scores, doc detection, and structural stats
- **Error logs (JSONL)** – compiler and runtime errors:
  - `errors.jsonl` – rustc diagnostics, clippy output, and runtime error messages

- [Metrics (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-9000/metrics.jsonl)
- [Error Logs (JSONL)](https://huggingface.co/Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora/blob/main/checkpoint-9000/errors.jsonl)

*Evaluation completed: 2025-11-20T05:03:45.292778*
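As an illustration, the per-sample records in `metrics.jsonl` can be aggregated with a few lines of Python. This is a minimal sketch: the field names (`compiled`, `clippy_warnings`) are assumptions based on the description above, so inspect the artifact for the exact schema.

```python
import json

# Minimal sketch: aggregate per-sample evaluation results from metrics.jsonl.
# Field names ("compiled", "clippy_warnings") are illustrative assumptions;
# check the file for the actual schema.
with open("metrics.jsonl") as f:
    records = [json.loads(line) for line in f if line.strip()]

n = len(records)
compile_rate = sum(1 for r in records if r.get("compiled")) / n
avg_clippy = sum(r.get("clippy_warnings", 0) for r in records) / n
print(f"samples={n} compile_rate={compile_rate:.2%} avg_clippy={avg_clippy:.2f}")
```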
## Governance and Intended Use

This checkpoint is part of the **SigilDERG** ecosystem and follows **Rule Zero** principles:

- **Primary Intended Use**
  - Rust code generation (functions, modules, small programs)
  - Rust code explanation, refactoring, and review
  - Tooling experiments for automated code evaluation, scoring, and self-improvement loops
- **Not Intended For**
  - Medical, legal, financial, or other high-stakes decision-making
  - Safety-critical or life-critical systems without extensive human review
  - Domains outside software engineering where the model hasn't been evaluated

Users remain responsible for:

- Reviewing and testing all generated code before use in production.
- Ensuring that their use of the **combined base model + adapters** complies with:
  - Meta's **Llama 3.1 Community License** and acceptable-use policy.
  - Any additional organizational or regulatory requirements.

This work is not affiliated with or endorsed by Meta.

## Usage

### Loading the Model (LoRA adapters on base)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model (requires access from Meta under the Llama 3.1 Community License)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Load LoRA adapter (this checkpoint)
model = PeftModel.from_pretrained(
    base_model,
    "Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora",
    subfolder="checkpoint-9000",  # or pass a local path to the checkpoint directory
)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
```
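The variant below instead loads the base model in 4-bit NF4, mirroring the training-time quantization described under Quantization above. A minimal sketch, assuming `bitsandbytes` is installed; bf16 loading as shown earlier works equally well.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization config matching the training-time setup described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter as before
model = PeftModel.from_pretrained(
    base_model,
    "Superuser666-Sigil/Llama-3.1-8B-Instruct-Rust-QLora",
    subfolder="checkpoint-9000",
)
```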
### Generation Example

```python
# Format prompt for the instruct model
messages = [
    {"role": "system", "content": "You are a helpful Rust programming assistant."},
    {"role": "user", "content": "Write a function that calculates Fibonacci numbers."}
]

# Apply chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Generate (do_sample=True so temperature/top_p take effect)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

> Note: You must load the Meta base model first and then apply this LoRA checkpoint.
> The base weights are not redistributed in this repository.

## Limitations

This adapter is tuned specifically for Rust code; performance on other programming languages or on general natural-language tasks may be degraded relative to the base model.

The model inherits any limitations, biases, and failure modes from:

- The base `meta-llama/Meta-Llama-3.1-8B-Instruct` model.
- The training data used for Rust fine-tuning (`ammarnasr/the-stack-rust-clean`).

Evaluation so far is focused on:

- Compilation success.
- Static analysis (clippy).
- Simple idiomatic and documentation heuristics.
- A small prompt suite.

It should not be treated as a fully benchmarked or certified Rust expert. Generated code should always be reviewed, tested, and security-audited (where relevant) before use.

## Citation

If you use this model or its training pipeline, please cite:

```bibtex
@software{sigilderg_finetuner,
  title  = {SigilDERG Rust Code Fine-tuned Model},
  author = {Dave Tofflemire (Superuser666-Sigil)},
  year   = {2025},
  url    = {https://github.com/Superuser666-Sigil/SigilDERG-Finetuner}
}
```

You should also follow any citation or attribution requirements specified in the Llama 3.1 Community License when referencing the base model.

## License

This repository combines several components with different licenses:

**Base Model (not included here)**

- `meta-llama/Meta-Llama-3.1-8B-Instruct`
- Licensed under the Llama 3.1 Community License by Meta.
- See: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE

**LoRA Adapter Weights (this checkpoint)**

- The adapter weights in this repository are my original contribution and are provided under the MIT License, only to the extent compatible with the Llama 3.1 Community License.
- You may not use the combined base model + adapters in ways that violate Meta's license or acceptable-use policy, even though the adapter deltas themselves are MIT.

**Training & Evaluation Code (SigilDERG-Finetuner, configs, scripts)**

- All original code in the SigilDERG ecosystem is released under the MIT License, unless otherwise noted in the specific repository.

**Practical summary:** To actually run this model, you must:

1. Have legitimate access to `meta-llama/Meta-Llama-3.1-8B-Instruct` under Meta's terms.
2. Load these LoRA adapters on top of that base model.
3. Treat Meta's Llama 3.1 Community License as the primary license governing the combined system (base + adapters).

The MIT terms apply to the adapters and the SigilDERG code, but do not override or relax Meta's license. This project is independent and not affiliated with or endorsed by Meta.