---
license: apache-2.0
base_model: Qwen/Qwen3-0.6B-Base
tags:
- peft
- lora
- ai-detection
- text-classification
- raid-dataset
- qwen
- unsloth
language:
- en
pipeline_tag: text-classification
library_name: peft
datasets:
- liamdugan/raid
metrics:
- accuracy
- precision
- recall
---

# Qwen3-0.6B AI Content Detector (LoRA)

## Model Description

This is a LoRA (Low-Rank Adaptation) fine-tuned version of Qwen3-0.6B-Base for AI-generated content detection. The model classifies text as either human-written (class 0) or AI-generated (class 1) and was trained on the RAID dataset.

## Model Details

- **Base Model**: Qwen/Qwen3-0.6B-Base
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Task**: Binary text classification (human vs. AI content detection)
- **Dataset**: RAID dataset (`train_none.csv`)
- **Training Framework**: Unsloth + Transformers
- **Model Type**: Parameter-efficient fine-tuning adapter

## Training Details

### Dataset

- **Source**: RAID dataset for AI content detection
- **Training Samples**: 24,000 (balanced: 12,000 human + 12,000 AI)
- **Validation Samples**: 2,000 (balanced: 1,000 human + 1,000 AI)
- **Class Balance**: 50% human (class 0) / 50% AI (class 1)

### Training Configuration

- **LoRA Rank**: 16
- **LoRA Alpha**: 16
- **Target Modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Learning Rate**: 1e-4
- **Batch Size**: 2 per device
- **Epochs**: 1
- **Optimizer**: AdamW 8-bit
- **Max Sequence Length**: 2048

### Hardware

- **GPU**: Tesla T4 (Google Colab)
- **Precision**: FP16
- **Memory Optimization**: Gradient checkpointing enabled

## Usage

### Loading the Model

```python
from unsloth import FastLanguageModel
import torch

# Load the merged model (base model with the LoRA weights already applied)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="subhashbs36/qwen3-0.6-ai-detector-merged",
    max_seq_length=4096,
    dtype=torch.float16,
    load_in_4bit=False,
)

# Alternatively, load the LoRA adapter on top of the base model:
# model.load_adapter("subhashbs36/qwen3-0.6-ai-detector-lora")

# Enable inference mode
FastLanguageModel.for_inference(model)
```

### Classifying Text

```python
import torch
import torch.nn.functional as F

# `number_token_ids` was undefined in the original snippet; here we assume it
# holds the tokenizer IDs of the answer strings "0" and "1", from which the
# class probabilities are read.
number_token_ids = [
    tokenizer("0", add_special_tokens=False).input_ids[-1],
    tokenizer("1", add_special_tokens=False).input_ids[-1],
]

def classify_text(text_sample):
    prompt = f"""Here is a text sample: {text_sample}

Classify this text into one of the following:
class 0: Human
class 1: AI

SOLUTION
The correct answer is: class """

    inputs = tokenizer(prompt, return_tensors="pt")
    device = next(model.parameters()).device
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)

    # Index of the last non-padding token, as a Python scalar
    last_token_idx = (inputs["attention_mask"].sum(1) - 1).item()
    last_logits = outputs.logits[0, last_token_idx, :]

    # Softmax over the full vocabulary, then keep only the two answer tokens
    probs_all = F.softmax(last_logits, dim=-1)
    probs = probs_all[number_token_ids]

    predicted_class = torch.argmax(probs).item()
    confidence = probs[predicted_class].item()
    return predicted_class, confidence
```
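To run a prediction, call `classify_text` on a string. This is a minimal usage sketch; it assumes the model and tokenizer have been loaded as shown above, and the sample text is only a placeholder.

```python
# Placeholder text; substitute the passage you want to check.
sample = "The quick brown fox jumps over the lazy dog."

predicted_class, confidence = classify_text(sample)
label = "Human" if predicted_class == 0 else "AI"
print(f"Predicted class {predicted_class} ({label}), confidence {confidence:.3f}")
```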
## Performance

- **Task**: Binary classification (human vs. AI content detection)
- **Classes**:
  - Class 0: Human-written content
  - Class 1: AI-generated content
- **Evaluation**: Tested on a balanced validation set drawn from the RAID dataset

## Limitations

- Trained specifically on the RAID dataset distribution
- Performance may vary on out-of-domain text
- Designed for English text classification
- Requires the specific prompt format shown above for optimal performance

## Technical Implementation

This model uses a custom approach with:

- **Reduced vocabulary**: Only the token IDs for classes 0 and 1 are scored
- **Custom data collator**: Trains only on the last token of each sequence (a sketch appears in the appendix below)
- **Token mapping**: Maps the original vocabulary onto the reduced classification head
- **Parameter-efficient training**: Uses LoRA for efficient fine-tuning

## Citation

If you use this model in your research, please cite:

```
@misc{qwen3-ai-detector-2025,
  title={Qwen3-0.6B AI Content Detector},
  author={subhashbs36},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/subhashbs36/qwen3-0.6-ai-detector-lora}
}
```

## License

This model is released under the Apache 2.0 license, following the base model's licensing terms.

## Acknowledgments

- Built using [Unsloth](https://github.com/unslothai/unsloth) for efficient training
- Based on Qwen3-0.6B-Base by Alibaba Cloud
- Trained on the RAID dataset for AI content detection research
- Uses LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning
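## Appendix: Last-Token Collator (Sketch)

The training collator itself is not included in this card. Below is a minimal sketch of one way to implement the "train only on the last token" behavior described under Technical Implementation, assuming each training sequence ends with the answer token ("0" or "1"). `last_token_collator` is a hypothetical name, not part of the released code; every label position other than the final token is set to -100 so the causal-LM loss ignores it.

```python
import torch

def last_token_collator(batch, pad_token_id):
    """Pad a batch and supervise only the final (answer) token of each sequence.

    `batch`: list of dicts, each with an `input_ids` list whose last element
    is the tokenized class answer ("0" or "1").
    """
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, attention_mask, labels = [], [], []
    for ex in batch:
        ids = ex["input_ids"]
        pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * pad)
        attention_mask.append([1] * len(ids) + [0] * pad)
        # -100 is ignored by Hugging Face's cross-entropy loss, so only the
        # last real token (the class answer) contributes to the loss.
        row = [-100] * max_len
        row[len(ids) - 1] = ids[-1]
        labels.append(row)
    return {
        "input_ids": torch.tensor(input_ids),
        "attention_mask": torch.tensor(attention_mask),
        "labels": torch.tensor(labels),
    }
```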