Model Card for Enhanced Language Model with LoRA
Model Description
This model is a LoRA fine-tune of beomi/ko-gemma-2b. It was trained on the lbox/lbox_open and ljp_criminal datasets, with training examples prepared by merging the facts fields with ruling.text. This approach aims to improve the model's ability to understand and generate legal and factual text. Fine-tuning was performed on two A100 GPUs.
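The data preparation described above can be sketched as follows. This is a minimal illustration, not the author's actual preprocessing script: the helper name `build_training_text`, the newline separator, and the sample record are all hypothetical. It assumes each record exposes `facts` as a string (or a list of strings) and the ruling text under `ruling["text"]`, and it stores the result in the `training_text` field named in the trainer settings further down.

```python
def build_training_text(record, separator="\n"):
    """Merge a record's facts with its ruling text into one training string.

    The field layout (record["facts"], record["ruling"]["text"]) and the
    separator are assumptions made for illustration.
    """
    facts = record["facts"]
    # Accept either a single string or a list of fact strings.
    if isinstance(facts, (list, tuple)):
        facts = separator.join(facts)
    ruling_text = record["ruling"]["text"]
    merged = dict(record)  # copy so the input record is left unchanged
    merged["training_text"] = f"{facts}{separator}{ruling_text}"
    return merged


# Hypothetical record, not taken from the dataset.
sample = {"facts": "Fact A.", "ruling": {"text": "Ruling B."}}
print(build_training_text(sample)["training_text"])
```

With the Hugging Face `datasets` library, a function of this shape could be applied to every record through `Dataset.map`.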
LoRA Configuration
- LoRA Alpha: 32
- Rank (r): 16
- LoRA Dropout: 0.05
- Bias Configuration: None
- Targeted Modules:
  - Query Projection (q_proj)
  - Key Projection (k_proj)
  - Value Projection (v_proj)
  - Output Projection (o_proj)
  - Gate Projection (gate_proj)
  - Up Projection (up_proj)
  - Down Projection (down_proj)
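As a rough sketch, the settings above map onto the `LoraConfig` class of the PEFT library as shown here. Whether the author used PEFT in exactly this way is an assumption, as are `task_type="CAUSAL_LM"` and the reading of the listed dropout as the fraction 0.05. The block also computes the standard LoRA scaling factor, `lora_alpha / r`, for these values.

```python
# Values taken from the list above; key names follow peft.LoraConfig.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,  # assumption: dropout is the fraction 0.05
    "bias": "none",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

# Standard LoRA scales the low-rank update by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)  # 2.0

try:
    from peft import LoraConfig
    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    lora_config = None  # PEFT not installed; the dict still documents the setup
```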
Training Configuration
- Training Epochs: 1
- Batch Size per Device: 2
- Optimizer: Paged AdamW (32-bit)
- Learning Rate: 0.00005
- Max Gradient Norm: 0.3
- Learning Rate Scheduler: Constant
- Warm-up Steps: 100
- Gradient Accumulation Steps: 1
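For orientation, the list above can be written as keyword arguments in the style of `transformers.TrainingArguments`. This is a sketch under assumptions: the optimizer string `paged_adamw_32bit` is inferred from the optimizer description, and the effective batch size assumes plain data parallelism across the two A100 GPUs mentioned in the model description.

```python
# Key names follow transformers.TrainingArguments; values come from the list above.
training_kwargs = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 2,
    "optim": "paged_adamw_32bit",  # assumption: inferred from the description
    "learning_rate": 5e-5,
    "max_grad_norm": 0.3,
    "lr_scheduler_type": "constant",
    "warmup_steps": 100,
    "gradient_accumulation_steps": 1,
}

NUM_GPUS = 2  # from the model description; data parallelism is assumed

# Sequences contributing to each optimizer step across all devices.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_GPUS
)
print(effective_batch_size)  # 4
```

In Transformers, the plain `constant` schedule does not apply warm-up, while `constant_with_warmup` does, so the warm-up setting may have had no effect if the plain variant was the one used.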
Model Training and Evaluation
The model was trained and evaluated using TRL's SFTTrainer with the following parameters:
- Max Sequence Length: 4096
- Dataset Text Field: training_text
- Packing: Disabled
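To illustrate what disabling packing implies, the toy sketch below contrasts the two modes on lists of token IDs. It is a simplified model of the idea, not the SFTTrainer implementation: the function names are hypothetical, and real packing typically also inserts separator tokens between examples. With packing disabled, each example stays its own sequence and is cut at the maximum sequence length.

```python
MAX_SEQ_LENGTH = 4096  # from the parameter list above


def truncate_unpacked(tokenized_examples, max_len=MAX_SEQ_LENGTH):
    """Packing disabled: one sequence per example, each cut to max_len tokens."""
    return [ids[:max_len] for ids in tokenized_examples]


def pack(tokenized_examples, max_len=MAX_SEQ_LENGTH):
    """For contrast: concatenate all tokens, then split into max_len chunks."""
    flat = [tok for ids in tokenized_examples for tok in ids]
    return [flat[i:i + max_len] for i in range(0, len(flat), max_len)]


# Three made-up examples with 10, 5000, and 3 tokens.
examples = [list(range(10)), list(range(5000)), list(range(3))]
print([len(s) for s in truncate_unpacked(examples)])  # [10, 4096, 3]
print([len(s) for s in pack(examples)])               # [4096, 917]
```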
How to Get Started with the Model
Use the following code snippet to load the model with Hugging Face Transformers:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "outspark/ko-gemma-2b-lora-lbox-ljp-modified"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Example usage: tokenize a prompt, generate a continuation, and decode it
inputs = tokenizer("Example input text", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))