Model Card for Enhanced Language Model with LoRA

Model Description

This model is a LoRA fine-tune of beomi/ko-gemma-2b. It was trained on the lbox/lbox_open and ljp_criminal data, with each training example prepared by merging the facts field with ruling.text. The goal of this setup is to improve the model's ability to understand and generate legal and factual text sequences. Fine-tuning was performed on two A100 GPUs.
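The card does not include the preprocessing code, so the following is only a minimal sketch of how a facts field and ruling.text might be merged into a single training_text string. The function name, the blank-line separator, and the sample record are assumptions made for illustration; only the field names come from this card.

```python
def build_training_text(example, separator="\n\n"):
    """Merge the facts of a case with the text of its ruling.

    Assumes `example` looks like {"facts": str, "ruling": {"text": str}}.
    The separator is a hypothetical choice, not taken from the model card.
    """
    facts = example["facts"].strip()
    ruling_text = example["ruling"]["text"].strip()
    return {"training_text": facts + separator + ruling_text}


# Hypothetical record, used only to show the shape of the result
sample = {
    "facts": "Facts of the case.",
    "ruling": {"text": "Text of the ruling."},
}
merged = build_training_text(sample)
print(merged["training_text"])
```

If the data is loaded with the Hugging Face datasets library, a function of this shape could be applied with `dataset.map(build_training_text)`, which adds the returned training_text column to each row.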

LoRA Configuration

LoRA Alpha: 32
Rank (r): 16
LoRA Dropout: 0.05
Bias Configuration: None
Targeted Modules:
- Query Projection (q_proj)
- Key Projection (k_proj)
- Value Projection (v_proj)
- Output Projection (o_proj)
- Gate Projection (gate_proj)
- Up Projection (up_proj)
- Down Projection (down_proj)
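As a rough sketch, the settings above could be expressed as a peft `LoraConfig`. This assumes the peft library was used and that the dropout value is the probability 0.05; the `task_type` value and the helper name `make_lora_config` are assumptions that the card does not state.

```python
# Values copied from the LoRA Configuration list above
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,  # assumption: the listed dropout is a probability
    "bias": "none",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}


def make_lora_config():
    """Build a peft LoraConfig from the settings above.

    peft is imported lazily so the dictionary can be inspected
    without the library installed.
    """
    from peft import LoraConfig
    return LoraConfig(**LORA_KWARGS)


# Standard LoRA scaling: the low-rank update is multiplied by alpha / r
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
print(scaling)  # 2.0
```

With alpha 32 and rank 16, the low-rank update is scaled by a factor of 2 before being added to the frozen weights.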

Training Configuration

Training Epochs: 1
Batch Size per Device: 2
Optimizer: Paged AdamW, 32-bit (paged_adamw_32bit)
Learning Rate: 0.00005
Max Gradient Norm: 0.3
Learning Rate Scheduler: Constant
Warm-up Steps: 100
Gradient Accumulation Steps: 1
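The list above maps fairly directly onto Transformers `TrainingArguments`. The sketch below is one possible reading, not the author's script: it takes the optimizer to be `paged_adamw_32bit` (which requires bitsandbytes), uses a hypothetical `output_dir`, and assumes both A100 GPUs were used in data-parallel fashion when computing the effective batch size.

```python
# Values copied from the Training Configuration list above
TRAINING_KWARGS = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 1,
    "optim": "paged_adamw_32bit",  # assumption: reading of the optimizer line
    "learning_rate": 5e-5,
    "max_grad_norm": 0.3,
    "lr_scheduler_type": "constant",
    "warmup_steps": 100,
}


def make_training_arguments(output_dir="outputs"):
    """Build TrainingArguments; output_dir is a hypothetical path."""
    from transformers import TrainingArguments
    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def effective_batch_size(kwargs, num_devices):
    """Examples consumed per optimizer step across all devices."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


# Assumption: two GPUs in data-parallel training
print(effective_batch_size(TRAINING_KWARGS, num_devices=2))  # 4
```

Note that in Transformers the `constant` scheduler does not apply warm-up; `constant_with_warmup` is the variant that uses the warm-up steps. The card does not say which of the two was actually in effect.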

Model Training and Evaluation

The model was trained and evaluated with the SFTTrainer from the TRL library, using the following parameters:

Max Sequence Length: 4096
Dataset Text Field: training_text
Packing: Disabled
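To make these three settings concrete, here is a small sketch of what they imply for a single example. It is an illustration of the behavior, not TRL's internal code: with packing disabled, examples are not concatenated, so each tokenized training_text is cut to the sequence limit on its own. The key names follow older TRL releases; where they are passed (trainer constructor or config object) and their exact spelling depend on the TRL version, and the helper `truncate_example` is hypothetical.

```python
# Values copied from the parameter list above
SFT_SETTINGS = {
    "max_seq_length": 4096,
    "dataset_text_field": "training_text",
    "packing": False,
}


def truncate_example(token_ids, max_seq_length=SFT_SETTINGS["max_seq_length"]):
    """Keep at most max_seq_length tokens of a single example.

    With packing disabled, every example is handled independently,
    so shorter examples are left unchanged and longer ones are cut.
    """
    return token_ids[:max_seq_length]


# Hypothetical token ID lists standing in for tokenized training_text values
short_example = list(range(10))
long_example = list(range(5000))

print(len(truncate_example(short_example)))  # 10
print(len(truncate_example(long_example)))   # 4096
```

In practice this means that any case whose merged facts and ruling exceed 4096 tokens loses its tail during training.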

How to Get Started with the Model

Use the following code snippet to load the model with Hugging Face Transformers:

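from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID of this model on the Hugging Face Hub
model_id = "outspark/ko-gemma-2b-lora-lbox-ljp-modified"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example usage: tokenize a prompt, generate a continuation, and decode it
inputs = tokenizer("Example input text", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)  # cap the output length
print(tokenizer.decode(outputs[0], skip_special_tokens=True))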

Safetensors
Model size: 3B params
Tensor type: F32