Update README.md
README.md (CHANGED)
```diff
@@ -45,18 +45,18 @@ This model has been fine-tuned using LoRA (Low-Rank Adaptation) technique on a c
 - **Learning Rate**: 2e-4
 - **Batch Size**: 6 per device
 - **Gradient Accumulation**: 1 step
-- **Warmup Steps**:
+- **Warmup Steps**: 5
 - **Weight Decay**: 0.01
 - **LR Scheduler**: linear
 - **Optimizer**: paged_adamw_8bit
 - **Precision**: bfloat16
 
 ### LoRA Configuration
-- **LoRA Rank**:
+- **LoRA Rank**: 32
 - **LoRA Alpha**: 32
-- **Target Modules**:
-- **Dropout**: 0.
-- **Max Sequence Length**:
+- **Target Modules**: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
+- **Dropout**: 0.05
+- **Max Sequence Length**: 4096
 
 ### Dataset
 - **Size**: 25,650 examples
```
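
For context, the hyperparameters filled in by this commit map onto the standard `peft` / `transformers` APIs roughly as sketched below. This is an illustrative reconstruction, not the repository's actual training script: the `output_dir`, `bias`, `task_type`, and the surrounding trainer wiring are assumptions, not values stated in the README.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings from the "LoRA Configuration" section.
lora_config = LoraConfig(
    r=32,                       # LoRA Rank
    lora_alpha=32,              # LoRA Alpha
    lora_dropout=0.05,          # Dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",                # assumed default, not stated in the README
    task_type="CAUSAL_LM",      # assumed, not stated in the README
)

# Optimizer and schedule settings from the training section.
training_args = TrainingArguments(
    output_dir="./outputs",     # placeholder, not from the README
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    gradient_accumulation_steps=1,
    warmup_steps=5,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    optim="paged_adamw_8bit",
    bf16=True,                  # Precision: bfloat16
)

# The Max Sequence Length (4096) would typically be applied when tokenizing
# or via the trainer, e.g. max_seq_length=4096 with TRL's SFTTrainer.
```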