moogician/DSR1-Qwen-32B-scg

Text Generation

Generated from Trainer

text-generation-inference

11.5 MB

1 contributor

History: 8 commits

Upload tokenizer.json with huggingface_hub

39f67fa verified 8 months ago

.gitattributes

1.57 kB

Upload tokenizer.json with huggingface_hub 8 months ago
README.md

1.36 kB

Upload README.md with huggingface_hub 8 months ago
all_results.json

202 Bytes

Upload all_results.json with huggingface_hub 8 months ago
tokenizer.json

11.4 MB
xet

Upload tokenizer.json with huggingface_hub 8 months ago
train_results.json

202 Bytes

Upload train_results.json with huggingface_hub 8 months ago
trainer_state.json

12.9 kB

Upload trainer_state.json with huggingface_hub 8 months ago
training_args.bin
Detected Pickle imports (14)
- "transformers.trainer_pt_utils.AcceleratorConfig",
- "transformers.trainer_utils.SchedulerType",
- "transformers.trainer_utils.HubStrategy",
- "llamafactory.hparams.training_args.TrainingArguments",
- "torch.bfloat16",
- "accelerate.utils.dataclasses.DeepSpeedPlugin",
- "transformers.trainer_utils.IntervalStrategy",
- "transformers.integrations.deepspeed.HfTrainerDeepSpeedConfig",
- "transformers.training_args.OptimizerNames",
- "accelerate.utils.dataclasses.DistributedType",
- "transformers.integrations.deepspeed.HfDeepSpeedConfig",
- "transformers.trainer_utils.SaveStrategy",
- "torch.device",
- "accelerate.state.PartialState"
7.74 kB
xet

Upload training_args.bin with huggingface_hub 8 months ago
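
The import list reported for training_args.bin can in principle be reproduced locally without unpickling the file, by walking the pickle opcodes and collecting the globals they reference. The following is a minimal sketch of that idea, not the scanner the Hub itself runs. The helper names `read_pickle_bytes` and `list_pickle_imports` are hypothetical, and the sketch assumes the file is either a raw pickle or a zip archive (as written by recent versions of `torch.save`) with a member whose name ends in `data.pkl`.

```python
import pickletools
import zipfile


def read_pickle_bytes(path):
    """Return the raw pickle stream stored at `path` (hypothetical helper)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Assumption: the pickle stream is the member ending in "data.pkl".
            member = next(n for n in archive.namelist() if n.endswith("data.pkl"))
            return archive.read(member)
    with open(path, "rb") as handle:
        return handle.read()


def list_pickle_imports(data):
    """List "module.name" globals referenced by a pickle, without loading it."""
    found = set()
    memo = {}    # memo index -> string value (or None for non-strings)
    recent = []  # rough model of pushed values: a str, or None otherwise
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        top = recent[-1] if recent else None
        if name == "MEMOIZE":
            # Protocol 4+: store the top of the stack at the next memo index.
            memo[len(memo)] = top
        elif name in ("PUT", "BINPUT", "LONG_BINPUT"):
            memo[arg] = top
        elif name in ("GET", "BINGET", "LONG_BINGET"):
            recent.append(memo.get(arg))
        elif name == "GLOBAL":
            # Older protocols: arg is "module name" separated by one space.
            module, attr = arg.split(" ", 1)
            found.add(f"{module}.{attr}")
            recent.append(None)
        elif name == "STACK_GLOBAL":
            # Protocol 4+: module and name are the two values pushed just before.
            if len(recent) >= 2:
                module, attr = recent[-2], recent[-1]
                if isinstance(module, str) and isinstance(attr, str):
                    found.add(f"{module}.{attr}")
            recent.append(None)
        else:
            recent.append(arg if isinstance(arg, str) else None)
    return sorted(found)
```

Calling `list_pickle_imports(read_pickle_bytes("training_args.bin"))` on a local copy should yield names in the same `module.name` form as the list above. Because the sketch only approximates the pickle stack, it may miss globals in unusual streams.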
training_loss.png

44.5 kB

Upload training_loss.png with huggingface_hub 8 months ago