runtime error

Exit code: 1.

The model files (modeling_dots_vision.py, preprocessor_config.json, special_tokens_map.json, tokenizer.json, tokenizer_config.json, vocab.json) downloaded successfully, then loading failed with:

Traceback (most recent call last):
  File "/home/user/app/app.py", line 331, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 279, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4336, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2109, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2261, in _check_and_enable_flash_attn_2
    raise ValueError(
ValueError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available on CPU. Please make sure torch can access a CUDA device.
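The error indicates that the model is being loaded with Flash Attention 2 requested, but the Space is running on CPU-only hardware, where Flash Attention 2 is unavailable. A minimal sketch of a fallback, assuming the load happens in app.py around line 331 and using a placeholder repo id (the actual model id is not shown in the log):

import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "your-org/your-model"  # placeholder; substitute the model used by app.py

# Flash Attention 2 requires a CUDA device; fall back to a CPU-compatible
# attention implementation (here "sdpa") when no GPU is available.
attn_impl = "flash_attention_2" if torch.cuda.is_available() else "sdpa"

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    attn_implementation=attn_impl,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    trust_remote_code=True,  # the model ships custom code (modeling_dots_vision.py)
)

Alternatively, assign the Space GPU hardware so that torch can actually see a CUDA device and the existing flash_attention_2 setting can be used as-is.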
