Missing `modeling_decilm.py` when loading the model
#1, opened by shawn2333
Dear developer, could you please share `modeling_decilm.py` as well? Thanks a lot!
I would suggest substituting the missing files with the ones from the previous model, nvidia/Llama-3_3-Nemotron-Super-49B-v1. It works for me!
Thanks for the reminder, we have uploaded the relevant files.
I had to remove the references to `NEED_SETUP_CACHE_CLASSES_MAPPING` (in the import and in the assignment) to make it work:
from transformers.generation.utils import NEED_SETUP_CACHE_CLASSES_MAPPING, GenerationMixin, GenerateOutput
...
NEED_SETUP_CACHE_CLASSES_MAPPING["variable"] = VariableCache
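If you would rather not delete the lines outright, a guarded lookup might work as well. This is only a minimal sketch, not the official fix. It assumes the failure is that the installed transformers version no longer exports `NEED_SETUP_CACHE_CLASSES_MAPPING`, and `optional_attr` and `register_cache_class` are hypothetical helper names made up for illustration:

```python
import importlib


def optional_attr(module_name, attr_name, default=None):
    """Return module.attr if both the module and the attribute exist, else default."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return default
    return getattr(module, attr_name, default)


def register_cache_class(mapping, key, cache_cls):
    """Register cache_cls under key only when the mapping is available.

    Returns True if the class was registered, False if the mapping was missing.
    """
    if mapping is None:
        return False
    mapping[key] = cache_cls
    return True


# Hypothetical usage inside the modeling file (assumes VariableCache is defined there):
# mapping = optional_attr("transformers.generation.utils",
#                         "NEED_SETUP_CACHE_CLASSES_MAPPING")
# register_cache_class(mapping, "variable", VariableCache)
```

With this, the registration is simply skipped when the mapping is missing, which should behave like deleting the two lines by hand. `GenerationMixin` and `GenerateOutput` would still be imported as before.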