GGUF for Ollama
I would like to use this with Ollama. How can I make a GGUF from this repo?
This is a new architecture, and support hasn't been merged into llama.cpp yet.
How can this be achieved? Can I somehow make a GGUF myself and upload it?
Nope, that means someone has to write support for the model in the backend itself. You can subscribe to https://github.com/ggml-org/llama.cpp/issues/15748 to get updates.
Changes have been merged into llama.cpp and are hopefully coming to Ollama soon 🥳
I've been experimenting today and wrote up how I run Apertus in Ollama on my Mac here: https://gist.github.com/pd95/7841bb5d15220773c4ca8666f024c7c9
This is supported now in https://github.com/ggml-org/llama.cpp.
You can already find many GGUF quantizations on Hugging Face: https://huggingface.co/models?library=gguf&sort=trending&search=apertus
Those will work in Ollama soon as well (but Ollama first has to update to the most recent llama.cpp code).
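For anyone who wants to build a GGUF themselves rather than download one of those quantizations, the usual llama.cpp workflow is roughly the following. This is a sketch, not from this thread: the model path, output filenames, and the Q4_K_M quantization type are placeholders you'd adapt to your setup.

```shell
# Clone and build llama.cpp (make sure you are on a recent commit,
# since support for this architecture only landed recently).
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Convert the Hugging Face checkpoint to a full-precision GGUF.
# <path-to-hf-model> is a placeholder for your local copy of the model repo.
python convert_hf_to_gguf.py <path-to-hf-model> --outfile apertus-f16.gguf

# Optionally quantize it to a smaller file (Q4_K_M shown as an example type).
./build/bin/llama-quantize apertus-f16.gguf apertus-q4_k_m.gguf Q4_K_M
```

The resulting .gguf file can then be uploaded to Hugging Face or used locally.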
Awesome!
Confirmed, thanks. See my blog post for more details & instructions https://log.alets.ch/110/#using-ollama
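In case it helps others: once you have a local GGUF, pointing Ollama at it takes only a minimal Modelfile. A sketch (the GGUF filename and model name "apertus" are placeholders):

```shell
# Write a minimal Modelfile referencing the local GGUF
# (filename is a placeholder for your quantized file).
cat > Modelfile <<'EOF'
FROM ./apertus-q4_k_m.gguf
EOF

# Register the model with Ollama, then run it interactively.
ollama create apertus -f Modelfile
ollama run apertus "Hello!"
```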