GGUF for Ollama

#3
by SwimTreeWire - opened

I would like to use this with Ollama. How can I make the GGUF from this repo?

This is a new architecture, and support hasn't been merged into llama.cpp yet.

How can this be achieved? Can I somehow make a GGUF myself and upload it?

Nope. Someone first has to implement support for the model in the backend (llama.cpp) itself. You can subscribe to https://github.com/ggml-org/llama.cpp/issues/15748 to get updates.

Changes have been merged to llama.cpp and are hopefully coming to Ollama 🥳

I've been experimenting today and wrote up how I got Apertus running in Ollama on my Mac (still experimental): https://gist.github.com/pd95/7841bb5d15220773c4ca8666f024c7c9

Swiss AI Initiative org

This is now supported in https://github.com/ggml-org/llama.cpp

You can already find many GGUF quantizations on Hugging Face: https://huggingface.co/models?library=gguf&sort=trending&search=apertus

Those will work in Ollama soon as well, but Ollama first has to update to the most recent llama.cpp code.

This should now also work with Ollama.
https://github.com/ollama/ollama/issues/12149

Confirmed, thanks. See my blog post for more details and instructions: https://log.alets.ch/110/#using-ollama
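
For anyone landing here later, a rough sketch of the two usual ways to point Ollama at a GGUF, assuming your Ollama build already includes the llama.cpp changes mentioned above. The helper names, the model name "apertus-local", the file path, and the "some-user/some-repo" placeholders are all made up for illustration; pick a real repo from the Hugging Face search linked earlier in this thread. The script only prints the commands, it does not run them.

```python
from pathlib import Path
from typing import List, Optional


def hf_model_ref(user: str, repo: str, quant: Optional[str] = None) -> str:
    """Build the hf.co/<user>/<repo>[:<quant>] reference that `ollama run` accepts."""
    ref = f"hf.co/{user}/{repo}"
    # The optional tag selects one quantization from the repo, e.g. Q4_K_M.
    return f"{ref}:{quant}" if quant else ref


def write_modelfile(gguf_path: str, out_dir: str) -> Path:
    """Write a minimal Ollama Modelfile that points at a local GGUF file."""
    modelfile = Path(out_dir) / "Modelfile"
    # FROM is the Modelfile instruction that names the weights to load.
    modelfile.write_text(f"FROM {gguf_path}\n", encoding="utf-8")
    return modelfile


def ollama_commands(model_name: str, modelfile: Path) -> List[List[str]]:
    """Return the CLI calls that register the local model and then run it."""
    return [
        ["ollama", "create", model_name, "-f", str(modelfile)],
        ["ollama", "run", model_name],
    ]


if __name__ == "__main__":
    # Option 1: pull a GGUF straight from Hugging Face (placeholder repo name).
    print("ollama run " + hf_model_ref("some-user", "some-repo", "Q4_K_M"))

    # Option 2: use a GGUF you downloaded yourself (hypothetical local path).
    mf = write_modelfile("./apertus.gguf", ".")
    for cmd in ollama_commands("apertus-local", mf):
        print(" ".join(cmd))
```

Option 1 is the least effort if a quantization you want is already published; option 2 is useful when the GGUF only exists on your disk.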
