EmbeddingGemma 300M
Recommended way to run this model:
llama-server -hf ggml-org/embeddinggemma-300M-GGUF --embeddings
Then the endpoint can be accessed at http://localhost:8080/embedding, for
example using curl:
curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello embeddings"}' \
    --silent
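The same request can be made from a script. The sketch below (plain Python, standard library only) wraps the POST shown above in a hypothetical `get_embedding` helper; it assumes the server was started with `--embeddings` and is listening on `localhost:8080` as in the command above.

```python
import json
import urllib.request

def get_embedding(text, url="http://localhost:8080/embedding"):
    # Build the same JSON body as the curl example above.
    payload = json.dumps({"input": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    # Parse and return the server's JSON response.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Requires a running llama-server, e.g.:
# response = get_embedding("Hello embeddings")
```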
Alternatively, the llama-embedding command line tool can be used:
llama-embedding -hf ggml-org/embeddinggemma-300M-GGUF --verbose-prompt -p "Hello embeddings"
When a model uses pooling, or the pooling method is specified using --pooling,
the normalization of the embeddings can be controlled by the embd_normalize parameter.
The default value is 2, which means the embeddings are normalized using
the Euclidean norm (L2). The other options are:
-1: no normalization
0: scale by the maximum absolute value (int16-range output)
1: taxicab norm (L1)
2: Euclidean norm (L2, the default)
>2: p-norm
This can be passed in the request body to llama-server, for example:
    --data '{"input": "Hello embeddings", "embd_normalize": -1}' \
And for llama-embedding, by passing --embd-normalize <value>, for example:
llama-embedding -hf ggml-org/embeddinggemma-300M-GGUF --embd-normalize -1 -p "Hello embeddings"
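One practical consequence of the default embd_normalize value of 2: every returned vector has unit Euclidean length, so the cosine similarity between two embeddings reduces to a plain dot product. A minimal sketch of that property (plain Python, no llama.cpp dependency):

```python
import math

def l2_normalize(vec):
    # Euclidean (L2) norm: square root of the sum of squared components.
    norm = math.sqrt(sum(x * x for x in vec))
    # Guard against the zero vector.
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# With L2-normalized vectors, the dot product IS the cosine similarity.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])
print(dot(a, a))  # close to 1.0 (identical direction)
print(dot(a, b))  # close to 0.96
```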
Base model: google/embeddinggemma-300m