---
library_name: mlc-llm
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
tags:
  - mlc-llm
  - web-llm
  - llama-3.1
  - instruct
  - q4f16_1
---

# ReelevateLM-q4f16_1

This is the Meta Llama 3.1 8B Instruct model, fine‑tuned with LoRA and converted to MLC format with `q4f16_1` quantization.

The model can be used in MLC LLM and WebLLM projects.

## Example Usage

Before running any of the examples below, install MLC LLM by following the [installation documentation](https://llm.mlc.ai/docs/install/mlc_llm.html).

### Chat (CLI)

```shell
mlc_llm chat HF://pr0methium/ReelevateLM-q4f16_1
```

### REST Server

```shell
mlc_llm serve HF://pr0methium/ReelevateLM-q4f16_1
```
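Once the server is running, it exposes an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal client sketch using only the Python standard library (the address `http://127.0.0.1:8000` is the assumed default; adjust it to match your setup):

```python
import json
import urllib.request

# Assumed default address of the `mlc_llm serve` REST server; change if
# you started the server on a different host or port.
API_URL = "http://127.0.0.1:8000/v1/chat/completions"


def build_chat_request(prompt: str,
                       model: str = "HF://pr0methium/ReelevateLM-q4f16_1"):
    """Build an OpenAI-style chat completion request for the server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Send the request and print the assistant's reply.
    with urllib.request.urlopen(
            build_chat_request("Write me a 30 second reel story")) as resp:
        reply = json.load(resp)
    print(reply["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client (for example the `openai` Python package pointed at the same base URL) works equally well.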

### Python API

```python
from mlc_llm import MLCEngine

model = "HF://pr0methium/ReelevateLM-q4f16_1"
engine = MLCEngine(model)

# Stream the chat completion, printing tokens as they arrive.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Write me a 30 second reel story…"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()

engine.terminate()
```

## Documentation

For more information on the MLC LLM project, please visit the [documentation](https://llm.mlc.ai/docs/) and the [GitHub repo](https://github.com/mlc-ai/mlc-llm).