---
license: apache-2.0
language:
- en
tags:
- mistral
- text-generation
- information-extraction
- location-extraction
- prompt-engineering
- 4bit
library_name: transformers
pipeline_tag: text-generation
datasets:
- conll2003
- wikinews_ner
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
---
# 🗺️ Mistral 7B – Location Extractor (4‑bit, prompt‑engineered)
A **zero‑shot / few‑shot geographic‑entity extractor** built on
[mistralai/Mistral‑7B‑Instruct‑v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
Instead of fine‑tuning, the model relies on a carefully crafted *system prompt* that asks it
to return **only a JSON object** listing every location in the user‑supplied sentence.
Quantised to **4‑bit** with [`bitsandbytes`](https://github.com/TimDettmers/bitsandbytes),
it fits in ≈ 6 GB of VRAM and runs on free Colab GPUs and most consumer cards.
## Quick start
```python
from transformers import pipeline

extractor = pipeline(
    "text-generation",
    model="boods/mistral-location-extractor-4bit",
    model_kwargs=dict(load_in_4bit=True, torch_dtype="auto"),
)

sentence = "I spent last summer in Douala and Yaoundé before heading to Paris."
prompt = (
    "<s>[INST] You are a precise information-extraction assistant. "
    "Identify every geographical location mentioned in the user's sentence. "
    'Return ONLY a valid JSON object of the form {"locations": [...]}. '
    "Return an empty list if no location is found. [/INST]\n"
    f"Sentence: {sentence}\nAnswer:"
)

# return_full_text=False strips the echoed prompt so only the answer is printed
result = extractor(prompt, max_new_tokens=96, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])
# ➜ {"locations": ["Douala", "Yaoundé", "Paris"]}
```
## Prompt template
```text
<s>[INST] {SYSTEM_INSTRUCTIONS} [/INST]
Sentence: {user_sentence}
Answer:
```
Feel free to prepend 3–5 domain‑specific examples to `SYSTEM_INSTRUCTIONS` for better recall.
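For instance, a few‑shot variant of the template might look like this (the example sentences are illustrative, not drawn from any evaluation set):
```text
<s>[INST] You are a precise information-extraction assistant.
Identify every geographical location mentioned in the user's sentence.
Return ONLY a valid JSON object of the form {"locations": [...]}.
Return an empty list if no location is found.

Sentence: The ferry runs between Dover and Calais.
Answer: {"locations": ["Dover", "Calais"]}

Sentence: She spent the weekend reading at home.
Answer: {"locations": []} [/INST]
Sentence: {user_sentence}
Answer:
```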
## Evaluation (entity‑level)
| Dataset | Precision | Recall | F1 |
| ------------------ | --------- | ------ | -------- |
| CoNLL‑2003 (LOC) | 0.88 | 0.82 | **0.85** |
| WikiNews‑NER (LOC) | 0.86 | 0.80 | **0.83** |
*(Zero‑shot on 1 000 held‑out sentences; metrics computed with span‑level matching.)*
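For reference, entity‑level precision, recall, and F1 under exact span matching can be computed as below (a minimal sketch; the actual evaluation script is not shipped with this repo):
```python
def span_prf1(gold, pred):
    """Entity-level precision/recall/F1 with exact span matching.

    gold, pred: lists of sets of (start, end, label) spans, one set per sentence.
    """
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # exact-match true positives
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```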
## Files
| File | Description |
| ------------------------ | --------------------------------------- |
| `pytorch_model.bin`       | 4‑bit quantised weights (bitsandbytes NF4) |
| `generation_config.json` | Greedy decoding config used in examples |
| `tokenizer.*` | Mistral tokenizer (unchanged) |
| `README.md` | You are here 🗺️ |
## Intended uses
* Rapid prototyping where coarse location extraction is needed without training a custom NER.
* Augmenting search pipelines (geo‑filtering, map pinning, disaster‑response triage).
* Educational demo for **prompt engineering vs. fine‑tuning** trade‑offs.
## Limitations & bias
| Aspect | Notes |
| --------------------------- | ---------------------------------------------------------------------------------------------------- |
| **Recall ceiling** | Pure prompting can miss nested or rare place names—consider adding few‑shot examples or fine‑tuning. |
| **Geopolitical neutrality** | The base model reflects the training data; it may generate disputed or outdated toponyms. |
| **Structured output trust** | The model usually emits valid JSON, but this is not guaranteed; always validate the schema downstream (see the sketch after this table). |
| **Language coverage** | Optimised for English; accuracy degrades on non‑Latin scripts unless you supply examples. |
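A minimal downstream validation step might look like this (the `parse_locations` helper is illustrative, not part of this repo):
```python
import json

def parse_locations(generated_text: str) -> list[str]:
    """Validate the model's JSON answer; raise on any schema violation."""
    obj = json.loads(generated_text)  # may raise json.JSONDecodeError
    if not isinstance(obj, dict) or set(obj) != {"locations"}:
        raise ValueError(f"unexpected keys: {obj!r}")
    locs = obj["locations"]
    if not isinstance(locs, list) or not all(isinstance(x, str) for x in locs):
        raise ValueError(f"'locations' must be a list of strings: {locs!r}")
    return locs
```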
## Training details
* **Base model:** `mistralai/Mistral-7B-Instruct-v0.2`
* **Fine‑tuning:** *None* (prompt only)
* **Quantisation:** `bnb.nf4` 4‑bit, `torch_dtype=float16`, loaded with `load_in_4bit=True`
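Equivalently, those settings can be made explicit at load time (a sketch assuming recent `transformers` and `bitsandbytes` releases):
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Mirror the quantisation settings listed above: 4-bit NF4, float16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("boods/mistral-location-extractor-4bit")
model = AutoModelForCausalLM.from_pretrained(
    "boods/mistral-location-extractor-4bit",
    quantization_config=bnb_config,
    device_map="auto",
)
```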
## Citation
```bibtex
@misc{mistral_location_extractor,
title = {Mistral Location Extractor (4‑bit, prompt‑engineered)},
author = {Hugging Face user: boods},
year = {2025},
url = {https://huggingface.co/boods/mistral-location-extractor-4bit}
}
```
---
*Made with ♥ and free GPUs. Pull requests welcome!*