---
license: cc-by-nc-sa-4.0
---
# LLMLingua-2-Bert-base-Multilingual-Cased-MeetingBank

This model was introduced in the paper [**LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression** (Pan et al., 2024)](https://arxiv.org/abs/2403.12968). It is a [BERT multilingual base model (cased)](https://huggingface.co/google-bert/bert-base-multilingual-cased) fine-tuned to perform token classification for task-agnostic prompt compression: the predicted probability $p_{\text{preserve}}$ that each token $x_i$ should be preserved is used as the metric for compression. The model was trained on an extractive text compression dataset constructed with the methodology proposed in the LLMLingua-2 paper, using training examples from [MeetingBank (Hu et al., 2023)](https://meetingbank.github.io/) as the seed data.
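
As a rough illustration of that metric (not part of the official usage below), the per-token preserve probabilities can also be inspected directly with 🤗 Transformers, since the checkpoint is a standard token-classification model. This is a minimal sketch; the assumption that class index 1 corresponds to "preserve" is ours, not stated in the card:

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_name = "qianhuiwu/llmlingua-2-bert-base-multilingual-cased-meetingbank"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

text = "So, um, I've been thinking about the project, you know."
batch = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits  # shape: (1, seq_len, num_labels)

# Per-token p_preserve; we assume the "preserve" class sits at index 1.
p_preserve = torch.softmax(logits, dim=-1)[0, :, 1]
tokens = tokenizer.convert_ids_to_tokens(batch["input_ids"][0].tolist())
for token, p in zip(tokens, p_preserve):
    print(f"{token}\t{p.item():.3f}")
```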

## Usage

```python
from llmlingua import PromptCompressor

compressor = PromptCompressor(
    model_name="qianhuiwu/llmlingua-2-bert-base-multilingual-cased-meetingbank",
    use_llmlingua2=True
)

original_prompt = """John: So, um, I've been thinking about the project, you know, and I believe we need to, uh, make some changes. I mean, we want the project to succeed, right? So, like, I think we should consider maybe revising the timeline.
Sarah: I totally agree, John. I mean, we have to be realistic, you know. The timeline is, like, too tight. You know what I mean? We should definitely extend it.
"""
results = compressor.compress_prompt_llmlingua2(
    original_prompt,
    rate=0.6,
    force_tokens=['\n', '.', '!', '?', ','],
    chunk_end_tokens=['.', '\n'],
    return_word_label=True,
    drop_consecutive=True
)

print(results.keys())
print(f"Compressed prompt: {results['compressed_prompt']}")
print(f"Original tokens: {results['origin_tokens']}")
print(f"Compressed tokens: {results['compressed_tokens']}")
print(f"Compression rate: {results['rate']}")

# get the annotated results over the original prompt
word_sep = "\t\t|\t\t"
label_sep = " "
lines = results["fn_labeled_original_prompt"].split(word_sep)
annotated_results = []  # list of tuples: (word, label)
for line in lines:
    word, label = line.split(label_sep)
    annotated_results.append((word, '+') if label == '1' else (word, '-'))
print("Annotated results:")
for word, label in annotated_results[:10]:
    print(f"{word} {label}")
```
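
A brief note on the parameters, following the [LLMLingua repository](https://github.com/microsoft/LLMLingua) (`pip install llmlingua`): `rate=0.6` asks the compressor to keep roughly 60% of the original tokens, `force_tokens` lists tokens that are always preserved, `chunk_end_tokens` mark where a long prompt may be split into chunks, `drop_consecutive` collapses runs of repeated forced tokens, and `return_word_label=True` adds the per-word preserve/drop labels parsed above.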

## Citation

```bibtex
@article{pan2024llmlingua2,
  title   = {LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression},
  author  = {Pan, Zhuoshi and Wu, Qianhui and Jiang, Huiqiang and Xia, Menglin and Luo, Xufang and Zhang, Jue and Lin, Qingwei and R{\"u}hle, Victor and Yang, Yuqing and Lin, Chin-Yew and Zhao, H. Vicky and Qiu, Lili and Zhang, Dongmei},
  journal = {arXiv preprint arXiv:2403.12968},
  year    = {2024}
}
```