This model was introduced in the paper [**LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression** (Pan et al., 2024)](https://arxiv.org/abs/2403.12968). It is a [BERT multilingual base model (cased)](https://huggingface.co/google-bert/bert-base-multilingual-cased) finetuned to perform token classification for task-agnostic prompt compression. The probability $p_{preserve}$ predicted for each token $x_i$ is used as the metric for compression. The model is trained on an extractive text compression dataset constructed with the methodology proposed in LLMLingua-2, using training examples from [MeetingBank (Hu et al., 2023)](https://meetingbank.github.io/) as the seed data.
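
As a rough illustration of how $p_{preserve}$ can be obtained from the token-classification head, the sketch below queries the model directly with `transformers`. The repository id and the index of the "preserve" label are assumptions, not taken from this card, so verify them against the released checkpoint and its config.

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Assumed repository id for this checkpoint.
model_id = "microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

text = "Item 15, report from City Manager: recommendation to adopt three resolutions."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# p_preserve per token, assuming label index 1 corresponds to "preserve"
# (check model.config.id2label for the actual mapping).
probs = torch.softmax(logits, dim=-1)[0, :, 1]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, p in zip(tokens, probs.tolist()):
    print(f"{token}\t{p:.3f}")
```

Tokens with higher $p_{preserve}$ are kept first, so compressing a prompt amounts to keeping the top-scoring tokens until the target compression rate is reached.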
For more details, please check the homepage of [LLMLingua-2](https://llmlingua.com/llmlingua2.html) and the [LLMLingua Series](https://llmlingua.com/).
## Usage
```python
from llmlingua import PromptCompressor
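
# A minimal usage sketch: the model_name below is assumed to be this card's
# repository id, and the keyword arguments follow the llmlingua PromptCompressor
# API (use_llmlingua2, compress_prompt); check the LLMLingua repository if they differ.
llm_lingua = PromptCompressor(
    model_name="microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank",
    use_llmlingua2=True,  # use the LLMLingua-2 token-classification compressor
)

prompt = "Speaker 1: Item 15, report from City Manager. Recommendation to receive supporting documentation into the record and conclude the public hearing..."

# Compress to roughly one third of the original tokens, forcing newlines and
# question marks to be kept so instructions and questions survive compression.
results = llm_lingua.compress_prompt(prompt, rate=0.33, force_tokens=["\n", "?"])
print(results["compressed_prompt"])
```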