Update README.md
README.md
@@ -11,16 +11,17 @@ base_model:
## Introduction
This model is a fine-tuned version of AlephBERT, designed to restore punctuation in Hebrew spoken language transcripts. It is specifically trained as a post-processing step for Automatic Speech Recognition (ASR) outputs, where punctuation is often missing in raw transcriptions.
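
Punctuation restoration of this kind is often framed as per-word labeling: each word in the raw transcript receives a label naming the mark, if any, that should follow it. The sketch below illustrates only that idea. The label names and the `apply_punctuation` helper are hypothetical and are not taken from this model, whose actual output format is not described here.

```python
# Hypothetical label set; the model's real labels are not given in this README.
LABEL_TO_MARK = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?"}


def apply_punctuation(words, labels):
    """Append the mark named by each label to the corresponding word."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return " ".join(word + LABEL_TO_MARK[label] for word, label in zip(words, labels))


raw_transcript = "שלום מה שלומך"  # unpunctuated ASR-style input
predicted_labels = ["COMMA", "O", "QUESTION"]  # stand-in for model predictions
print(apply_punctuation(raw_transcript.split(), predicted_labels))
```

A real pipeline would replace `predicted_labels` with the model's prediction for each word.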

-##
-For now this is the recommended way to use this model:
-
```
git lfs install
git clone https://huggingface.co/verbit/hebrew_punctuation
cd hebrew_punctuation
```
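
Before running the usage code, it may help to confirm that the Python packages it depends on can be imported. The following is a minimal sketch: `missing_packages` is a hypothetical helper, `transformers` is taken from the usage snippet's import, and `torch` is an assumption about the backend, since the dependency list is not shown here.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# "transformers" is imported by the usage snippet; "torch" is an assumption.
missing = missing_packages(["transformers", "torch"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```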
-
-

```
from transformers import BertTokenizer
@@ -32,9 +33,9 @@ model = BertForPunctuation.from_pretrained("verbit/hebrew_punctuation")
tokenizer = BertTokenizer.from_pretrained("verbit/hebrew_punctuation")
model.eval()

-text =
-
-
punct_text = get_prediction(
    model=model,
    text=text,
@@ -11,16 +11,17 @@ base_model:
## Introduction
This model is a fine-tuned version of AlephBERT, designed to restore punctuation in Hebrew spoken language transcripts. It is specifically trained as a post-processing step for Automatic Speech Recognition (ASR) outputs, where punctuation is often missing in raw transcriptions.

+## Install
```
git lfs install
git clone https://huggingface.co/verbit/hebrew_punctuation
cd hebrew_punctuation
+python -m venv .env
+source .env/bin/activate
+pip install -r requirements.txt
```
+## Usage
+For now this is the recommended way to use this model:

```
from transformers import BertTokenizer
@@ -32,9 +33,9 @@ model = BertForPunctuation.from_pretrained("verbit/hebrew_punctuation")
tokenizer = BertTokenizer.from_pretrained("verbit/hebrew_punctuation")
model.eval()

+text = """חברת ורביט פיתחה מערכת לתמלול המבוססת על בינה מלאכותית וגורם אנושי ושוקדת על תמלול עדויות ניצולי שואה
+את התוצאות אפשר לראות כבר ברשת בהן חלקים מעדותו של טוביה ביילסקי שהיה מפקד גדוד הפרטיזנים היהודים בביילורוסיה"""
+
punct_text = get_prediction(
    model=model,
    text=text,