verbit-research committed on
Commit
bdc1c8f
verified
1 Parent(s): 5af7e8d

Update README.md

  1. README.md +9 -8
README.md CHANGED
@@ -11,16 +11,17 @@ base_model:
11
  ## Introduction
12
  This model is a fine-tuned version of AlephBERT, designed to restore punctuation in Hebrew spoken language transcripts. It is specifically trained as a post-processing step for Automatic Speech Recognition (ASR) outputs, where punctuation is often missing in raw transcriptions.
13
 
14
- ## Usage
15
- For now this is the recommended way to use this model:
16
-
17
  ```
18
  git lfs install
19
  git clone https://huggingface.co/verbit/hebrew_punctuation
20
  cd hebrew_punctuation
 
 
 
21
  ```
22
-
23
- Once you are in the folder you could do the following:
24
 
25
  ```
26
  from transformers import BertTokenizer
@@ -32,9 +33,9 @@ model = BertForPunctuation.from_pretrained("verbit/hebrew_punctuation")
32
  tokenizer = BertTokenizer.from_pretrained("verbit/hebrew_punctuation")
33
  model.eval()
34
 
35
- text = ("חברת ורביט פיתחה מערכת לתמלול המבוססת על בינה מלאכותית וגורם אנושי ושוקדת על תמלול עדויות ניצולי שואה את "
36
- "התוצאות אפשר לראות כבר ברשת בהן חלקים מעדותו של טוביה ביילסקי שהיה מפקד גדוד הפרטיזנים היהודים "
37
- "בביילורוסיה")
38
  punct_text = get_prediction(
39
  model=model,
40
  text=text,
 
11
  ## Introduction
12
  This model is a fine-tuned version of AlephBERT, designed to restore punctuation in Hebrew spoken language transcripts. It is specifically trained as a post-processing step for Automatic Speech Recognition (ASR) outputs, where punctuation is often missing in raw transcriptions.
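
As a rough illustration of what punctuation restoration produces, the sketch below attaches one predicted mark to each word of an unpunctuated transcript. It is not the model's actual code: the label names, the `PUNCT_BY_LABEL` mapping, and the `apply_punctuation` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical label set; the labels this model really predicts are not
# listed in this README.
PUNCT_BY_LABEL = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?"}


def apply_punctuation(words, labels):
    """Append the mark named by each label to the word it follows."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return " ".join(
        word + PUNCT_BY_LABEL[label] for word, label in zip(words, labels)
    )


# Raw ASR-style output has no punctuation; the labels stand in for
# per-word predictions from a classifier.
words = "hello how are you".split()
labels = ["COMMA", "O", "O", "QUESTION"]
print(apply_punctuation(words, labels))  # hello, how are you?
```

In practice the labels would come from the fine-tuned model's per-token predictions rather than being written by hand.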
13
 
14
+ ## Install
 
 
15
  ```
16
  git lfs install
17
  git clone https://huggingface.co/verbit/hebrew_punctuation
18
  cd hebrew_punctuation
19
+ python -m venv .env
20
+ source .env/bin/activate
21
+ pip install -r requirements.txt
22
  ```
23
+ ## Usage
24
+ For now this is the recommended way to use this model:
25
 
26
  ```
27
  from transformers import BertTokenizer
 
33
  tokenizer = BertTokenizer.from_pretrained("verbit/hebrew_punctuation")
34
  model.eval()
35
 
36
+ text = """חברת ורביט פיתחה מערכת לתמלול המבוססת על בינה מלאכותית וגורם אנושי ושוקדת על תמלול עדויות ניצולי שואה
37
+ את התוצאות אפשר לראות כבר ברשת בהן חלקים מעדותו של טוביה ביילסקי שהיה מפקד גדוד הפרטיזנים היהודים בביילורוסיה"""
38
+
39
  punct_text = get_prediction(
40
  model=model,
41
  text=text,