verbit/hebrew_punctuation


verbit-research committed on Oct 6, 2024

Commit bdc1c8f (verified) · 1 Parent(s): 5af7e8d

Update README.md

Files changed (1)

README.md +9 -8

README.md CHANGED

@@ -11,16 +11,17 @@ base_model:
 ## Introduction
 This model is a fine-tuned version of AlephBERT, designed to restore punctuation in Hebrew spoken language transcripts. It is specifically trained as a post-processing step for Automatic Speech Recognition (ASR) outputs, where punctuation is often missing in raw transcriptions.
-## Usage
-For now this is the recommended way to use this model:
 ```
 git lfs install
 git clone https://huggingface.co/verbit/hebrew_punctuation
 cd hebrew_punctuation
 ```
-Once you are in the folder you could do the following:
 ```
 from transformers import BertTokenizer
@@ -32,9 +33,9 @@ model = BertForPunctuation.from_pretrained("verbit/hebrew_punctuation")
 tokenizer = BertTokenizer.from_pretrained("verbit/hebrew_punctuation")
 model.eval()
-text = ("חברת ורביט פיתחה מערכת לתמלול המבוססת על בינה מלאכותית וגורם אנושי ושוקדת על תמלול עדויות ניצולי שואה את "
-        "התוצאות אפשר לראות כבר ברשת בהן חלקים מעדותו של טוביה ביילסקי שהיה מפקד גדוד הפרטיזנים היהודים "
-        "בביילורוסיה")
 punct_text = get_prediction(
     model=model,
     text=text,

 ## Introduction
 This model is a fine-tuned version of AlephBERT, designed to restore punctuation in Hebrew spoken language transcripts. It is specifically trained as a post-processing step for Automatic Speech Recognition (ASR) outputs, where punctuation is often missing in raw transcriptions.
+## Install
 ```
 git lfs install
 git clone https://huggingface.co/verbit/hebrew_punctuation
 cd hebrew_punctuation
+python -m venv .env
+source .env/bin/activate
+pip install -r requirements.txt
 ```
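The install steps above create and activate a virtual environment before running `pip install -r requirements.txt`. As a minimal sketch (not part of the repository), the following check reports whether the current interpreter is running inside an environment created with `python -m venv`. The helper name `in_virtualenv` is hypothetical, and the check assumes the standard-library `venv` module rather than older third-party tools.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv (hypothetical helper)."""
    # Inside an environment created by `python -m venv`, sys.prefix points at
    # the environment directory, while sys.base_prefix still points at the
    # base Python installation. Outside a venv the two are equal.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "not active"
    print(f"virtual environment is {state} (sys.prefix={sys.prefix})")
```

Running this after `source .env/bin/activate` is one way to confirm that the requirements will be installed into `.env` rather than into the system interpreter.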
+## Usage
+For now this is the recommended way to use this model:
 ```
 from transformers import BertTokenizer
 tokenizer = BertTokenizer.from_pretrained("verbit/hebrew_punctuation")
 model.eval()
+text = """חברת ורביט פיתחה מערכת לתמלול המבוססת על בינה מלאכותית וגורם אנושי ושוקדת על תמלול עדויות ניצולי שואה
+את התוצאות אפשר לראות כבר ברשת בהן חלקים מעדותו של טוביה ביילסקי שהיה מפקד גדוד הפרטיזנים היהודים בביילורוסיה"""
 punct_text = get_prediction(
     model=model,
     text=text,