# If you don't use pipenv, uncomment the following:
# from dotenv import load_dotenv
# load_dotenv()

# Step 1: Set up the audio recorder
# Requires ffmpeg, portaudio, and pyaudio
import logging
import speech_recognition as sr
from pydub import AudioSegment
from io import BytesIO

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

def record_audio(file_path, timeout=5, phrase_time_limit=10):
    """
    Simplified function to record audio from the microphone and save it as an MP3 file.

    Args:
    file_path (str): Path to save the recorded audio file.
    timeout (int): Maximum time to wait for a phrase to start (in seconds).
        Currently unused, because recognizer.record() captures a fixed duration.
    phrase_time_limit (int): Duration of the recording (in seconds).
    """
    recognizer = sr.Recognizer()

    try:
        with sr.Microphone() as source:
            logging.info("Adjusting for ambient noise...")
            recognizer.adjust_for_ambient_noise(source, duration=1)
            logging.info("Start speaking now...")
            
            # Record the audio
            logging.info(f"Recording for {phrase_time_limit} seconds...")
            audio_data = recognizer.record(source, duration=phrase_time_limit)
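            # Alternative that stops when the speaker pauses:
            # audio_data = recognizer.listen(source, timeout=timeout, phrase_time_limit=phrase_time_limit)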
            logging.info("Recording complete.")
            
            # Convert the recorded audio to an MP3 file
            wav_data = audio_data.get_wav_data()
            audio_segment = AudioSegment.from_wav(BytesIO(wav_data))
            audio_segment.export(file_path, format="mp3", bitrate="128k")
            
            logging.info(f"Audio saved to {file_path}")

    except Exception as e:
        logging.error(f"An error occurred: {e}")

audio_filepath="patient_message.mp3"

# Step 2: Set up the speech-to-text (STT) model for transcription
def transcription(stt_model, audio_filepath, GROQ_API_KEY):
    import os
    from groq import Groq
    from dotenv import load_dotenv
    load_dotenv()

    # Respect the caller's arguments; fall back to the environment variable
    # and the default model only when a value is not supplied.
    GROQ_API_KEY = GROQ_API_KEY or os.environ.get("GROQ_API_KEY")
    stt_model = stt_model or "whisper-large-v3-turbo"

    if not GROQ_API_KEY:
        raise ValueError("GROQ_API_KEY is not set! Add it to your environment or .env file.")
    client = Groq(api_key=GROQ_API_KEY)

    # Open the audio file in a context manager so the handle is always closed.
    with open(audio_filepath, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=stt_model,
            file=audio_file,
            language="en"
        )

    return result.text