Developing speech-to-text transcription in Python
Briefly

The script transcribes audio to text using the Whisper model. It checks whether CUDA is available so it can run on a GPU, and falls back to the CPU otherwise. After loading the model (the 'small' variant), it transcribes the specified audio file, in Greek by default. The program expects the audio file path as a command-line argument and prints the transcribed text. As originally posted, however, the code had syntax errors that must be corrected before it will run, specifically in its use of quotes and in the structure of the import statements.
import whisper
import torch
import sys


def transcribe_audio(audio_path, language='el'):
    """
    Transcribes the given audio file to text in the specified language using Whisper.

    Args:
        audio_path (str): Path to the audio file.
        language (str): Language code for transcription (default is 'el' for Greek).

    Returns:
        str: The transcribed text.
    """
    # Use the GPU when CUDA is available, otherwise fall back to the CPU.
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    print(f'Using device: {device}')

    model = whisper.load_model('small', device=device)

    print(f'Transcribing {audio_path} in language: {language}...')
    result = model.transcribe(audio_path, language=language, task='transcribe')
    return result['text']


if __name__ == '__main__':
    # Require the audio file path as the first command-line argument.
    if len(sys.argv) < 2:
        print('Usage: python speech_to_text.py <audio_file_path>')
        sys.exit(1)

    audio_file = sys.argv[1]
    greek_text = transcribe_audio(audio_file, language='el')
    print('Transcribed Greek Text:')
    print(greek_text)
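The script reads the audio path straight from sys.argv and hard-codes Greek in the main block. One possible variation, offered only as a sketch, is to parse the command line with the standard-library argparse module so the language can be chosen at run time and a missing file is reported before the model is loaded. The parse_args helper and the --language flag below are hypothetical names that are not part of the posted script; the 'el' default simply mirrors it.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the command line: one audio path plus an optional language code.

    Hypothetical helper, not part of the posted script.
    """
    parser = argparse.ArgumentParser(
        prog="speech_to_text.py",
        description="Transcribe an audio file to text.",
    )
    parser.add_argument("audio_path", type=Path, help="path to the audio file")
    # Assumed flag for illustration; 'el' (Greek) mirrors the script's default.
    parser.add_argument(
        "--language",
        default="el",
        help="language code for transcription, e.g. 'el' or 'en'",
    )
    args = parser.parse_args(argv)

    # Fail early, before any model is loaded, if the file does not exist.
    if not args.audio_path.is_file():
        # parser.error prints the usage message and exits with status 2.
        parser.error(f"audio file not found: {args.audio_path}")
    return args
```

In the main block this could stand in for the manual sys.argv check, for example by calling args = parse_args() and then transcribe_audio(str(args.audio_path), language=args.language). Passing argv=None makes argparse read sys.argv itself, while passing an explicit list makes the helper easy to exercise without a real command line.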
Read at SitePoint Forums | Web Development & Design Community