Using Whisper transcription in FAB Subtitler

FAB Subtitler can use Whisper audio transcription to recognize spoken text and create subtitles. Currently, Whisper has to be installed manually and started from the command line. It writes a JSON transcript file, which can then be opened in FAB Subtitler. FAB Subtitler creates the subtitles from the texts and timestamps that Whisper has written into the JSON file.
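To illustrate what such a transcript contains, the following Python sketch reads a JSON file and lists the text and timestamps of each segment. It is only an illustration and does not describe how FAB Subtitler processes the file internally. The field names "segments", "start", "end" and "text" are those written by OpenAI Whisper; treat them as an assumption for other variants. The function names are hypothetical.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm (SRT-style)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_cues(transcript: dict) -> list:
    """Return one (start, end, text) tuple per transcript segment."""
    cues = []
    for segment in transcript.get("segments", []):
        cues.append((
            format_timestamp(segment["start"]),
            format_timestamp(segment["end"]),
            segment["text"].strip(),  # Whisper usually prefixes text with a space
        ))
    return cues


def load_cues(path: str) -> list:
    """Read a Whisper JSON transcript file and return its cues."""
    with open(path, encoding="utf-8") as handle:
        return segments_to_cues(json.load(handle))
```

Calling load_cues with the path of a transcript file returns a list that can be printed to check the timestamps before the file is opened in FAB Subtitler.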

Installation and usage of Whisper on Windows 11

Prerequisites:

  • A PC with Windows 11 and an NVIDIA RTX 3060 (or better) graphics card is required.
  • Make sure that the latest NVIDIA graphics driver is installed.
  • Make sure that NVIDIA CUDA version 11 is installed. The cuBLAS and cuDNN libraries are available from https://developer.nvidia.com/cublas and https://developer.nvidia.com/cudnn
  • Start regedit.exe and set LongPathsEnabled to 1 in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
  • Restart Windows

Installation of Git, Python, FFmpeg and the required libraries:

  • Open Command Prompt as Administrator and type:
winget install Git.Git
winget install -e --id Python.Python.3.11 --scope machine
winget install ffmpeg --scope machine
  • Close the Command Prompt window so that the updated PATH takes effect in the next one.
  • Open Command Prompt as Administrator and type:
python -m pip install --upgrade pip
pip3 install matplotlib onnxruntime torchaudio transformers setuptools-rust
pip3 uninstall torch
pip3 cache purge
pip3 install torch -f https://download.pytorch.org/whl/torch_stable.html
  • Close the Command Prompt window.
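After these steps, a short Python script can confirm that the installed tools are found on the PATH and that PyTorch can see the graphics card. This is a minimal sketch, not part of the official installation. The function names find_tools and cuda_available are hypothetical; shutil.which and torch.cuda.is_available are the standard calls.

```python
import importlib.util
import shutil


def find_tools(names=("git", "python", "ffmpeg")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def cuda_available():
    """Return True or False from PyTorch, or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported here so the script also runs without torch
    return torch.cuda.is_available()


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
    print(f"CUDA available to PyTorch: {cuda_available()}")
```

If a tool is reported as not found, open a new Command Prompt window and try again. If CUDA is reported as unavailable, the reinstallation of torch described under Troubleshooting may help.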

Several variants of Whisper are available for installation:

Note that all commands listed below (pip3, …) must be executed in a Command Prompt window.

Install faster-whisper (through the whisper-ctranslate2 command line client):

  • pip3 install git+https://github.com/Softcatala/whisper-ctranslate2.git
  • To upgrade to the latest version later: pip3 install --upgrade --no-deps --force-reinstall git+https://github.com/Softcatala/whisper-ctranslate2.git
  • Transcribe: whisper-ctranslate2 "\\srv2\Data\Video\Transcription\DE\ZDF\20230718-MarkusLanz.mp4" --language German --model large-v2 --verbose True --word_timestamps True --output_format json --pretty_json True --vad_filter True --output_dir c:\0
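When several files have to be transcribed, the command above can be started from a small Python script. The following sketch assembles the same options as the Transcribe example and runs the command once per file. The function names, the folder arguments and the "*.mp4" pattern are hypothetical; the options are exactly those shown above.

```python
import subprocess
from pathlib import Path


def build_command(media_file, output_dir, language="German", model="large-v2"):
    """Build the whisper-ctranslate2 command line as a list of arguments."""
    return [
        "whisper-ctranslate2", str(media_file),
        "--language", language,
        "--model", model,
        "--verbose", "True",
        "--word_timestamps", "True",
        "--output_format", "json",
        "--pretty_json", "True",
        "--vad_filter", "True",
        "--output_dir", str(output_dir),
    ]


def transcribe_folder(folder, output_dir, pattern="*.mp4"):
    """Transcribe every matching file in a folder, one after the other."""
    for media_file in sorted(Path(folder).glob(pattern)):
        # check=True stops the batch at the first file that fails
        subprocess.run(build_command(media_file, output_dir), check=True)
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces.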

Install whisper-timestamped:

  • pip3 install git+https://github.com/linto-ai/whisper-timestamped
  • To upgrade to the latest version later: pip3 install --upgrade --no-deps --force-reinstall git+https://github.com/linto-ai/whisper-timestamped
  • Transcribe: whisper_timestamped "\\srv2\Data\Video\Transcription\DE\ZDF\20230718-MarkusLanz.mp4" --language German --model large --verbose True --accurate --compute_confidence True --punctuations_with_words True --vad True --output_format json --output_dir c:\0
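With --compute_confidence True, the transcript also carries a confidence value per word. The following Python sketch lists words whose confidence falls below a threshold, which can help to find passages worth checking by hand. It assumes that each segment has a "words" list whose entries contain "text", "start", "end" and "confidence"; verify this against your own JSON file. The function name and the threshold of 0.5 are hypothetical.

```python
def low_confidence_words(transcript, threshold=0.5):
    """Return (start, end, text, confidence) for words below the threshold."""
    flagged = []
    for segment in transcript.get("segments", []):
        for word in segment.get("words", []):
            confidence = word.get("confidence")
            # Skip entries without a confidence value instead of failing
            if confidence is not None and confidence < threshold:
                flagged.append(
                    (word["start"], word["end"], word["text"], confidence)
                )
    return flagged
```

The transcript argument is the dictionary obtained by loading the JSON file with json.load.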

Install OpenAI whisper:

  • pip3 install git+https://github.com/openai/whisper.git
  • To upgrade to the latest version later: pip3 install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
  • If problems occur after the update, run the installation again: pip3 install git+https://github.com/openai/whisper.git
  • Transcribe: whisper "\\srv2\Data\Video\Transcription\DE\ZDF\20230718-MarkusLanz.mp4" --language German --model large-v3 --verbose True --word_timestamps True --output_format json --output_dir c:\0
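With --word_timestamps True, each segment in the JSON file contains a "words" list with one entry per word ("word", "start", "end"). The following Python sketch shows how such word entries could be grouped into short lines with their own start and end times. It is an illustration only and not the method used by FAB Subtitler; the function name and the limit of 37 characters are hypothetical.

```python
def group_words(words, max_chars=37):
    """Group word entries into (start, end, text) cues of limited length."""
    cues = []    # finished cues
    texts = []   # words of the cue currently being built
    start = end = None
    for word in words:
        text = word["word"].strip()
        # Close the current cue if adding this word would exceed the limit
        if texts and len(" ".join(texts + [text])) > max_chars:
            cues.append((start, end, " ".join(texts)))
            texts = []
        if not texts:
            start = word["start"]
        texts.append(text)
        end = word["end"]
    if texts:
        cues.append((start, end, " ".join(texts)))
    return cues
```

A single word that is longer than the limit is kept as a cue of its own rather than being split.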

Troubleshooting

faster-whisper runs on most computers without problems and is considerably faster than the original Whisper because it uses the optimized CTranslate2 inference engine. If problems occur, please consider the following:

  • Whisper needs memory (VRAM) on the graphics card. 12 GB is necessary for everything to work correctly. faster-whisper can run the large model with less memory. If error messages appear, try a smaller model (--model medium or --model small).
  • Sometimes it may be necessary to reinstall some packages required by whisper to avoid error messages. You can do that by using the --force-reinstall option with pip3:
pip3 install --force-reinstall matplotlib onnxruntime torchaudio transformers setuptools-rust
pip3 uninstall torch
pip3 cache purge
pip3 install --force-reinstall torch -f https://download.pytorch.org/whl/torch_stable.html
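The advice to try a smaller model can also be automated. The following Python sketch starts whisper-ctranslate2 with the large-v2 model first and falls back to medium and small if the command fails. It assumes that the command ends with a non-zero exit code when an error occurs (for example, when the graphics card runs out of memory). The function name and the run parameter are hypothetical; the options are taken from the Transcribe example above.

```python
import subprocess


def transcribe_with_fallback(media_file, output_dir,
                             models=("large-v2", "medium", "small"),
                             run=subprocess.run):
    """Try each model in turn; return the first one that succeeds, else None."""
    for model in models:
        command = [
            "whisper-ctranslate2", str(media_file),
            "--model", model,
            "--output_format", "json",
            "--output_dir", str(output_dir),
        ]
        result = run(command)
        if result.returncode == 0:  # assumption: 0 means success
            return model
    return None
```

The run parameter only exists so that the function can be tried out without starting a real transcription; in normal use it can be left at its default.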

This page was last updated on 2023-11-24