This project is divided into two parts: one that runs on the user's machine and another that can run either in the cloud or locally. The user can send audio from their microphone to the remote machine, which transcribes the audio and returns the transcribed text.
To install the project, follow these steps:
- Create a virtual environment:

```bash
python -m venv venv
```

- Activate the virtual environment:

```bash
.\venv\Scripts\activate
```

or on Linux/Mac:

```bash
source venv/bin/activate
```

- Install the required packages:

```bash
pip install -r requirements.txt
```

Note: Installing dependencies without a virtual environment is not recommended.
To ensure secure communication between the client and server, the project uses a token-based authentication system. The token is stored in the .env file and is validated on both the server and client sides.
A Python script (generate_token.py) is available to generate and update the token directly in the .env file. Run the following command:
```bash
python generate_token.py
```

This will generate a new token and automatically update it in the .env file.
The server and client validate the token during the connection. Ensure that the token in the .env file is the same on both sides.
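As an illustration of this workflow, the sketch below generates a token with Python's `secrets` module, stores it under a `.env` key, and compares tokens in constant time on the receiving side. The `WHISPER_TOKEN` key name and the helper names are assumptions for this example, not the project's actual implementation.

```python
# token_sketch.py — a minimal sketch of token generation and validation.
# The key name and helpers below are illustrative, not the project's real code.
import os
import secrets

ENV_FILE = ".env"
TOKEN_KEY = "WHISPER_TOKEN"  # assumed key name for this example


def generate_and_store_token() -> str:
    """Generate a random token and write/replace it in the .env file."""
    token = secrets.token_urlsafe(32)
    lines = []
    if os.path.exists(ENV_FILE):
        with open(ENV_FILE) as f:
            # keep every existing line except a previous token entry
            lines = [line for line in f if not line.startswith(f"{TOKEN_KEY}=")]
    lines.append(f"{TOKEN_KEY}={token}\n")
    with open(ENV_FILE, "w") as f:
        f.writelines(lines)
    return token


def load_token() -> str:
    """Read the token back from the .env file."""
    with open(ENV_FILE) as f:
        for line in f:
            if line.startswith(f"{TOKEN_KEY}="):
                return line.split("=", 1)[1].strip()
    raise RuntimeError(f"{TOKEN_KEY} not found in {ENV_FILE}")


def is_authorized(received_token: str) -> bool:
    """Compare the token sent by the peer with the local one in constant time."""
    return secrets.compare_digest(received_token, load_token())
```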
CPU support is native and requires no additional installation. However, if you want to accelerate processing using CUDA cores, you need to install the compatible version of PyTorch with CUDA. You can find the appropriate version for your system on the PyTorch Get Started page.
Windows:
```bash
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```

Linux:

```bash
pip3 install torch torchvision torchaudio
```

Before installing PyTorch, you can verify your current CUDA version using the provided check_cuda_version.py script. Run the following command:

```bash
python check_cuda_version.py
```

This will help you determine the correct CUDA version to install from the PyTorch website.
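If you want to see what such a check involves, the snippet below is a rough, stand-alone equivalent (not the actual contents of check_cuda_version.py): it reads the driver-supported CUDA version from `nvidia-smi` and, if PyTorch is already installed, reports the CUDA version it was built against.

```python
# cuda_check_sketch.py — an illustrative CUDA version check,
# not the project's check_cuda_version.py.
import shutil
import subprocess

# Query the driver-supported CUDA version via nvidia-smi, if available.
if shutil.which("nvidia-smi"):
    output = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
    for line in output.splitlines():
        if "CUDA Version" in line:
            print(line.strip())
else:
    print("nvidia-smi not found: no NVIDIA driver detected, CPU-only mode.")

# If PyTorch is already installed, report the CUDA version of its build.
try:
    import torch
    print("PyTorch CUDA build:", torch.version.cuda,
          "| CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch is not installed yet.")
```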
We are developing an easy setup system to simplify project configuration for end users. This system will automate dependency installation, token configuration, and environment setup. Stay tuned for updates!
To start the server, run:
```bash
python server.py
```

To start the client, run:

```bash
python client.py
```

To run the client on Google Colab, use the Whisper.ipynb notebook and follow the instructions inside it.
The system supports different levels of redundancy to handle data transmission and processing based on your requirements.
- **Level 1: Single Client**
  - Description: Data is sent to a single client.
  - Use Case: Suitable for environments where only one client is available or desired.
  - Configuration: Set `redundancy_level` to `1` in the configuration file.

- **Level 2: Full Redundancy**
  - Description: Data is sent to all connected clients for redundancy.
  - Use Case: Ideal for ensuring data integrity by sending data to multiple clients.
  - Configuration: Set `redundancy_level` to `2` in the configuration file.

- **Level 3: Redundancy with Load Balancing**
  - Description: Data is sent to clients in a round-robin fashion, balancing the load across multiple clients.
  - Use Case: Recommended for distributed systems where load balancing is necessary.
  - Configuration: Set `redundancy_level` to `3` in the configuration file.
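For illustration only, the dispatch logic behind these three levels might look roughly like the sketch below; the `clients`, `send`, and `make_dispatcher` names are assumptions for this example rather than the server's actual code.

```python
# redundancy_sketch.py — an illustrative sketch of how the three redundancy
# levels could dispatch audio chunks to connected clients.
# The names used here are assumptions for this example.
from itertools import cycle


def make_dispatcher(clients, redundancy_level):
    """Return a function that sends one audio chunk per the chosen level."""
    if redundancy_level == 1:
        # Level 1: send every chunk to a single client.
        def dispatch(chunk):
            clients[0].send(chunk)
    elif redundancy_level == 2:
        # Level 2: full redundancy — every connected client gets every chunk.
        def dispatch(chunk):
            for client in clients:
                client.send(chunk)
    elif redundancy_level == 3:
        # Level 3: round-robin load balancing across clients.
        rotation = cycle(clients)
        def dispatch(chunk):
            next(rotation).send(chunk)
    else:
        raise ValueError(f"Unknown redundancy_level: {redundancy_level}")
    return dispatch
```

In short, level 2 trades extra bandwidth for redundancy, while level 3 keeps each client's processing load roughly even.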
This project uses Whisper from OpenAI, licensed under the MIT License.
This project is licensed under the MIT License.
Developed by Leandro Gonçalves. For more information:
- Email: contato@znix.com.br
- GitHub: github.com/lunikdev
This project is constantly evolving. Contributions and suggestions are welcome! 🚀