Install Whisper (https://github.com/openai/whisper) by running
pip install git+https://github.com/openai/whisper.git
and add its install location to the PATH environment variable.
Also download FFmpeg from https://github.com/BtbN/FFmpeg-Builds/releases (file: ffmpeg-master-latest-win64-gpl-shared.zip). Extract it and add its bin directory to the PATH environment variable.
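To confirm that both tools are reachable from the command line before starting the application, a quick check along these lines may help. This is only a minimal sketch: the helper name find_tools is hypothetical, and it assumes the executables are named whisper and ffmpeg on your system.

```python
import shutil


def find_tools(names=("whisper", "ffmpeg")):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_tools().items():
        # A missing tool usually means its directory was not added to PATH.
        status = path if path else "NOT FOUND - check your PATH"
        print(f"{tool}: {status}")
```

If either tool is reported as not found, reopen the terminal after editing PATH, since already-open terminals keep the old value.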
You are now set up to run the application.
The application runs on localhost. Run the file
app.py
in root/UI and go to http://127.0.0.1:5000. Startup may take a while, because the model is trained on every start.
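Because startup can take a while, it may be convenient to wait until the server accepts connections before opening the browser. The following is a sketch under assumptions, not part of the application: the helper wait_for_port and its default timeout and interval are hypothetical, and only the host and port come from the address above.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=5000, timeout=120.0, interval=2.0):
    """Return True once host:port accepts a TCP connection, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_port()
    print("Server is up" if ready else "Server did not start in time")
```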
At the start you will see a pop-up with our policy: how we define extremist views and bad language.
After you close the pop-up, you can choose a file to upload. When you are ready, click the 'Upload' button to start the analysis. It may run for a little while, because it is meant to be thorough. When it finishes, it lists all the timestamps at which inappropriate language was used and labels them accordingly. The timestamps are in seconds.
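To illustrate where such timestamps can come from: Whisper's transcription result includes a list of segments, each with "start" and "end" times in seconds and a "text" field. The sketch below matches those segments against a word list. It is an assumption-laden illustration, not the application's actual logic: the function flag_segments, the label text, and the sample segments are hypothetical, and only single words are matched.

```python
import re


def flag_segments(segments, bad_words, label="inappropriate language"):
    """Return (start, end, label) for every segment that contains a listed word."""
    bad = {w.lower() for w in bad_words}
    flagged = []
    for seg in segments:
        # Split the segment text into lowercase words (single-word matching only).
        words = re.findall(r"[a-z']+", seg["text"].lower())
        if any(w in bad for w in words):
            flagged.append((seg["start"], seg["end"], label))
    return flagged


# Hand-written sample in the shape of Whisper segments (times in seconds).
segments = [
    {"start": 0.0, "end": 3.2, "text": " Welcome to the show."},
    {"start": 3.2, "end": 6.8, "text": " That was a darn shame."},
]

for start, end, label in flag_segments(segments, {"darn"}):
    print(f"{start:.1f}s - {end:.1f}s: {label}")
```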
If you get "File is empty" feedback, the uploaded file was empty or had no audio.
If you get "Short file with problems!" feedback, the file contained only a short phrase, and that phrase contained extreme views or inappropriate language.
If you get "No problems found" feedback, it means that the file doesn't contain any extreme views or inappropriate language.
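The three feedback messages above can be summarized as a small decision function. This is a sketch of the described behavior, not the application's code: the function choose_feedback and the short_word_limit cutoff are hypothetical, since the source does not say how "short" is defined.

```python
def choose_feedback(transcript, flagged, short_word_limit=5):
    """Pick one of the feedback messages described above.

    transcript: the transcribed text; flagged: list of flagged timestamps.
    short_word_limit is an assumed cutoff for what counts as a short phrase.
    Returns None for a longer file with problems, where the application
    lists the flagged timestamps instead of a single message.
    """
    words = transcript.split()
    if not words:
        # Nothing was transcribed: empty file or no audio.
        return "File is empty"
    if not flagged:
        return "No problems found"
    if len(words) <= short_word_limit:
        return "Short file with problems!"
    return None
```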
Note: The system errs on the side of caution, meaning it prefers to flag text that looks suspicious rather than let it pass.
In root/app/speech_convert.py we used:
- whisper (https://github.com/openai/whisper) - MIT License
- FFmpeg (https://ffmpeg.org/) - LGPL/GPL License
In root/app/find_extremes.py, apart from some Python libraries, we used:
In root/data/database we used two databases:
- en.txt from https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/blob/master/en - Creative Commons Attribution 4.0 International License
- extremist_vs_appropriate_dataset.csv, created by OpenAI ChatGPT
- YouTube video by About Munich, "OKTOBERFEST: All the major beer tents explained - and tips on how to behave", posted on 24.09.2024 - Creative Commons License
- YouTube video by The Humanist Report, "Anti-Gay Republicans Can’t Stop Getting Caught Up in the GAYEST Scandals Imaginable", posted on 30.09.2025 - Creative Commons License
- YouTube video by The Mercurial Cyclist, "I Shortened My Cranks Like Tadej Pogačar and Wout Van Aert, Here's What Happened....", posted on 27.07.2025 - Creative Commons License
- YouTube video by Reagan Library, "President Reagan's Remarks Announcing the Drug Abuse Initiative on August 4, 1986", posted on 20.08.2016 - Creative Commons License
- YouTube video by mhmdbbn, "margot robbie and greta gerwig on that barbie shoe shot", posted on 30.06.2023 - Creative Commons License
The directory transcripts contains transcriptions of audio and video files. The directory inputs contains uploaded media.
In UI/static we used:
- tuDelft.jpg picture - Creative Commons License
In UI/templates we used:
- Flask - BSD-3-Clause License
Moreover, OpenAI ChatGPT was used for inspiration.