Add transcript save/reuse with automatic detection
- Add --save-transcript flag to save transcripts as JSON
- Add --input-transcript flag to reuse existing transcripts
- Add --force-retranscribe flag to ignore cached transcripts
- Implement automatic transcript detection and reuse
- Include test audio file for real-world validation
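
The reuse rules listed above can be sketched as a small decision function. This is only an illustration of the behaviour the commit message and help text describe, not the code from this commit: the function name `resolve_transcript`, the rule that the cached transcript shares the output file's name with a `.json` suffix, and the choice to let an explicit `--input-transcript` win over `--force-retranscribe` are all assumptions.

```python
from pathlib import Path
from typing import Optional


def resolve_transcript(
    output_audio: str,
    input_transcript: Optional[str] = None,
    force_retranscribe: bool = False,
) -> Optional[Path]:
    """Return the transcript file to reuse, or None if speech recognition should run.

    Hypothetical helper; names and precedence are assumptions, not monkeyplug's API.
    """
    # An explicitly supplied transcript (--input-transcript) is used as given.
    # Assumption: it takes precedence even when --force-retranscribe is set.
    if input_transcript is not None:
        return Path(input_transcript)

    # --force-retranscribe disables automatic reuse of a cached transcript.
    if force_retranscribe:
        return None

    # Automatic detection: look for a transcript next to the output audio file.
    # Assumption: the cached file is "<output stem>.json".
    candidate = Path(output_audio).with_suffix(".json")
    return candidate if candidate.is_file() else None
```

A caller would run Whisper or Vosk only when this returns `None`, and otherwise load the JSON at the returned path.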
README.md: 39 additions & 3 deletions
@@ -5,9 +5,10 @@
 **monkeyplug** is a little script to censor profanity in audio files (intended for podcasts, but YMMV) in a few simple steps:
 
 1. The user provides a local audio file (or a URL pointing to an audio file which is downloaded)
-2. Either [Whisper](https://openai.com/research/whisper) ([GitHub](https://github.com/openai/whisper)) or the [Vosk](https://alphacephei.com/vosk/)-[API](https://github.com/alphacep/vosk-api) is used to recognize speech in the audio file
+2. Either [Whisper](https://openai.com/research/whisper) ([GitHub](https://github.com/openai/whisper)) or the [Vosk](https://alphacephei.com/vosk/)-[API](https://github.com/alphacep/vosk-api) is used to recognize speech in the audio file (or a pre-generated transcript can be loaded)
 3. Each recognized word is checked against a [list](./src/monkeyplug/swears.txt) of profanity or other words you'd like muted
 4. [`ffmpeg`](https://www.ffmpeg.org/) is used to create a cleaned audio file, muting or "bleeping" the objectional words
+5. Optionally, the transcript can be saved for reuse in future processing runs
 
 You can then use your favorite media player to play the cleaned audio file.
 
@@ -62,10 +63,14 @@ options:
                         Input file (or URL)
   -o <string>, --output <string>
                         Output file
-  --output-json <string>
-                        Output file to store transcript JSON
   -w <profanity file>, --swears <profanity file>
                         text file containing profanity (default: "swears.txt")
+  --output-json <string>
+                        Output file to store transcript JSON
+  --input-transcript <string>
+                        Load existing transcript JSON instead of performing speech recognition
+  --save-transcript     Automatically save transcript JSON alongside output audio file
+  --force-retranscribe  Force new transcription even if transcript file exists (overrides automatic reuse)
   -a <str>, --audio-params <str>
                         Audio parameters for ffmpeg (default depends on output audio codec)
   -c <int>, --channels <int>
@@ -137,6 +142,37 @@ Alternately, a [Dockerfile](./docker/Dockerfile) is provided to allow you to run
 
 then run [`monkeyplug-docker.sh`](./docker/monkeyplug-docker.sh) inside the directory where your audio files are located.
 
+## Transcript Workflow
+
+**monkeyplug** supports saving and reusing transcripts to improve workflow efficiency: