-
Notifications
You must be signed in to change notification settings - Fork 59
Description
Hello,
I have been playing around a bit with creating a python script that allows recording of voice via SpeechNote and sending the inputs as text directly to an LLM running locally in LMStudio. As a second step the generated answer from LMStudio are sent back and read by SpeechNote. Its quite entertaining as I can now directly talk with for example "chatgpt-oss-20b" loaded on a secondary GPU in my system while gaming with the primary GPU something repetitive like World of Warcraft.
I am not a programmer but with help of Chatgpt and 2 days of trial error learning Python I made it work. In essence I use the "--action=start-listening-clipboard" command and read the new contents of the clipboard into a variable to send it to LMStudio. The answer is send back as text that is read with SpeechNote via "--action=start-reading-text".
However, there is an issue I cannot fix which comes from a security feature that is part of Wayland. Every time I read the clipboard in the code the focus of the active window switches to the console interrupting any gameplay. As far as I have read by now Wayland does not allow an app to read clipboard content when not focused / running in the background. I suspect this is also a reason why SpeechNote offers only to write in active windows via "--action=start-listening-active-window" and not in the background. You also cannot prohibit stealing of focus via the window manager since this interrupts the program execution as a result.
Would it be possible to directly output text (string) via the CLI rather than using the clipboard or active window as output in some way? It would also be helpful if SpeechNote would in some way return some "finished listening" and "finished reading" signal so you don't have to check the clipboard all the time if the contents have changed. I could not find another way to send voice inputs with varying length to LMstudio without regularly checking if the clipboard content has changed.
Again I am not a programmer and doing this just for fun. So, if this is an unreasonable request and/or not possible I apologize in advance. Thanks for providing this program nevertheless :)