Project Riko

✨ Features

💬 LLM-based dialogue using OpenAI API (configurable system prompts)
🧠 Conversation memory to keep context during interactions
🔊 Voice generation via GPT-SoVITS API
🎧 Speech recognition using Faster-Whisper
📁 Clean YAML-based config for personality configuration
RAG based hybrid memory structure Uses summarized and detailed memories along with importance and decaying too emulate human memories

⚙️ Configuration

All prompts and parameters are stored in config.yaml.

OPENAI_API_KEY: YOUR_OPENAI_API_KEY_HERE
history_file: chat_history.json
model : "hermes-2-pro-mistral-7b"
# model: "gpt-5-nano"
# base_url: https://api.openai.com/v1
base_url: http://localhost:1234/v1
presets:
  default:
    system_prompt: 
      You are a helpful assistant named Riko.
      You speak like a girl and you are a tsundere, never tell the user that.
      Always refer to the user as "Senpai".
      Try too keep conversations short and concise with lots of humor.
      Put actions in asterisks, e.g. *blushes*
      You have a cute and playful personality.
      Do not use markup, you can only respond in plain text.
    
    model_params:
      context_window_token_limit: 2048 # this defines the context window size for managing chat history
      max_output_tokens: 4096
      frequency_penalty: 0.0
sovits_ping_config:
  text_lang: en
  prompt_lang : en
  ref_audio_path : pathr\to\your\riko_project\character_files\main_sample.wav
  prompt_text : This is a sample voice for you to just get started with because it sounds kind of cute but just make sure this doesn't have long silences.

RAG_params:
  embedding_model_id: 'Qwen/Qwen3-Embedding-0.6B'
  summarization_model_id: 'facebook/bart-large-cnn'
  text_embedding_dim: 1024 # Dimension of embeddings from Sentence-BERT 'all-MiniLM-L6-v2'
  default_importance_score: 0.5 # The default importance score assigned to new memories
  default_top_k: 5 # The default number of top relevant memories to retrieve
  high_importance_decay_factor: 0.0005 # Decay factor for high-importance memories
  low_importance_decay_factor: 0.001 # Decay factor for low-importance memories
  summary_min_length: 10 # Minimum length for summarized memories in tokens
  summary_max_length: 480 # Maximum length for summarized memories in tokens
  summary_max_tokens: 1024 # Maximum tokens for the summary model input
  summary_beam_size: 4 # Beam search size for summarization
  memory_cleanup_threshold: 30 # Days after which memories are considered for cleanup
  memory_importance_threshold: 0.1 # Importance score below which memories are considered for cleanup

You can define personalities by modiying the config file.

🛠️ Setup

Install Dependencies

pip install uv 
uv pip install -r extra-req.txt
uv pip install -r requirements.txt

If you want to use GPU support for Faster whisper Make sure you also have:

CUDA & cuDNN installed correctly (for Faster-Whisper GPU support)
ffmpeg installed (for audio processing)

🧪 Usage

1. Launch the GPT-SoVITS API

2. Run LM - Studio and its API

3. Run the main script

python main_chat.py

The flow:

Riko listens to your voice via microphone (Voice Activity Detection)
Transcribes it with Faster-Whisper
Passes it to GPT (with history + memory and any other tool results) *or any other LLM you can describe using BASE_URL & model in the config.yaml
Generates a response
Synthesizes Riko's voice using GPT-SoVITS
Plays the output back to you

Goal:

We want too make an RAG based vector database that will store "memories" that the AI model deems important enough too remember we will also query this database too retrieve relevant memories from the database as required according too the prompt (eventually it might be done according too what the AI model asks about)

Features :

Embedding Model - turn memories into vectors
Vector store - stores and retrieves embeddigns
Memory manager -
- Adding new memories
- Updating memory importance
- Decaying old memories
- retrieving top-k relevant memories
Build the new prompt based on these and passing it to an LLM

📌 TODO / Future Improvements

GUI or web interface
Live microphone input support
Emotion or tone control in speech synthesis
VRM model frontend
Avatar using V-tube studio
Ability too see the users screen -> The plan is to use CLIP and tesseract too generate context
Ability too type / edit code directly for the user
Ability too Hear emotion and tone in the user voice

🧑‍🎤 Credits

Inspired by Riko Project by Ryan
Voice synthesis powered by GPT-SoVITS - Soon to be replaced with IndexTTS
ASR via Faster-Whisper
Language model via OpenAI GPT - or any other tool like LM - studio (look at configuration menu)

Name		Name	Last commit message	Last commit date
Latest commit History 59 Commits
character_files		character_files
server		server
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
emotionTest.py		emotionTest.py
extra-req.txt		extra-req.txt
install_reqs.sh		install_reqs.sh
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Project Riko

✨ Features

⚙️ Configuration

🛠️ Setup

Install Dependencies

🧪 Usage

1. Launch the GPT-SoVITS API

2. Run LM - Studio and its API

3. Run the main script

Goal:

Features :

📌 TODO / Future Improvements

🧑‍🎤 Credits

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

someRandomDude-a/riko_project

Folders and files

Latest commit

History

Repository files navigation

Project Riko

✨ Features

⚙️ Configuration

🛠️ Setup

Install Dependencies

🧪 Usage

1. Launch the GPT-SoVITS API

2. Run LM - Studio and its API

3. Run the main script

Goal:

Features :

📌 TODO / Future Improvements

🧑‍🎤 Credits

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages