Add Gemini and OpenAI chat providers #40

Open
xPoleStarx wants to merge 1 commit into farzaa:main from xPoleStarx:feature/add-gemini-openai-chat

@xPoleStarx

Summary

This PR adds first-class OpenAI and Gemini chat support alongside the existing Claude integration.

The main goal was to keep the macOS app's voice flow unchanged while making the chat layer provider-agnostic. The app can now route multimodal screenshot + transcript requests through the Cloudflare Worker to Claude, OpenAI, or Gemini using a shared request/response shape.
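
To make the "shared request/response shape" concrete, here is a minimal TypeScript sketch of what such a request and the model-to-provider routing might look like. The names `ChatRequest`, `providerForModel`, and all field names are hypothetical and are not taken from this PR's diff; the prefix-based routing is likewise an assumption about how the four supported model IDs could be mapped.

```typescript
// Hypothetical types: the PR does not show its actual request shape.
type Provider = "anthropic" | "openai" | "gemini";

interface ChatRequest {
  model: string;               // e.g. "claude-sonnet-4-6", "gpt-5.4", "gemini-2.5-flash"
  transcript: string;          // the user's transcribed speech
  screenshotBase64: string;    // base64-encoded screenshot bytes
  screenshotMediaType: string; // e.g. "image/jpeg"
}

// Assumed routing rule: choose the upstream provider from the model ID prefix.
function providerForModel(model: string): Provider {
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("gemini-")) return "gemini";
  throw new Error(`Unsupported model: ${model}`);
}

// Usage: one request shape, regardless of which provider ends up serving it.
const exampleRequest: ChatRequest = {
  model: "gpt-5.4",
  transcript: "What does this button do?",
  screenshotBase64: "",
  screenshotMediaType: "image/jpeg",
};
const exampleProvider = providerForModel(exampleRequest.model);
```

With a rule like this, the client only sends a model ID, and any provider-specific branching stays inside the worker.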

What Changed

  • Added a provider-agnostic chat client on the macOS side
  • Expanded the model picker to support:
    • claude-sonnet-4-6
    • claude-opus-4-6
    • gpt-5.4
    • gemini-2.5-flash
  • Updated CompanionManager to use a model-agnostic chat pipeline instead of a Claude-specific one
  • Reworked the Cloudflare Worker /chat route to support:
    • Anthropic Messages API
    • OpenAI Responses API
    • Gemini generateContent
  • Normalized all provider responses into the SSE shape already expected by the macOS client
  • Updated the standalone OpenAI helper to use the current Responses API
  • Updated docs for the new worker secrets and multi-provider chat architecture
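
The response normalization described above could be sketched roughly as follows. This is an illustration, not the PR's code: it assumes the macOS client expects Anthropic-style `content_block_delta` events (the PR only says "the SSE shape already expected"), and the helper names `extractTextDelta` and `toSseFrame` are hypothetical. The payload fields read here follow each provider's documented response format: Anthropic `text_delta` events, OpenAI Responses `response.output_text.delta` events, and Gemini `candidates[].content.parts[].text`.

```typescript
type Provider = "anthropic" | "openai" | "gemini";

// Pull the text fragment (if any) out of one parsed provider payload.
// Returns "" for payloads that carry no text (e.g. start/stop events).
function extractTextDelta(provider: Provider, payload: any): string {
  switch (provider) {
    case "anthropic":
      return payload?.type === "content_block_delta" &&
        payload.delta?.type === "text_delta"
        ? payload.delta.text ?? ""
        : "";
    case "openai":
      return payload?.type === "response.output_text.delta"
        ? payload.delta ?? ""
        : "";
    case "gemini": {
      const parts: any[] = payload?.candidates?.[0]?.content?.parts ?? [];
      return parts.map((part: any) => part.text ?? "").join("");
    }
  }
}

// Re-emit a text fragment as one SSE frame in the assumed client-facing shape.
function toSseFrame(text: string): string {
  const event = {
    type: "content_block_delta",
    delta: { type: "text_delta", text },
  };
  return `data: ${JSON.stringify(event)}\n\n`;
}
```

Keeping extraction and framing separate means a new provider only needs a new `case` in `extractTextDelta`; the client-facing frame format stays in one place.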

Why

The existing app flow was tightly coupled to Claude. This change keeps the user-facing interaction model the same while allowing Clicky to switch between major multimodal providers with minimal client-side branching.

That makes it easier to compare model behavior, keep provider flexibility, and continue evolving the app without hard-coding one vendor into the primary chat path.

Notes

New worker secrets required:

  • OPENAI_API_KEY
  • GEMINI_API_KEY

Existing secrets still required:

  • ANTHROPIC_API_KEY
  • ASSEMBLYAI_API_KEY
  • ELEVENLABS_API_KEY
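
Since the worker now depends on five secrets, a small guard can report which ones are missing before any upstream call is made. The sketch below is an assumption about how that might look; `REQUIRED_SECRETS`, `WorkerEnv`, and `missingSecrets` are hypothetical names, not part of this PR.

```typescript
// The five secret names listed in this PR description.
const REQUIRED_SECRETS = [
  "ANTHROPIC_API_KEY",
  "OPENAI_API_KEY",
  "GEMINI_API_KEY",
  "ASSEMBLYAI_API_KEY",
  "ELEVENLABS_API_KEY",
] as const;

type SecretName = (typeof REQUIRED_SECRETS)[number];

// Hypothetical env shape: every secret may be absent until it is configured.
type WorkerEnv = Partial<Record<SecretName, string>>;

// Return the names of secrets that are unset or empty.
function missingSecrets(env: WorkerEnv): SecretName[] {
  return REQUIRED_SECRETS.filter((name) => !env[name]);
}
```

Each secret can be set with `wrangler secret put <NAME>`, for example `wrangler secret put OPENAI_API_KEY`.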

Testing

I did not run xcodebuild because the repo instructions explicitly say not to run it from the terminal due to TCC permission resets.

Static validation performed:

  • reviewed integration points in the macOS client
  • updated the worker routing logic end-to-end
  • checked staged changes and commit integrity
  • ran git diff --check

Follow-ups

Potential follow-up work, if desired:

  • add provider-specific fallback/error messaging in the UI
  • move more legacy direct-provider helpers behind the worker
  • add a lightweight smoke-test path for worker chat providers
