Skip to content

CaseyHillers/magic-cursor

Repository files navigation

✨ Magic Cursor

Give your browser a sixth sense. Interact with anything on your screen using the power of multimodal AI.

Magic Cursor Demo

Magic Cursor is a Chrome extension that bridges the gap between your screen and AI. Select any region of a webpage, capture it instantly, and start a conversation with Gemini (or any OpenAI-compatible model) about what you see.


🚀 Quick Start

1. Build the Extension

First, clone this repository and build the production bundle:

# Clone the repository
git clone https://github.com/your-username/magic-cursor.git
cd magic-cursor

# Install dependencies
npm install

# Build the sidepanel bundle
npm run build

2. Import to Chrome

  1. Open Chrome and navigate to chrome://extensions/.
  2. Enable "Developer mode" in the top right corner.
  3. Click "Load unpacked".
  4. Select the magic-cursor folder (the root directory of this project).

3. Set Up Your API Key

  1. Click the Magic Cursor icon in your extensions toolbar to open the Side Panel.
  2. Click the Settings (⚙️) icon in the top right.
  3. Select your provider (e.g., Gemini) and paste your API key.
  4. The extension will automatically save your settings and load available models.

💡 Example Prompts to Try

Magic Cursor isn't just for text—it's for understanding pixels. Try these:

  • Shopping Helper: Select a piece of furniture or clothing.

    "Where can I buy this or something similar? What is this style called?"

  • Code Architect: Highlight a complex UI component or a snippet of code.

    "Explain how this layout is achieved with CSS Grid." or "Refactor this logic into a React hook."

  • Study Buddy: Capture a diagram or a math problem from a PDF.

    "Walk me through the steps to solve this." or "Explain this diagram like I'm five."

  • Data Miner: Select a chart or table from a news article.

    "Summarize the key trends shown in this graph."


🛠 Features

  • 🎯 Precision Selection: Click and drag to capture exactly what matters.
  • 💬 Multimodal Chat: Send text and images simultaneously to the AI.
  • 📜 Smart History: Multi-turn conversations with automatic chat summarization.
  • 🔑 Bring Your Own Key: Supports Gemini, OpenAI, Anthropic, and OpenRouter.
  • 🌓 Adaptive UI: Beautiful light and dark modes that follow your system theme.

🧰 Tech Stack

About

Chrome extension for asking LLMs about a part of a web page

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors