Local OCR & image analysis for any MCP client — private, offline, no API keys.
Pre-extracts text and image data locally before your AI ever sees it — cutting token usage by ~97% on real documents. Files never leave your Mac: no cloud API, no API keys, no network requests.
- OCR for images and PDFs (JPG, PNG, HEIC, TIFF, multi-page PDF) via Apple Vision Framework.
- ~97% token reduction: a 44-page PDF costs ~2,400 tokens instead of ~73,500.
- Face detection, barcode/QR reading, and image classification — all on-device.
- Full document pipeline: OCR + faces + barcodes + rectangles in a single tool call.
- Works with Claude Code, Claude Desktop, and Cursor — any MCP-compatible client.
- No files uploaded to any server — processing stays entirely on your Mac.
- 100% offline after `npm install` — powered by the Apple Vision Framework, the same engine as Live Text in Photos.app.
❌ Without macos-vision-mcp:
- Sending a 44-page PDF costs ~73,500 tokens
- Every image, invoice, or contract goes through a cloud API
- Sensitive documents leave your machine on every request
✅ With macos-vision-mcp:
- Local Apple Vision pre-extracts text before Claude ever sees it
- ~2,400 tokens for the same 44-page PDF — 97% fewer
- Files never leave your Mac
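As a quick sanity check on the arithmetic, the reduction implied by the token counts quoted above (~73,500 raw vs ~2,400 extracted) can be computed directly:

```shell
# Token reduction implied by the figures above: ~73,500 tokens raw vs ~2,400 extracted
awk 'BEGIN { printf "%.1f%%\n", (1 - 2400/73500) * 100 }'
# → 96.7%
```

96.7% rounds to the ~97% quoted throughout.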
macos-vision-mcp acts as a local pre-processing layer between your documents and the cloud. Useful for:
- Legal documents, contracts, NDAs
- Financial reports, invoices, internal spreadsheets
- Medical records or any GDPR-sensitive content
- Any situation where you want to extract structured data locally before deciding what (if anything) to send upstream
Instead of sending the raw document to your AI, you extract the text and structure locally first. The model then works only with the extracted text — never the original file.
Step 1 — Install the package:
```bash
npm install -g macos-vision-mcp
```

Step 2 — Add to your MCP client (example for Claude Code):

```bash
claude mcp add macos-vision-mcp -- macos-vision-mcp
```

Restart your client. The tools appear automatically.
Note: The native module `macos-vision` compiles against your local Node.js at install time. If you switch Node versions, run `npm rebuild` inside the package directory.
| Tool | What it does | Example prompt |
|---|---|---|
| `ocr_image` | Extract text from an image or PDF (JPG, PNG, HEIC, TIFF, PDF). Returns plain text or structured blocks with bounding boxes. | "Read the text from ~/Desktop/screenshot.png" |
| `detect_faces` | Detect human faces and return their count and positions. | "How many people are in this photo?" |
| `detect_barcodes` | Read QR codes, EAN, UPC, Code128, PDF417, Aztec, and other 1D/2D codes. | "What does the QR code in /tmp/qr.jpg say?" |
| `classify_image` | Classify image content into 1000+ categories with confidence scores. | "What is in this image?" |
| `analyze_document` | Full pipeline: OCR + faces + barcodes + rectangles in one call. | "Extract everything from this scanned invoice" |
Use the tool name explicitly in your prompt to guarantee local processing:
Extract text from an image or PDF:

```
Use ocr_image to extract text from ~/Desktop/invoice.pdf
```

Detect faces in a photo:

```
Use detect_faces on ~/Photos/team.jpg and tell me how many people are in it
```

Classify image content:

```
Use classify_image on ~/Downloads/unknown.jpg
```

Full document analysis (OCR + faces + barcodes in one call):

```
Use analyze_document on ~/Desktop/scan.pdf and extract everything you can find
```
For Claude Code:

```bash
claude mcp add macos-vision-mcp -- macos-vision-mcp
```

For Claude Desktop, edit `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "macos-vision-mcp": {
      "command": "macos-vision-mcp"
    }
  }
}
```

For Cursor, add to `~/.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "macos-vision-mcp": {
      "command": "macos-vision-mcp"
    }
  }
}
```

If you installed with npx rather than globally, replace `"command": "macos-vision-mcp"` with `"command": "npx", "args": ["macos-vision-mcp"]`.
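For the npx case, the full entry would look like this (a sketch of the substitution described above; verify against your client's config schema):

```json
{
  "mcpServers": {
    "macos-vision-mcp": {
      "command": "npx",
      "args": ["macos-vision-mcp"]
    }
  }
}
```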
Contributions are welcome. Please follow Conventional Commits for commit messages — this project uses release-it with @release-it/conventional-changelog to automate releases.
```bash
git clone <repo>
cd macos-vision-mcp
npm install
npm run dev   # watch mode
```

MIT — Adrian Wolczuk