vcon-laptop

Record laptop activity — screenshots, microphone audio, and webcam video — into spec-compliant vCon objects. Optionally transcribe audio, describe screenshots with AI, and export session notes to an Obsidian vault.

Features

  • Screen capture — periodic screenshots via mss (cross-platform) with macOS screencapture fallback
  • Audio recording — continuous microphone capture via sounddevice
  • Video recording — webcam capture via OpenCV at configurable FPS (default 1 fps for activity logging)
  • Capture budget — stop recording automatically when a size limit is hit (MAX_CAPTURE_MB)
  • AI analysis — fully local by default, no API keys required:
    • Audio transcription: mlx-whisper (Apple Silicon)
    • Screenshot descriptions: Ollama with any vision model (default gemma3:4b)
    • Session summary: generated from transcription + descriptions
    • Fallback chain: Ollama → OpenAI → Anthropic (when API keys are configured)
  • Obsidian export — markdown note with YAML frontmatter, ![[wikilink]] image embeds, transcription, and AI descriptions
  • Conserver posting — POST vCon JSON to a conserver endpoint with API token auth and ingress routing
  • vCon compliant: extensions, parties with validation, sha512-base64url content hashes, purpose-based attachments
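
As an illustration of the "sha512-base64url content hashes" item above, here is a minimal sketch of how such a hash could be computed. The function name `content_hash`, the `sha512-` prefix, and the choice to strip base64 padding are assumptions made for this example, not confirmed details of the project.

```python
import base64
import hashlib


def content_hash(data: bytes) -> str:
    """Return a SHA-512 digest of `data`, encoded as URL-safe base64.

    Assumption: the result is prefixed with "sha512-" and the trailing
    "=" padding is removed. The real project may format this differently.
    """
    digest = hashlib.sha512(data).digest()  # 64 raw bytes
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha512-{encoded}"


if __name__ == "__main__":
    print(content_hash(b"example screenshot bytes"))
```

URL-safe base64 replaces `+` and `/` with `-` and `_`, so the hash can be embedded in JSON or a URL without escaping.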

Install

pip (recommended):

pip install vcon-laptop
pip install "vcon-laptop[analysis]"   # includes mlx-whisper, anthropic, openai

Homebrew (macOS):

brew tap vcon-dev/vcon
brew install vcon-laptop

pipx (isolated CLI):

pipx install vcon-laptop

From source:

git clone https://github.com/vcon-dev/vcon-laptop.git
cd vcon-laptop
pip install -e ".[analysis]"

Quick start

# Copy and edit config (optional — works without it)
curl -O https://raw.githubusercontent.com/vcon-dev/vcon-laptop/main/.env.example
mv .env.example .env

# Record for 30 seconds (screenshots only)
vcon-laptop --sources screenshot --duration 30

# Record everything for 2 minutes with a 100 MB cap
vcon-laptop --duration 120 --max-mb 100

Configuration

All settings are loaded from environment variables or a .env file. See .env.example for the full reference.

Capture

| Variable | Default | Description |
|---|---|---|
| CAPTURE_SOURCES | screenshot,audio,video | Comma-separated sources to enable |
| SCREENSHOT_INTERVAL | 5 | Seconds between screenshots |
| VIDEO_FPS | 1 | Webcam frames per second |
| MAX_CAPTURE_MB | 0 | Stop recording at this size (0 = unlimited) |
| MAX_SESSION_DURATION | 0 | Stop after this many seconds (0 = unlimited) |
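
The capture settings above could be read from the environment roughly as follows. This is a sketch, not the project's `config.py`: the `CaptureConfig` class, the `load_capture_config` and `over_budget` helpers, and the choice of 1 MB = 1024 × 1024 bytes are all assumptions. Only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class CaptureConfig:
    sources: List[str]
    screenshot_interval: float
    video_fps: float
    max_capture_mb: float
    max_session_duration: float


def load_capture_config(env: Optional[Mapping[str, str]] = None) -> CaptureConfig:
    """Build a config from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    raw_sources = env.get("CAPTURE_SOURCES", "screenshot,audio,video")
    return CaptureConfig(
        sources=[s.strip() for s in raw_sources.split(",") if s.strip()],
        screenshot_interval=float(env.get("SCREENSHOT_INTERVAL", "5")),
        video_fps=float(env.get("VIDEO_FPS", "1")),
        max_capture_mb=float(env.get("MAX_CAPTURE_MB", "0")),
        max_session_duration=float(env.get("MAX_SESSION_DURATION", "0")),
    )


def over_budget(bytes_written: int, max_capture_mb: float) -> bool:
    """Return True once the size limit is reached; 0 means unlimited."""
    # Assumption: one megabyte is counted as 1024 * 1024 bytes.
    return max_capture_mb > 0 and bytes_written >= max_capture_mb * 1024 * 1024


if __name__ == "__main__":
    cfg = load_capture_config({"CAPTURE_SOURCES": "screenshot", "MAX_CAPTURE_MB": "100"})
    print(cfg)
    print(over_budget(200 * 1024 * 1024, cfg.max_capture_mb))
```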

Analysis

| Variable | Default | Description |
|---|---|---|
| OLLAMA_URL | http://localhost:11434 | Ollama server for vision + summary |
| OLLAMA_MODEL | gemma3:4b | Ollama model (must support vision) |
| WHISPER_MODEL | mlx-community/whisper-small-mlx | HuggingFace model for mlx-whisper |
| OPENAI_API_KEY | | Fallback for vision + Whisper transcription |
| OPENAI_VISION_MODEL | gpt-4o-mini | OpenAI model for image description |
| ANTHROPIC_API_KEY | | Fallback for image description + summary |
| ANALYSIS_MODEL | claude-sonnet-4-20250514 | Anthropic model for vision + summary |
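
The Ollama → OpenAI → Anthropic fallback order, with the cloud providers enabled only when their API keys are set, can be sketched like this. The helper names `provider_chain` and `first_success` are hypothetical, and the provider callables are stand-ins. The real `analyze.py` may be organised differently.

```python
from typing import Callable, List, Mapping, Optional, Tuple


def provider_chain(env: Mapping[str, str]) -> List[str]:
    """Return providers in fallback order, skipping those without an API key."""
    chain = ["ollama"]  # local provider, needs no key
    if env.get("OPENAI_API_KEY"):
        chain.append("openai")
    if env.get("ANTHROPIC_API_KEY"):
        chain.append("anthropic")
    return chain


def first_success(
    chain: List[str], providers: Mapping[str, Callable[[], str]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each provider in order; return (name, result) of the first that works."""
    for name in chain:
        try:
            return name, providers[name]()
        except Exception:
            # Assumption: any failure (server down, bad key) moves to the next provider.
            continue
    return None, None


if __name__ == "__main__":
    def ollama_down() -> str:
        raise ConnectionError("Ollama server not reachable")

    chain = provider_chain({"OPENAI_API_KEY": "sk-placeholder"})
    print(first_success(chain, {"ollama": ollama_down, "openai": lambda: "A code editor."}))
```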

Obsidian

| Variable | Default | Description |
|---|---|---|
| OBSIDIAN_VAULT_PATH | | Path to vault root (blank = skip export) |
| OBSIDIAN_FOLDER | Recordings | Subfolder for recording notes |
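
To make the export target concrete, here is a rough sketch of a note with YAML frontmatter and `![[wikilink]]` image embeds, placed under the vault's subfolder. The frontmatter keys, section headings, and the `render_note` and `note_path` helpers are assumptions for illustration. The note layout actually produced by `obsidian.py` is not shown in this README.

```python
from pathlib import Path
from typing import List, Mapping


def render_note(
    title: str,
    created_at: str,
    screenshots: List[str],
    transcription: str,
    descriptions: Mapping[str, str],
) -> str:
    """Render a markdown note: frontmatter, transcription, then embedded screenshots."""
    lines = [
        "---",
        f'title: "{title}"',
        f'created: "{created_at}"',
        "---",
        "",
        f"# {title}",
        "",
        "## Transcription",
        "",
        transcription or "_No transcription available._",
        "",
        "## Screenshots",
        "",
    ]
    for name in screenshots:
        lines.append(f"![[{name}]]")  # Obsidian embed syntax
        if name in descriptions:
            lines.append(descriptions[name])
        lines.append("")
    return "\n".join(lines)


def note_path(vault_path: str, folder: str, title: str) -> Path:
    """Place the note at <vault>/<folder>/<title>.md."""
    return Path(vault_path) / folder / f"{title}.md"


if __name__ == "__main__":
    print(render_note("session-1", "2026-04-12T17:45:29+00:00",
                      ["screen_00000.png"], "Hello.", {"screen_00000.png": "An editor."}))
```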

CLI options

vcon-laptop [--sources screenshot,audio,video]
            [--duration SECONDS]
            [--max-mb MB]
            [--output DIRECTORY]
            [--env PATH]
            [-v|--verbose]

Press Ctrl+C to stop recording. The session is saved, analyzed, and exported on stop.
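
For readers who want to wrap or script the tool, the option list above maps naturally onto `argparse`. The sketch below mirrors only the documented flags. The argument types, the `None` defaults, and the `build_parser` name are assumptions, and the project's `main.py` may parse its arguments another way.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented vcon-laptop flags (types are assumed)."""
    parser = argparse.ArgumentParser(prog="vcon-laptop")
    parser.add_argument("--sources", metavar="LIST",
                        help="comma-separated: screenshot,audio,video")
    parser.add_argument("--duration", type=int, metavar="SECONDS",
                        help="stop after this many seconds")
    parser.add_argument("--max-mb", type=float, metavar="MB",
                        help="stop when captured data reaches this size")
    parser.add_argument("--output", metavar="DIRECTORY",
                        help="where to write the session")
    parser.add_argument("--env", metavar="PATH",
                        help="path to a .env file")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable verbose logging")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--sources", "screenshot", "--duration", "30"])
    print(args)
```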

vCon output

Each session produces a session.vcon.json file:

{
  "vcon": "0.0.1",
  "uuid": "...",
  "created_at": "2026-04-12T17:45:29+00:00",
  "parties": [{"name": "anonymous", "validation": "anonymous", "role": "agent"}],
  "dialog": [
    {"type": "recording", "mediatype": "image/png", "filename": "screen_00000.png", ...},
    {"type": "recording", "mediatype": "audio/wav", "duration": 13.36, ...},
    {"type": "recording", "mediatype": "video/mp4", "duration": 8.47, ...}
  ],
  "analysis": [
    {"type": "transcript", "vendor": "mlx-community", "body": "..."},
    {"type": "report", "vendor": "Ollama", "schema": "screenshot-description-v1", "body": "..."},
    {"type": "summary", "vendor": "Ollama", "body": "..."}
  ],
  "attachments": [
    {"purpose": "tags", "body": "{\"source\": \"laptop_adapter\", ...}", "encoding": "json"}
  ]
}
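
A short sketch of consuming this output: the function below counts dialog entries by media type, sums the recorded durations, and lists the analysis types. It relies only on the field names visible in the example above. The `summarize_vcon` name and the inline sample document are illustrative, and real files contain more fields than shown here.

```python
import json
from collections import Counter


def summarize_vcon(doc: dict) -> dict:
    """Summarize a parsed session.vcon.json document."""
    dialog = doc.get("dialog", [])
    return {
        "media_counts": dict(Counter(d.get("mediatype", "unknown") for d in dialog)),
        # Screenshots carry no duration in the example, so default to 0.
        "total_duration": sum(d.get("duration", 0.0) for d in dialog),
        "analysis_types": [a.get("type") for a in doc.get("analysis", [])],
    }


# Abbreviated sample using only fields shown in the README example.
SAMPLE = """
{
  "dialog": [
    {"type": "recording", "mediatype": "image/png", "filename": "screen_00000.png"},
    {"type": "recording", "mediatype": "audio/wav", "duration": 13.36},
    {"type": "recording", "mediatype": "video/mp4", "duration": 8.47}
  ],
  "analysis": [
    {"type": "transcript"},
    {"type": "report"},
    {"type": "summary"}
  ]
}
"""

if __name__ == "__main__":
    print(summarize_vcon(json.loads(SAMPLE)))
```

To inspect a real session, replace `json.loads(SAMPLE)` with `json.load(open(path))` pointing at a `session.vcon.json` file.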

Analysis performance

| Provider | Per screenshot | Summary | Total (2 shots) | Cost |
|---|---|---|---|---|
| Ollama gemma3:4b | ~60s | ~100s | ~7 min | Free |
| OpenAI gpt-4o-mini | ~4s | ~2s | ~10s | $ |
| Anthropic Sonnet | ~5s | ~3s | ~13s | $$ |

Tests

pip install -e ".[dev]"
pytest tests/ -v

69 tests covering config, builder, storage, poster, analysis (with mocked APIs), Obsidian export, and session lifecycle.

Architecture

vcon_laptop/
├── main.py          CLI entry point
├── config.py        Env-based configuration
├── session.py       Session lifecycle + size monitoring
├── builder.py       Assembles media into vCon JSON
├── analyze.py       Transcription + image description + summary
├── obsidian.py      Obsidian vault markdown export
├── storage.py       Save vCon to disk
├── poster.py        POST to conserver
└── capture/
    ├── screenshot.py   Periodic screen capture (mss / screencapture)
    ├── audio.py        Mic recording (sounddevice callback)
    └── video.py        Webcam recording (OpenCV)

Related projects

License

MIT
