Synthetic data generator for the iMessage / AddressBook SQLite databases (dependent on the macOS schema)

johnlarkin1/imessage-data-foundry

iMessage Data Foundry

Generate realistic iMessage and Address Book SQLite databases for mock testing and demo applications.

Overview

iMessage Data Foundry creates synthetic chat.db SQLite databases that exactly mirror the macOS iMessage schema. You can optionally use AI to generate the personas and conversations, either with local models through mlx (in practice Apple Silicon only, given the dependency on mlx-lm) or with OpenAI / Anthropic endpoints.

Good use cases for this synthetic data are:

  • Testing iMessage analysis tools and exporters
  • Development of applications that interface with iMessage data
  • Demos and documentation requiring realistic message data

Demo

[Image: iMessage Data Foundry Demo]

Features

  • Full Schema Replication — Exact match of macOS iMessage chat.db structure
  • Multi-Version Support — Sonoma (14.x), Sequoia (15.x), and Tahoe (26.x)
  • AI-Powered Personas — Generate realistic personas with distinct personalities
  • Natural Conversations — LLM-generated messages that feel authentic
  • Persona Library — Save and reuse personas across database generations
  • Group Chats — Support for both 1:1 and group conversations
  • Realistic Timestamps — Natural message timing with conversation batching
  • Attachment Stubs — Placeholder attachments with proper database records
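For readers working with the timestamps directly: macOS stores message dates in chat.db relative to Apple's reference date of 2001-01-01 UTC, and recent macOS releases use nanoseconds. The sketch below converts between Python datetimes and that representation. It assumes the generated databases follow the same convention (the README only says the schema is mirrored), and the helper names `to_apple_ns` and `from_apple_ns` are illustrative, not part of this project.

```python
from datetime import datetime, timedelta, timezone

# Apple's reference date: 2001-01-01 00:00:00 UTC.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def to_apple_ns(dt: datetime) -> int:
    """Convert a timezone-aware datetime to nanoseconds since the Apple epoch."""
    delta = dt - APPLE_EPOCH
    # Integer arithmetic avoids float rounding on large values.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def from_apple_ns(value: int) -> datetime:
    """Convert nanoseconds since the Apple epoch back to a UTC datetime."""
    # datetime only has microsecond resolution, so sub-microsecond digits are dropped.
    return APPLE_EPOCH + timedelta(microseconds=value // 1_000)


if __name__ == "__main__":
    sent = datetime(2024, 6, 1, 12, 30, tzinfo=timezone.utc)
    stamp = to_apple_ns(sent)
    print(stamp, from_apple_ns(stamp).isoformat())
```

If a value read from a database looks about nine orders of magnitude too small, it is probably in seconds rather than nanoseconds, which older macOS versions used.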

Installation

From PyPI (recommended)

pip install imessage-data-foundry

With uvx (no install required)

uvx imessage-data-foundry

With pipx

pipx install imessage-data-foundry

With uv tool

uv tool install imessage-data-foundry

From source

git clone https://github.com/johnlarkin1/imessage-data-foundry.git
cd imessage-data-foundry
pip install -e .

Quick Start

# Launch the TUI
uvx imessage-data-foundry

# Or run directly from a source checkout
uv run python -m imessage_data_foundry

CLI Options

imessage-data-foundry --help
imessage-data-foundry --version
imessage-data-foundry --output ~/Desktop/chat.db
imessage-data-foundry # interactive mode (this is the default)

Configuration

LLM Provider

Select your preferred LLM provider from the Settings menu in the app. Available providers depend on which API keys you have configured.

API Keys

Set API keys as environment variables before running the app:

export OPENAI_API_KEY="sk-..."
# or
export ANTHROPIC_API_KEY="sk-ant-..."
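To check which keys the current shell would expose before launching the app, something like the following can help. It is a minimal sketch, not the app's actual detection logic: the environment variable names come from the section above, while the provider labels and the `configured_providers` helper are hypothetical.

```python
import os
from typing import Mapping, Optional

# Variable names are from the README; the provider labels are illustrative.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the providers whose API key variable is set and non-blank."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    print(configured_providers())
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to test.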

Local Models

For local inference on Apple Silicon, install mlx-lm:

pip install mlx-lm

Settings are stored in ~/.config/imessage-data-foundry/foundry.db.
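The README does not document the layout of foundry.db. Assuming it is an ordinary SQLite file (the extension suggests so, but this is not confirmed here), its tables can be listed read-only with the standard library. The `list_tables` helper is illustrative and not part of this project.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: Path) -> list[str]:
    """List table names in a SQLite file without modifying it."""
    # mode=ro opens the file read-only, so nothing is written or created.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, `list_tables(Path("~/.config/imessage-data-foundry/foundry.db").expanduser())` would print whatever tables the app has created, if the assumption holds.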

Usage

Creating Personas

Personas can be created manually or generated by AI:

┌─ Create Persona ─────────────────────────────────┐
│                                                  │
│  Name: Sarah Chen                                │
│  Phone: +1 (555) 123-4567                        │
│  Relationship: Close friend from college         │
│                                                  │
│  Personality:                                    │
│  Outgoing, tech-savvy software engineer who      │
│  loves hiking and craft coffee. Quick to         │
│  respond with enthusiasm and emoji.              │
│                                                  │
│  [Generate with AI]  [Save]  [Cancel]            │
│                                                  │
└──────────────────────────────────────────────────┘

Generating Conversations

  1. Select personas to include
  2. Choose chat type (1:1 or group)
  3. Set message count target
  4. Optionally provide a conversation seed
  5. Generate!

Output

The generated chat.db file can be used with any tool that reads iMessage databases:

# Example: Use with imessage-exporter
imessage-exporter -p ./output/chat.db -f html -o ./export/
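The file can also be read directly with Python's built-in sqlite3 module. The sketch below assumes the generated database has the `message` and `handle` tables and the `text`, `date`, `is_from_me`, and `handle_id` columns found in the macOS schema; the `read_messages` helper is illustrative and not part of this project.

```python
import sqlite3
from pathlib import Path

# Table and column names follow the macOS chat.db schema (an assumption here).
QUERY = """
SELECT handle.id AS contact, message.is_from_me, message.text
FROM message
LEFT JOIN handle ON handle.ROWID = message.handle_id
ORDER BY message.date
LIMIT ?
"""


def read_messages(db_path: str, limit: int = 10) -> list[tuple]:
    """Return (contact, is_from_me, text) rows, oldest first."""
    # Open read-only so the generated file is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute(QUERY, (limit,)).fetchall()
    finally:
        conn.close()
```

Calling `read_messages("./output/chat.db")` would return the first ten messages in chronological order. The LEFT JOIN keeps messages whose `handle_id` has no matching handle row.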

API Keys

To use a hosted provider, you'll need an API key from one of the following:

Provider    Get API Key
OpenAI      https://platform.openai.com/api-keys
Anthropic   https://console.anthropic.com/settings/keys

Project Structure

imessage-data-foundry/
├── imessage_data_foundry/
│   ├── cli/               # CLI application and flows
│   ├── db/                # Database schema and building
│   ├── personas/          # Persona management
│   ├── conversations/     # Conversation generation
│   ├── llm/               # LLM provider integrations
│   ├── settings/          # Settings storage
│   └── utils/             # Utility functions
├── tests/                 # Test suite
└── docs/                  # Documentation

Schema Compatibility

Generated databases are tested for compatibility with:

Limitations

  • Text-only focus — Reactions, read receipts, and message effects are not simulated
  • Placeholder attachments — Attachments are stubs, not real media files
  • macOS-centric — Schema targets macOS; iOS backup databases may differ slightly

Development

# Install dev dependencies
uv sync --dev

# Run tests
uv run pytest

# Run type checking
uv run mypy src/

# Format code
uv run ruff format src/

License

MIT License — See LICENSE for details.

Acknowledgments
