Understand what your system is doing, why it failed, and what changed — from one tool.
TraceLens is a local-first diagnostics platform for Linux and macOS. It captures system evidence, diagnoses incidents, generates reports, and provides a browser dashboard — all without requiring any cloud service or AI API key.
At a glance
- Capture logs, kernel messages, service state, resource pressure, and package changes into a structured case
- Diagnose common incident patterns offline, then optionally enrich results with AI
- Generate Markdown, HTML, JSON, or text reports for sharing and archiving
- Explore cases in a browser with Overview, Timeline, Services, Logs, Kernel, Reports, Diff, Terminal, and AI Insights pages
Linux debugging is powerful but fragmented. When something goes wrong — a crash, a hang, a slow boot, a service that keeps restarting — you're left jumping between a dozen disconnected tools:
You reboot. Now what? You run journalctl -b -1 hoping the previous boot's logs survived. You scan dmesg for hardware errors. You check which services failed. You cross-reference timestamps manually. You try to remember if you updated any packages yesterday. There's no single place to see what happened.
You run systemctl status myservice. Failed. You run journalctl -u myservice. Wall of text. Was it an OOM kill? A dependency that isn't starting? A config change? You check dmesg for memory pressure. You check disk space. You check if something else restarted at the same time. Every clue is in a different tool with a different interface.
Load average is high. But why? You open top and see nothing obvious. You check iostat and the disk seems fine. You check journalctl and get thousands of lines. Is a service flapping? Is the kernel warning about something? Did a recent update change something? You spend 30 minutes playing detective across tools before you can even form a hypothesis.
Your server went down. Your team asks what happened. You paste journalctl output into a chat. Then dmesg. Then systemctl list-units --failed. Then some df and free output. There's no structured incident report. Just scattered terminal output.
Something changed. A package update, a config edit, a kernel upgrade. But you have no baseline. You can't diff your system state from yesterday against today. Linux doesn't snapshot its own diagnostic state for you.
Services restart. Disks fill up. Network hiccups cause cascading failures. You wake up and everything looks fine now, but something clearly happened at 3am. Without continuous monitoring, transient failures leave no trace.
TraceLens unifies the fragmented Linux/macOS debugging experience into one coherent workflow:
┌──────────────────────────────────────────────────────────────┐
│ Your System │
│ │
│ journalctl ─┐ │
│ dmesg ──────┤ │
│ systemctl ──┤ │
│ /proc ──────┼──► TraceLens ──► Structured ──► Diagnosis │
│ package mgr ┤ Evidence Report │
│ boot info ──┤ Case Dashboard │
│ disk/mem ───┘ Timeline Diff │
│ │
└──────────────────────────────────────────────────────────────┘
One capture command collects evidence from all sources into a structured case. One diagnosis command analyzes that evidence and tells you what's likely wrong. One report command produces a clean, shareable document. One dashboard lets you explore everything visually in your browser.
| Problem | Without TraceLens | With TraceLens |
|---|---|---|
| Service keeps crashing | Manually cross-reference systemctl, journalctl, dmesg | tracelens diagnose detects restart loops and correlates with OOM/resource pressure |
| Slow boot | Run systemd-analyze blame, guess which services are slow | tracelens capture --boot current captures full boot evidence with timeline |
| System froze yesterday | Hope logs survived, manually search previous boot | tracelens diagnose --boot previous analyzes previous boot automatically |
| Need to explain an outage | Copy-paste terminal output into Slack | tracelens report latest --format md generates a structured incident report |
| Something changed after update | No baseline to compare against | tracelens diff case-a case-b shows what changed between captures |
| Recurring 3am failures | No visibility without monitoring setup | tracelens service enable runs background capture that catches transient issues |
| Disk filling up slowly | Notice when it's too late | Diagnosis engine flags near-full disks and growth trends |
| OOM kills happening silently | Buried in kernel logs | Kernel collector surfaces OOM events and correlates with affected services |
# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repo
git clone https://github.com/agodianel/Trace-Lens-Linux.git
cd Trace-Lens-Linux
# Install dependencies
uv sync --extra dev
# Run TraceLens (from inside the repo)
uv run tracelens doctor
All tracelens commands are run via uv run tracelens ... from the project directory.
To install it globally so tracelens works from anywhere:
# Install as a global tool from the local source
uv tool install -e /path/to/Trace-Lens-Linux
Linux gets the most complete support today, especially on systemd-based distributions. macOS is supported automatically with unified logs, launchd service discovery, Homebrew package detection, and native storage paths. Background service mode remains systemd-only.
uv run tracelens doctor
╭─ TraceLens Doctor ────────────────────────╮
│ ✓ Python 3.11+ │
│ ✓ journalctl accessible │
│ ✓ systemctl accessible │
│ ✓ dmesg accessible │
│ ✓ Data directory writable │
│ ✗ AI provider: not configured (optional) │
│ │
│ Status: Ready │
╰───────────────────────────────────────────╯
# Capture current state
uv run tracelens capture
# Capture last hour
uv run tracelens capture --since "1 hour ago"
# Capture specific boot
uv run tracelens capture --boot previous
# Capture specific service
uv run tracelens capture --unit docker.service
# Diagnose current system
uv run tracelens diagnose
# Diagnose a captured case
uv run tracelens diagnose latest
# Diagnose previous boot
uv run tracelens diagnose --boot previous
╭─ Diagnosis ──────────────────────────────────────────╮
│ │
│ Severity: WARNING │
│ │
│ Findings: │
│ ⚠ Service restart loop: docker.service │
│ Restarted 4 times in 15 minutes │
│ Correlated with memory pressure spike at 14:32 │
│ │
│ ⚠ Near-full filesystem: /var (91% used) │
│ Growth rate suggests full in ~3 days │
│ │
│ ℹ 12 kernel warnings (nouveau driver) │
│ Recurring GPU timeout — likely driver issue │
│ │
│ Suggested commands: │
│ journalctl -u docker.service --since "14:00" │
│ df -h /var │
│ dmesg | grep nouveau │
│ │
╰──────────────────────────────────────────────────────╯
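The "full in ~3 days" estimate in the sample output can be understood as a straight-line projection of disk usage. The sketch below is one minimal way to compute such an estimate; it is not taken from the TraceLens source, and the function name `days_until_full` and the sample format are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def days_until_full(samples, capacity):
    """Project days until a filesystem fills from (timestamp, used) samples.

    Hypothetical helper: uses only the first and last sample (a straight
    line) and returns None when usage is flat or shrinking, because no
    fill date can be projected in that case. `used` and `capacity` must
    share the same unit.
    """
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    elapsed_days = (t1 - t0).total_seconds() / 86400
    if elapsed_days <= 0 or used1 <= used0:
        return None
    growth_per_day = (used1 - used0) / elapsed_days
    return (capacity - used1) / growth_per_day


if __name__ == "__main__":
    start = datetime(2026, 4, 1, 12, 0)
    # Illustrative numbers: a 100 GB volume growing from 85 GB to 91 GB in 2 days.
    samples = [(start, 85), (start + timedelta(days=2), 91)]
    print(days_until_full(samples, capacity=100))
```

A real trend detector would likely fit more than two points and smooth out short bursts; the two-point version only shows the arithmetic behind the estimate.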
uv run tracelens report latest --format md
uv run tracelens report latest --format html
uv run tracelens report latest --format json
uv run tracelens ui --open
Opens a local browser dashboard with:
- Overview — system health, quick actions (capture/diagnose/doctor), recent cases
- Timeline — system events plotted by severity over time with stats summary
- Services — clickable service list with name/state filters and log viewer
- Logs — filterable log explorer with priority, unit, and keyword search
- Kernel — hardware events, driver issues, OOM detection
- Reports — interactive case browser with inline findings viewer and action buttons
- Diff — compare two captures or boots
- Terminal — run TraceLens commands directly from the dashboard
- AI Insights — optional AI-powered root cause analysis viewer
uv run tracelens diff case-2026-04-01_001 case-2026-04-02_001
uv run tracelens diff --boot previous --boot current
Enable TraceLens as a systemd service for continuous, lightweight evidence capture:
# Install and enable
uv run tracelens service install
uv run tracelens service enable
# Or manually
sudo systemctl enable --now tracelens.service
# Check status
uv run tracelens service status
The service runs with minimal overhead: periodic snapshots, failure detection, and rolling history, all stored locally.
Gather evidence from system subsystems:
- journald — system logs with priority, unit, and boot filtering
- dmesg — kernel messages, hardware events, OOM detection
- systemd — unit states, failures, restart counts
- processes — CPU/memory consumers, load average
- resources — disk usage, memory pressure
- boot — boot sessions, timing, previous boot analysis
- packages — update history (distro-aware, Arch supported)
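As an illustration of what OOM detection in kernel messages involves, the sketch below scans kernel log lines for the OOM killer's "Killed process" message and extracts the PID and process name. It is a simplified example, not the TraceLens collector; the names `OOM_RE` and `find_oom_kills` are hypothetical, and the sample line is made up to resemble typical kernel output.

```python
import re

# The kernel's OOM killer logs "Killed process <pid> (<name>)"; the exact
# text around it varies between kernel versions, so only this part is matched.
OOM_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_lines):
    """Return a list of (pid, process_name) for each OOM-kill line found."""
    kills = []
    for line in kernel_lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    sample = [
        "[ 1201.113] usb 1-1: new high-speed USB device number 4",
        "[ 1342.507] Out of memory: Killed process 4321 (dockerd) "
        "total-vm:2097152kB, anon-rss:1048576kB",
    ]
    print(find_oom_kills(sample))
```

Correlating a kill with an affected service would additionally require mapping the PID or process name back to a unit, which this sketch does not attempt.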
Pattern-based analysis that runs entirely offline:
- Service restart loop detection
- OOM kill correlation
- Boot failure analysis
- Disk pressure warnings
- Kernel warning clustering
- Package-update-to-failure correlation
- Authentication failure detection
- Network service instability
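To make "pattern-based" concrete, here is a minimal sketch of how a restart loop rule could work: flag a service when a given number of restarts falls inside a sliding time window. This is an assumption about the general technique, not the rule TraceLens ships; the function name and the default threshold of 4 restarts in 15 minutes are chosen only to mirror the sample diagnosis output.

```python
from datetime import datetime, timedelta


def is_restart_loop(restart_times, threshold=4, window=timedelta(minutes=15)):
    """Return True if at least `threshold` restarts fall within any `window`.

    Hypothetical rule: a sliding window over sorted timestamps, so the
    check is linear after sorting.
    """
    times = sorted(restart_times)
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False


if __name__ == "__main__":
    base = datetime(2026, 4, 2, 14, 30)
    restarts = [base + timedelta(minutes=m) for m in (0, 4, 8, 12)]
    print(is_restart_loop(restarts))
```

Because the rule is a pure function of timestamps, it gives the same answer on every run, which is the property that lets this kind of diagnosis work offline.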
Structured outputs in Markdown, HTML, JSON, or plain text — ready to attach to a GitHub issue, send to a team, or archive.
Local Dash/Plotly web app with interactive charts, filterable logs, service health visualization, timeline correlation, a built-in command runner, and AI insights viewer. Every diagnostic action can be triggered from the browser.
TraceLens works completely offline with zero API keys. The diagnosis engine uses deterministic pattern matching and rules — not LLM prompts.
AI support is optional and additive. If you configure an API key, TraceLens can:
- Summarize incidents in plain English
- Suggest likely root causes
- Generate improved report narratives
- Cluster similar incidents
AI never modifies raw evidence, never deletes data, and all AI outputs are clearly marked as advisory.
# 1. Install the AI dependencies
uv sync --extra ai
# 2. Activate with one command (prompts for your API key)
tracelens ai activate
# 3. Verify
tracelens doctor
# The AI line should show ✓ instead of "not configured"
The activate command writes your settings to settings.toml and saves the API key to your shell config (~/.bashrc or ~/.zshrc). If ANTHROPIC_API_KEY is already in your environment, it detects it automatically.
# Check AI status
tracelens ai status
# Disable AI
tracelens ai deactivate
Everything is stored locally in human-inspectable formats:
~/.local/share/tracelens/
cases/
2026-04-02_001/
metadata.json
journal.jsonl
kernel.jsonl
services.json
system_snapshot.json
findings.json
report.md
config/
settings.toml
logs/
No databases. No opaque blobs. You can inspect, copy, or delete any case directly.
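Because cases are plain directories of JSON and JSON Lines files, they can be read with nothing but the standard library. The sketch below assumes only the layout shown above; the helper names are hypothetical, and the fields inside each file are not documented here, so the code parses them without assuming any schema.

```python
import json
from pathlib import Path


def latest_case(root):
    """Return the newest case directory under <root>/cases, or None.

    Assumes case names such as 2026-04-02_001 sort chronologically.
    """
    cases_dir = Path(root) / "cases"
    if not cases_dir.is_dir():
        return None
    cases = sorted(p for p in cases_dir.iterdir() if p.is_dir())
    return cases[-1] if cases else None


def read_jsonl(path):
    """Parse a .jsonl file into a list, one JSON value per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    case = latest_case(Path.home() / ".local/share/tracelens")
    if case is None:
        print("no cases captured yet")
    else:
        entries = read_jsonl(case / "journal.jsonl")
        print(f"{case.name}: {len(entries)} journal entries")
```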
# ~/.local/share/tracelens/config/settings.toml
[dashboard]
host = "127.0.0.1"
port = 8765
[capture]
default_window = "6h"
storage_path = "~/.local/share/tracelens"
[service]
polling_interval = 300 # seconds
[ai]
enabled = false
provider = "none" # "anthropic", "openai"
# API keys via environment: ANTHROPIC_API_KEY, OPENAI_API_KEY
- Python 3.11+
- Linux with systemd (most modern distributions) or macOS
- journalctl, systemctl, and dmesg accessible (Linux); log and sysctl (macOS)
- No root required for basic usage (some collectors benefit from elevated access)
- Local-first: all data stays on your machine
- No telemetry: nothing is sent anywhere
- No cloud dependency: works fully offline
- Redaction support: sensitive fields can be masked in exported reports
- AI is opt-in: you choose if/when data goes to an AI provider
MIT
