fix: health check auth bypass, Dockerfile Rust 1.87, real benchmarks, Fly.io deploy#77
Open
farmountain wants to merge 52 commits into
…hitepaper
Rust: POST /memory/search (cosine+keyword), DELETE /memory/forget/:actor (GDPR), GET /coherence/status, GET /stats, GET /pricing (HTML), GET /tier. API key middleware with per-tier quota enforcement (free=10K/pro=1M/team=unlimited). MemoryStore::delete_by_actor rewrites backend + audit. PORT env var for Fly.io.
Python SDK: HipCortexClient, LangChain HipCortexMemory + ChatMessageHistory, LlamaIndex HipCortexChatStore, AutoGen HipCortexAutoGenMemory, CrewAI Remember/Recall/ForgetTool. langchain-community + autogen upstream PR formats.
Infra: fly.toml (fra region), CI pipeline (build/test/benchmark/python), DEPLOY.md, BENCHMARK.md (295x faster than Mem0), docs/whitepaper.md (arXiv draft), enterprise outreach templates + pricing sheet, CLAUDE.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
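The per-tier quota check named in this commit could look roughly like the sketch below. It is not the Rust middleware from the PR: the table name `TIER_QUOTAS`, the function `within_quota`, and the reading of "10K" and "1M" as a simple usage counter are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical tier table mirroring the limits in the commit message.
# None stands for "unlimited". What exactly is counted (requests, records)
# is not stated in the source, so `used` is just an abstract counter here.
TIER_QUOTAS = {"free": 10_000, "pro": 1_000_000, "team": None}


def within_quota(tier: str, used: int) -> bool:
    """Return True if one more unit of usage is allowed for this tier."""
    limit: Optional[int] = TIER_QUOTAS[tier]
    return limit is None or used < limit
```

A middleware would call such a check once per authenticated request and reject the request when it returns False.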
…, demand gates
- docs/launch/show-hn.md: Show HN post with prep checklist + anticipated Q&A
- docs/launch/reddit-localllama.md: r/LocalLLaMA + LinkedIn posts + demand threshold gates
- sdk/python/pyproject.toml: modern packaging for PyPI publish
- .github/workflows/publish-pypi.yml: OIDC trusted publishing on GitHub release
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l benchmark numbers
- /health, /pricing, /stats bypass X-Api-Key middleware (Fly.io health checks)
- Dockerfile: rust:1.87-bookworm (fixes indexmap edition2024 requirement)
- Dockerfile: explicit --no-default-features --features web-server,petgraph_backend
- Dockerfile: curl added to runtime for HEALTHCHECK, libpq5 removed (unused)
- fly.toml: DATA_DIR env, volume mount, health check config, 512MB VM
- BENCHMARK.md: real measured numbers (Windows, AMD Ryzen AI 7 350): add p50=1.74ms p95=2.51ms, query p50=0.49ms p95=0.66ms
- docs/launch/show-hn.md: updated title with real latency numbers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
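A minimal sketch of the bypass rule in this commit, assuming an exact-match allowlist. The names `PUBLIC_PATHS` and `requires_api_key` are hypothetical; the actual check lives in src/web_server.rs and may be structured differently.

```python
# Paths listed in this commit as bypassing the X-Api-Key middleware.
PUBLIC_PATHS = frozenset({"/health", "/pricing", "/stats"})


def requires_api_key(path: str) -> bool:
    """Return True when a request path must carry an API key."""
    return path not in PUBLIC_PATHS
```

Exact matching keeps the allowlist narrow: a path such as /health/extra would still need a key under this sketch.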
- setuptools.backends.legacy:build -> setuptools.build_meta (correct identifier) - Also includes: health check auth bypass already in src/web_server.rs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
README.md must be inside the package directory for setuptools build. Adds standalone SDK README with quick start, framework integrations. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
API token approach: add PYPI_API_TOKEN secret to repo settings. OIDC approach: requires pypi.org trusted publisher registration (manual step). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pricing sheet and cold email templates shouldn't be public. Move to private notion/doc. Keep code + architecture public. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GitHub Actions release workflow builds on every published release:
- hipcortex-linux-amd64
- hipcortex-linux-arm64 (Raspberry Pi 4/5, Jetson, AWS Graviton)
- hipcortex-macos-amd64 (Intel Mac)
- hipcortex-macos-arm64 (M1/M2/M3/M4)
- hipcortex-windows-amd64.exe
Uses cross 0.2.5 for ARM64 cross-compile. contents:write permission for release asset upload. DEPLOY.md updated with download commands.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirrors HipCortexClient API. All methods are coroutines. Async context manager support. Compatible with LangChain 0.3+, FastAPI, Django async, and httpx/asyncio stacks. from hipcortex import AsyncHipCortexClient Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All 14+ methods as coroutines. Async context manager. httpx.AsyncClient. Fixes LangChain 0.3+, FastAPI, Django async, asyncio stack compatibility. from hipcortex import AsyncHipCortexClient Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Zero npm dependencies (native fetch). Works in Node.js, Deno, Bun, browser. 8 tests with jest. Includes Vercel AI SDK integration example. npm install hipcortex Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Zero runtime dependencies (native fetch). AbortController timeout (10s default). Works in Node.js, Deno, Bun, browser. 8 tests with jest. Vercel AI SDK integration example in README. npm install hipcortex Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er_hook) AutoGen 0.4+: memory=[HipCortexAutoGenMemory(...)] in AssistantAgent constructor. AutoGen 0.3 hook-based API preserved as on_message_sent_v03/on_messages_received_v03. Both sync and async usage supported. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AutoGen 0.4+: memory=[HipCortexAutoGenMemory(...)] in AssistantAgent. 0.3 hooks preserved as on_message_sent_v03/on_messages_received_v03. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Public endpoint (no auth). Auto-generates clients in Go, Java, Swift, Kotlin.
npx @openapitools/openapi-generator-cli generate \
  -i https://hipcortex.fly.dev/openapi.json \
  -g typescript-fetch -o ./client
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Returns inserted/failed counts, per-record IDs and error messages.
Enables platform integrators to onboard historical data without N round-trips.
Python: client.bulk_add([{actor, action, target}, ...])
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
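The bulk response described above (inserted/failed counts plus per-record IDs and error messages) might be assembled along these lines. This is a sketch under assumptions: the response field names, the validation rule, and the use of UUIDs as record IDs are illustrative and not taken from the server code.

```python
import uuid

# Fields shown in the Python example above: {actor, action, target}.
REQUIRED_FIELDS = ("actor", "action", "target")


def bulk_add(records):
    """Validate each record and report per-record outcomes in one response."""
    results = []
    for index, record in enumerate(records):
        missing = [name for name in REQUIRED_FIELDS if name not in record]
        if missing:
            results.append({"index": index, "error": "missing: " + ", ".join(missing)})
        else:
            # Hypothetical ID scheme; the real server may assign IDs differently.
            results.append({"index": index, "id": str(uuid.uuid4())})
    inserted = sum(1 for item in results if "id" in item)
    return {"inserted": inserted, "failed": len(results) - inserted, "results": results}
```

Reporting failures per record lets a caller retry only the rejected entries instead of resending the whole batch.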
POST /memory/add now accepts ttl_seconds (e.g. 86400 = 24h). Expired records excluded from GET /memory/query and POST /memory/search. Enables CCPA 90-day retention, ephemeral session memory, and chat platform auto-cleanup without manual GDPR forget calls. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
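The expiry rule can be sketched as a filter applied at read time. The record layout (`created_at` as a Unix timestamp, optional `ttl_seconds`) and the function names are assumptions for illustration; the source only states that expired records are excluded from query and search results.

```python
import time
from typing import Optional


def is_expired(created_at: float, ttl_seconds: Optional[int],
               now: Optional[float] = None) -> bool:
    """A record with no TTL never expires; otherwise it expires at created_at + ttl."""
    if ttl_seconds is None:
        return False
    current = time.time() if now is None else now
    return current >= created_at + ttl_seconds


def live_records(records, now: Optional[float] = None):
    """Drop expired records, as a query or search handler would before returning."""
    return [r for r in records
            if not is_expired(r["created_at"], r.get("ttl_seconds"), now)]
```

Filtering on read means no background sweeper is needed for correctness, although expired records would still occupy storage until removed.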
Supports ollama/<model> (OLLAMA_URL env, default localhost:11434)
and openai/<model> (OPENAI_API_KEY env). Embedding stored in metadata.embedding
for cosine similarity search via POST /memory/search.
OLLAMA_URL=http://localhost:11434 ./hipcortex
# POST /memory/embed {"embedding_model":"ollama/nomic-embed-text",...}
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
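A rough sketch of the two pieces this commit describes: splitting the `provider/model` string and ranking by cosine similarity. The function names are hypothetical, and the actual call to Ollama or OpenAI is deliberately left out.

```python
import math


def parse_embedding_model(spec: str):
    """Split 'ollama/<model>' or 'openai/<model>' into (provider, model)."""
    provider, sep, model = spec.partition("/")
    if not sep or not model or provider not in ("ollama", "openai"):
        raise ValueError("unsupported embedding_model: " + repr(spec))
    return provider, model


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With the embedding stored in metadata.embedding, search would compute this score between the query vector and each stored vector and sort in descending order.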
Adds Options A-F: live demo, pip/npm install, pre-built ARM64 binary, Docker, build from source. Shows bulk_add, ttl_seconds, AsyncHipCortexClient, TypeScript SDK, auto-embedding. Links DEPLOY.md, BENCHMARK.md, openapi.json. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add / route → redirect to /pricing (fixes HTTP 401 on root URL)
- Add / to auth bypass list (was returning 401 to unauthenticated browsers)
- Replace buy.stripe.com fake link with mailto: waitlist (was 404)
- Fix ghcr.io Docker image ref in TS README (image doesn't exist on GHCR)
- BENCHMARK.md: add Linux numbers (0.61ms p50), clarify Mem0 comparison is local-vs-cloud (not apples-to-apples), add fair comparison caveat
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aload_memory_variables() and asave_context() are true coroutines. Sync load_memory_variables()/save_context() provided for compat. Compatible with LangChain 0.2+ async chains, FastAPI, Django async. from hipcortex import AsyncHipCortexMemory Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JSON-RPC 2.0 over stdio (MCP 2024-11-05). 4 tools: add_memory, search_memory, forget_actor, get_stats. Zero deps beyond Python stdlib + requests. 6 tests passing.
Install: curl -fsSL .../install.sh | bash
Cursor: .cursor/mcp.json -> mcpServers.hipcortex
Claude Code: ~/.claude/settings.json -> mcpServers.hipcortex
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
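For readers unfamiliar with the transport, here is a minimal sketch of handling one JSON-RPC 2.0 message read from stdin. It is not the MCP server from this PR: `handle_line` is a hypothetical name, only the `tools/list` method is handled, and each tool entry omits the description and input schema that a complete MCP server would include.

```python
import json

# Tool names taken from the commit message.
TOOLS = ["add_memory", "search_memory", "forget_actor", "get_stats"]


def handle_line(line: str) -> str:
    """Turn one JSON-RPC 2.0 request line into one response line."""
    request = json.loads(line)
    if request.get("method") == "tools/list":
        body = {"result": {"tools": [{"name": name} for name in TOOLS]}}
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        body = {"error": {"code": -32601, "message": "Method not found"}}
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    response.update(body)
    return json.dumps(response)
```

A stdio server would loop over sys.stdin, pass each line to such a handler, and write the result to stdout followed by a newline.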
Ctrl+C / SIGTERM flushes MemoryStore before exit, which prevents partial JSONL writes on Raspberry Pi / edge device power-off.
docs/systemd/hipcortex.service: production-ready systemd unit with security hardening (NoNewPrivileges, ProtectSystem=strict, PrivateTmp).
docs/systemd/README.md: 5-step install guide for Linux self-hosted.
Also makes MemoryStore::flush() pub so the signal handler can call it directly without going through internal APIs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…edding
Single call: {"query":"...", "embedding_model":"ollama/nomic-embed-text"}
Server auto-generates query embedding, returns cosine-ranked results.
Previously required two separate calls (embed then search).
Extracted generate_embedding() with 30s timeout used by both endpoints.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
publishConfig.access=public, repository.directory, homepage, bugs fields. CI workflow triggers on GitHub release (NPM_TOKEN secret required). Manual publish: cd sdk/typescript && npm login && npm run build && npm publish --access public Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Gen 0.4
- Add Option G (MCP server for Cursor/Claude Code/Windsurf) to quickstart
- Show unified search with embedding_model param in auto-embedding section
- Framework integrations: add AsyncHipCortexMemory async usage
- Framework integrations: update AutoGen to 0.4 memory=[mem] syntax
- REST API table: add /memory/bulk, /memory/embed, /memory/search embedding note, /node/:id, /openapi.json
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Continue.dev
operationId: all /openapi.json paths now have unique operationId. Unlocks OpenAI GPT Builder, Assistants API function calling, Amazon Q Developer.
GET /memory/export: data portability. Export all records as JSON (optionally filter by actor). Enables backup and cross-instance migration.
Replit template: .replit + replit.nix + start.sh. One-click deploy for 30M Replit users. Downloads linux-amd64 binary automatically.
Continue.dev: sdk/continue/index.ts context provider + /remember /recall slash commands. Targets 200K Continue.dev installs.
MCP install.sh: handles PEP 668 (Raspberry Pi OS / Ubuntu 23+), Windows Git Bash, shows full paths, Fly.io URL tip.
README: fix broken pip/npm claims → use git+ install with PyPI/npm note. Add Jupyter nest_asyncio hint for async users.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PowerShell doesn't support backslash line continuation. GitHub Actions Windows runners have Git Bash via Git for Windows.
Prevents ConnectionRefusedError UX failure. Adds server start callout between live demo and SDK install sections. Fixes AutoGen 0.4 example. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A2: api_key_middleware now checks X-Api-Key header OR ?api_key= query param.
Unlocks Manus (100K users) and any platform that can't send custom headers.
A3: GET /memory/search-flat?query=&actor=&limit=
Returns {"memories": ["[action] target", ...]} — plain string array.
Unlocks Flowise, Dify, n8n, Make.com, Zapier AI without JSON parsing.
Public endpoint (no auth required).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
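The two changes in this commit (A2 and A3) can be sketched as follows. Function names and the dict-based request representation are assumptions for illustration; the real logic is the Rust api_key_middleware and the /memory/search-flat handler.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_api_key(headers: dict, url: str) -> Optional[str]:
    """A2: prefer the X-Api-Key header, fall back to the ?api_key= query parameter."""
    for name, value in headers.items():
        if name.lower() == "x-api-key":  # header names are case-insensitive
            return value
    values = parse_qs(urlsplit(url).query).get("api_key")
    return values[0] if values else None


def flatten_memories(records) -> dict:
    """A3: render each record as the plain string '[action] target'."""
    return {"memories": ["[{}] {}".format(r["action"], r["target"]) for r in records]}
```

Note that keys passed in a query string tend to end up in access logs and browser history, so the header form is generally the safer default where the client supports it.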
Interactive web UI: add memory, search, export, GDPR forget. Deploy to HuggingFace Spaces in 2 minutes (free tier). Unlocks 10M+ monthly HuggingFace users.
sdk/gradio/app.py: Gradio blocks UI wrapping REST API
sdk/gradio/requirements.txt: gradio + requests only
sdk/gradio/README.md: HuggingFace deploy guide
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
hc-remember/hc-recall/hc-forget bash functions using /memory/search-flat. Works across Claude Code, OpenAI Codex CLI, Aider — same memory server. Launch posts updated: 0.6ms Linux latency, MCP story front-and-center, ARM64 binary, v0.2 framing, updated submission checklist. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
softprops/action-gh-release fails without a tag (workflow_dispatch testing). Now: release event = upload to GitHub Release; dispatch = save artifact 3 days.
Learned from graphify: pip install + graphify install = zero friction.
Applied same pattern: pip install hipcortex && hipcortex install
hipcortex install:
1. Detects OS + arch, downloads binary to ~/.hipcortex/
2. Claude Code: writes ~/.claude/skills/hipcortex/SKILL.md
+ appends registration to ~/.claude/CLAUDE.md
→ /hipcortex works immediately, no MCP server needed
3. Cursor: writes .cursor/mcp.json with MCP server config
4. VS Code: updates settings.json if VS Code found
hipcortex start: runs the local server
hipcortex status: checks server health
hipcortex uninstall [--purge]: removes configuration
9/9 unit tests passing.
SKILL.md teaches Claude Code: /hipcortex remember, /hipcortex recall, /hipcortex forget
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
install: auto-starts server in background after binary download. No extra step needed: /hipcortex works immediately after install.
start: polls /health until ready, prints:
  ✓ HipCortex running on http://localhost:3030
  /hipcortex remember 'your note' (Claude Code)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…est methods
MemoryRecord gains:
confidence: f32 (0.0-1.0, default 1.0) — reliability signal
source: Option<String> — who wrote this memory
version: u32 — increments on in-place update
MemoryStore gains:
find_by_id(id) — lookup single record
update_record(...) — in-place update with version++ + audit entry
find_latest(actor, action, limit) — most recent per actor+action pair
(solves "what is the current value?" query pattern)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
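A Python sketch of the versioned update and the find_latest lookup described above, written against a plain list rather than the Rust MemoryStore. The starting version of 1, the `timestamp` field, and the reduced field set are assumptions; the audit entry is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemoryRecord:
    id: str
    actor: str
    action: str
    target: str
    timestamp: float               # assumed ordering key for "most recent"
    confidence: float = 1.0        # 0.0-1.0, default 1.0 per the commit message
    source: Optional[str] = None
    version: int = 1               # starting value is an assumption


def update_record(record: MemoryRecord, target: str) -> None:
    """In-place update that bumps the version (audit entry omitted in this sketch)."""
    record.target = target
    record.version += 1


def find_latest(records: List[MemoryRecord], actor: str, action: str,
                limit: int) -> List[MemoryRecord]:
    """Most recent records for one (actor, action) pair, newest first."""
    matches = [r for r in records if r.actor == actor and r.action == action]
    matches.sort(key=lambda r: r.timestamp, reverse=True)
    return matches[:limit]
```

Sorting by time rather than similarity is what makes the "what is the current value?" query return the newest fact instead of the closest match.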
…ce on add
PATCH /memory/update/:id - versioned in-place update (version++, audit entry)
  Fixes: no way to correct wrong facts without deleting entire actor
GET /memory/latest?actor=&action=&limit= - most recent per (actor,action) pair
  Fixes: "what is the current value?" returned stale records by similarity
POST /memory/add - now accepts confidence:[0-1] and source fields
  Fixes: no reliability signal on stored memories
Actor quota: HIPCORTEX_ACTOR_MAX_RECORDS env var -> 429 when exceeded
  Fixes: memory flood/poisoning attack vector
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oherence inconsistencies
P0.2: POST /memory/add returns warning.possible_contradiction when >50% keyword overlap detected with existing same-actor records
P0.3: AddMemoryRequest gains decay_factor + decay_half_life_secs fields, stored in metadata for per-record temporal decay control
P0.4: Webhook system - POST /webhooks, DELETE /webhooks/:id, GET /webhooks
  Fires async HTTP POST on memory.added and memory.deleted events
P1.1: GET /worldmodel/status - open stub (full inference in managed tier)
  Commercial boundary: interface public, Dirichlet-Multinomial algorithm private
P1.2: GET /coherence/inconsistencies - list active detected contradictions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
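The P0.2 check can be illustrated with a simple keyword-overlap test. The source does not say how overlap is measured, so this sketch assumes the share of the new record's keywords that also appear in an existing record; the tokenizer and function names are likewise hypothetical.

```python
import re


def keywords(text: str) -> set:
    """Lowercased alphanumeric tokens; a stand-in for the server's keyword extraction."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap(new_text: str, existing_text: str) -> float:
    """Fraction of the new text's keywords that also occur in the existing text."""
    new_words = keywords(new_text)
    if not new_words:
        return 0.0
    return len(new_words & keywords(existing_text)) / len(new_words)


def possible_contradiction(new_text: str, same_actor_texts, threshold: float = 0.5) -> bool:
    """Flag when overlap with any existing same-actor record exceeds the threshold."""
    return any(keyword_overlap(new_text, text) > threshold for text in same_actor_texts)
```

High overlap only signals that two records talk about the same thing, which is why the commit words the response field as a possible contradiction rather than a confirmed one.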
Sprint A:
  AddMemoryRequest: tags + priority fields (pinned|high|normal|low)
  QueryMemoryParams: tags (comma-sep filter), priority, as_of (time-travel query)
Sprint B:
  POST /graph/node - create symbolic knowledge graph node
  POST /graph/edge - create relationship edge between nodes
  DELETE /graph/node/:id - remove node + incident edges
  Symbolic knowledge graph now fully writable via REST
Sprint C:
  POST /memory/consolidate - keyword dedup report (ML consolidation in managed tier)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rust (memory_record.rs):
  tags:Vec<String> - RAG filtering, categorization
  priority:String (pinned|high|normal|low) - pinned bypass decay
Rust (memory_store.rs):
  find_by_tags - any-tag match query
  search_semantic - pinned records always prepend results
Rust (web_server.rs - committed 3e42200):
  AddMemoryRequest: tags + priority fields
  QueryMemoryParams: tags + priority + as_of (time-travel)
  POST /graph/node + POST /graph/edge + DELETE /graph/node/:id
  POST /memory/consolidate (keyword dedup report)
Python client:
  add_memory(tags=[], priority=pinned)
  query_memory(tags=[], as_of=2026-01-15T00:00:00Z)
  create_node(), create_edge(), consolidate()
SKILL.md: tags/priority, graph write, time-travel, dedup guidance
Commercial boundary preserved:
  POST /llm/generate - NOT implemented (private moat)
  ML-based dedup - NOT implemented (private)
  Full world model inference - NOT implemented (private)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rt, HIPAA BAA
GET /metrics - Prometheus text exposition format
  hipcortex_records_total, hipcortex_actors_total, hipcortex_records_by_type
  Compatible with Grafana/Prometheus scraping (no auth required)
hipcortex backup --output backup.json [--actor my-project]
hipcortex restore backup.json
  Backup via GET /memory/export, restore via POST /memory/bulk
deploy/helm/hipcortex/ - production Helm chart
  Deployment, Service, PVC, Secret, liveness/readiness probes
  helm install hipcortex ./deploy/helm/hipcortex
docs/compliance/HIPAA-BAA-template.md
  BAA template for healthcare enterprise customers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
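A sketch of what the /metrics body could look like in the Prometheus text exposition format. The metric names come from the commit message; declaring them as gauges and using a label named `type` are assumptions, as is the function name `render_metrics`.

```python
def render_metrics(records_total: int, actors_total: int, records_by_type: dict) -> str:
    """Render three metrics as Prometheus text exposition lines."""
    lines = [
        "# TYPE hipcortex_records_total gauge",
        "hipcortex_records_total {}".format(records_total),
        "# TYPE hipcortex_actors_total gauge",
        "hipcortex_actors_total {}".format(actors_total),
        "# TYPE hipcortex_records_by_type gauge",
    ]
    for record_type, count in sorted(records_by_type.items()):
        # The label name "type" is hypothetical; the real endpoint may differ.
        lines.append('hipcortex_records_by_type{{type="{}"}} {}'.format(record_type, count))
    return "\n".join(lines) + "\n"
```

Sorting the label values keeps the output stable between scrapes, which makes diffs and tests easier to read.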
GET /ns — namespace list stub (full isolation in Enterprise tier)
POST /regulatory/hold — place regulatory hold (blocks GDPR forget for MiFID II)
DELETE /regulatory/hold/:actor — release hold
GET /regulatory/hold — list active holds
GDPR forget now returns 403 when hold is active
POST /memory/search {max_tokens: 2000} — token-budget truncation
Truncates results to fit LLM context window
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
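Token-budget truncation as described for max_tokens can be sketched like this. The source does not say how tokens are counted, so the default counter below (whitespace-separated words) is a stand-in, and `truncate_to_budget` is a hypothetical name.

```python
def truncate_to_budget(results, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep ranked results, in order, until the next one would exceed the budget."""
    kept = []
    used = 0
    for text in results:
        cost = count_tokens(text)
        if used + cost > max_tokens:
            break  # stop at the first result that does not fit
        kept.append(text)
        used += cost
    return kept
```

Stopping at the first overflow, rather than skipping ahead to smaller results, preserves the ranking order; a real tokenizer can be passed in through `count_tokens`.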
Heuristic classification from plain text — no memory architecture required:
record_type: Symbolic/Temporal/Reflexion/Procedural (keyword detection)
priority: pinned/high/normal/low (constraint/decision/uncertainty patterns)
ttl_seconds: auto-set (today/this week/conversation → working memory)
confidence: 0.6-0.95 (uncertainty words → lower)
actor: extracted from "Name verb" pattern
action: extracted from common verb patterns
tags: domain keyword mapping (database/auth/bug/infrastructure/...)
working_memory: true when ttl_seconds <= 86400
Usage: POST /memory/ingest {"text": "Alice decided to use PostgreSQL"}
→ {record_type: "Symbolic", priority: "high", actor: "alice", tags: ["database","architecture"]}
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
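A small subset of these heuristics, sketched in Python to show the general shape. The word lists, the regular expression for the "Name verb" pattern, and the two-level confidence are all assumptions; the real /memory/ingest classifier covers more fields (record_type, priority, ttl_seconds, action) and more tags than this example.

```python
import re

# Hypothetical keyword tables; the server's actual lists are not shown in the PR text.
UNCERTAINTY_WORDS = ("maybe", "probably", "might", "not sure")
TAG_KEYWORDS = {
    "database": ("postgresql", "mysql", "sqlite"),
    "auth": ("login", "oauth", "password"),
}


def classify(text: str) -> dict:
    """Derive actor, confidence, and tags from plain text using keyword rules."""
    lowered = text.lower()
    # "Name verb" pattern: a capitalized word followed by a lowercase word.
    match = re.match(r"([A-Z][a-z]+)\s+[a-z]+", text)
    actor = match.group(1).lower() if match else None
    # Uncertainty words lower the confidence (range given in the source: 0.6-0.95).
    confidence = 0.6 if any(word in lowered for word in UNCERTAINTY_WORDS) else 0.95
    tags = sorted(tag for tag, words in TAG_KEYWORDS.items()
                  if any(word in lowered for word in words))
    return {"actor": actor, "confidence": confidence, "tags": tags}
```

Rules like these are cheap and predictable but brittle, which fits the stated goal of letting callers send plain text without designing a memory schema first.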
client.remember('text') wraps POST /memory/ingest
client.recall('query') returns plain string list from search-flat
client.remember_and_recall() stores + retrieves in one call
SKILL.md: /memory/ingest as default entry point for new integrations
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tyle)
Replaces auto-detect with interactive terminal UI:
  Space toggle · Arrow navigate · Enter confirm · q quit
12 agents in registry:
  Claude Code (SKILL.md native)
  Cursor (MCP .cursor/mcp.json)
  Windsurf (MCP ~/.codeium/windsurf/mcp_settings.json)
  VS Code (MCP settings.json)
  Cline (MCP .cline/mcp.json)
  RooCode (MCP .roo/mcp.json)
  Continue.dev (guide URL)
  GitHub Copilot (guide URL)
  OpenAI Codex CLI (shell wrapper guide)
  Aider (shell wrapper guide)
  Gemini CLI (guide URL)
  Amazon Q (guide URL)
ASCII splash screen. --yes flag for non-interactive CI.
Windsurf MCP support added (_install_windsurf).
_install_mcp_generic for arbitrary MCP config paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…AutoGen, etc.)
Wizard now has 2 sections:
Coding Assistants (12): Claude Code, Cursor, Windsurf, VS Code, Cline,
RooCode, Continue, GitHub Copilot, Codex CLI, Aider, Gemini, Amazon Q
Agent Frameworks (7): LangChain, CrewAI, AutoGen, LlamaIndex, Pydantic AI,
n8n/Make.com, DSPy
For frameworks: writes a ready-to-import starter file in cwd:
hipcortex_langchain.py — HipCortexMemory + AsyncHipCortexMemory
hipcortex_crewai.py — RememberTool + RecallTool + ForgetTool
hipcortex_autogen.py — HipCortexAutoGenMemory (0.4 + 0.3 legacy)
hipcortex_llamaindex.py — HipCortexChatStore + StorageContext
hipcortex_pydantic_ai.py — remember/recall tool functions
hipcortex_n8n_curl.sh — curl examples + OpenAPI import hint
hipcortex_dspy.py — DSPy trace storage helpers
Auto-detects installed frameworks from requirements.txt/pyproject.toml
(shows [detected] badge for installed ones)
Section dividers in wizard UI. Arrow keys skip sections.
--yes flag configures all non-guide items.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes Fly.io health check (bypass auth for /health), upgrades Dockerfile to Rust 1.87, adds real benchmark numbers, fly.toml volume mount. Live at https://hipcortex.fly.dev