fix: health check auth bypass, Dockerfile Rust 1.87, real benchmarks, Fly.io deploy#77

Open
farmountain wants to merge 52 commits into main from claude/pedantic-edison-28b84c

Conversation

@farmountain
Owner

Fixes Fly.io health check (bypass auth for /health), upgrades Dockerfile to Rust 1.87, adds real benchmark numbers, fly.toml volume mount. Live at https://hipcortex.fly.dev

farmountain and others added 3 commits May 20, 2026 09:43
…hitepaper

Rust: POST /memory/search (cosine+keyword), DELETE /memory/forget/:actor (GDPR),
GET /coherence/status, GET /stats, GET /pricing (HTML), GET /tier.
API key middleware with per-tier quota enforcement (free=10K/pro=1M/team=unlimited).
MemoryStore::delete_by_actor rewrites backend + audit. PORT env var for Fly.io.
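
A minimal Python sketch of the per-tier quota check described above, for illustration only. The limits are taken from this commit message. The function name, the meaning of `used`, and the idea that the counter resets per billing period are assumptions; the real middleware is written in Rust and may differ.

```python
from typing import Optional

# Limits as listed in the commit message; None means unlimited.
TIER_LIMITS: dict[str, Optional[int]] = {
    "free": 10_000,
    "pro": 1_000_000,
    "team": None,
}


def within_quota(tier: str, used: int) -> bool:
    """Return True if one more unit of usage is allowed for this tier."""
    if tier not in TIER_LIMITS:
        raise ValueError(f"unknown tier: {tier!r}")
    limit = TIER_LIMITS[tier]
    # An unlimited tier always passes; otherwise usage must stay below the cap.
    return limit is None or used < limit
```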

Python SDK: HipCortexClient, LangChain HipCortexMemory + ChatMessageHistory,
LlamaIndex HipCortexChatStore, AutoGen HipCortexAutoGenMemory,
CrewAI Remember/Recall/ForgetTool. langchain-community + autogen upstream PR formats.

Infra: fly.toml (fra region), CI pipeline (build/test/benchmark/python),
DEPLOY.md, BENCHMARK.md (295x faster than Mem0), docs/whitepaper.md (arXiv draft),
enterprise outreach templates + pricing sheet, CLAUDE.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, demand gates

- docs/launch/show-hn.md — Show HN post with prep checklist + anticipated Q&A
- docs/launch/reddit-localllama.md — r/LocalLLaMA + LinkedIn posts + demand threshold gates
- sdk/python/pyproject.toml — modern packaging for PyPI publish
- .github/workflows/publish-pypi.yml — OIDC trusted publishing on GitHub release

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l benchmark numbers

- /health, /pricing, /stats bypass X-Api-Key middleware (Fly.io health checks)
- Dockerfile: rust:1.87-bookworm (fixes indexmap edition2024 requirement)
- Dockerfile: explicit --no-default-features --features web-server,petgraph_backend
- Dockerfile: curl added to runtime for HEALTHCHECK, libpq5 removed (unused)
- fly.toml: DATA_DIR env, volume mount, health check config, 512MB VM
- BENCHMARK.md: real measured numbers (Windows, AMD Ryzen AI 7 350)
  add p50=1.74ms p95=2.51ms, query p50=0.49ms p95=0.66ms
- docs/launch/show-hn.md: updated title with real latency numbers
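
The auth bypass in the first bullet can be pictured as a path allowlist checked before the API key lookup. This is a Python sketch under assumptions, not the code in src/web_server.rs: the helper name is hypothetical and exact path matching is assumed.

```python
# Paths named in the commit message as exempt from the X-Api-Key check.
PUBLIC_PATHS = frozenset({"/health", "/pricing", "/stats"})


def requires_api_key(path: str) -> bool:
    """Return True when a request path must carry an API key."""
    # Drop any query string so "/health?verbose=1" is still treated as public.
    bare_path = path.split("?", 1)[0]
    return bare_path not in PUBLIC_PATHS
```

An allowlist keeps the default closed: any route added later is protected unless it is listed explicitly.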

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- setuptools.backends.legacy:build -> setuptools.build_meta (correct identifier)
- Also includes: health check auth bypass already in src/web_server.rs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
README.md must be inside the package directory for setuptools build.
Adds standalone SDK README with quick start, framework integrations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
farmountain and others added 21 commits May 20, 2026 20:59
API token approach: add PYPI_API_TOKEN secret to repo settings.
OIDC approach: requires pypi.org trusted publisher registration (manual step).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pricing sheet and cold email templates shouldn't be public.
Move to private notion/doc. Keep code + architecture public.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GitHub Actions release workflow builds on every published release:
- hipcortex-linux-amd64
- hipcortex-linux-arm64 (Raspberry Pi 4/5, Jetson, AWS Graviton)
- hipcortex-macos-amd64 (Intel Mac)
- hipcortex-macos-arm64 (M1/M2/M3/M4)
- hipcortex-windows-amd64.exe

Uses cross 0.2.5 for ARM64 cross-compile. contents:write permission
for release asset upload. DEPLOY.md updated with download commands.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirrors HipCortexClient API. All methods are coroutines.
Async context manager support. Compatible with LangChain 0.3+,
FastAPI, Django async, and httpx/asyncio stacks.

from hipcortex import AsyncHipCortexClient

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All 14+ methods as coroutines. Async context manager. httpx.AsyncClient.
Fixes LangChain 0.3+, FastAPI, Django async, asyncio stack compatibility.

from hipcortex import AsyncHipCortexClient

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Zero npm dependencies (native fetch). Works in Node.js, Deno, Bun, browser.
8 tests with jest. Includes Vercel AI SDK integration example.

npm install hipcortex

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Zero runtime dependencies (native fetch). AbortController timeout (10s default).
Works in Node.js, Deno, Bun, browser. 8 tests with jest.
Vercel AI SDK integration example in README.

npm install hipcortex

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er_hook)

AutoGen 0.4+: memory=[HipCortexAutoGenMemory(...)] in AssistantAgent constructor.
AutoGen 0.3 hook-based API preserved as on_message_sent_v03/on_messages_received_v03.
Both sync and async usage supported.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AutoGen 0.4+: memory=[HipCortexAutoGenMemory(...)] in AssistantAgent.
0.3 hooks preserved as on_message_sent_v03/on_messages_received_v03.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Public endpoint (no auth). Auto-generates clients in Go, Java, Swift, Kotlin.
npx @openapitools/openapi-generator-cli generate \
  -i https://hipcortex.fly.dev/openapi.json \
  -g typescript-fetch -o ./client

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Returns inserted/failed counts, per-record IDs and error messages.
Enables platform integrators to onboard historical data without N round-trips.
Python: client.bulk_add([{actor, action, target}, ...])
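
A rough sketch of how a bulk insert might report per-record outcomes. The response field names (`inserted`, `failed`, `results`, `index`, `id`, `error`) and the validation rule are assumptions chosen for illustration; the commit message only states that counts, IDs, and error messages are returned.

```python
import uuid

# Fields shown in the bulk_add example; treating all three as required is an assumption.
REQUIRED_FIELDS = ("actor", "action", "target")


def bulk_add(records: list[dict]) -> dict:
    """Validate each record independently and summarize the outcome."""
    results = []
    for index, record in enumerate(records):
        missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
        if missing:
            # A bad record is reported but does not abort the whole batch.
            results.append({"index": index, "error": "missing fields: " + ", ".join(missing)})
        else:
            results.append({"index": index, "id": str(uuid.uuid4())})
    inserted = sum(1 for r in results if "id" in r)
    return {"inserted": inserted, "failed": len(results) - inserted, "results": results}
```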

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
POST /memory/add now accepts ttl_seconds (e.g. 86400 = 24h).
Expired records excluded from GET /memory/query and POST /memory/search.
Enables CCPA 90-day retention, ephemeral session memory, and chat
platform auto-cleanup without manual GDPR forget calls.
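
The expiry rule can be sketched as a simple timestamp comparison. The `created_at` field (epoch seconds) and the inclusive boundary are assumptions; the commit message only says that expired records are excluded from query and search results.

```python
import time
from typing import Optional


def is_expired(created_at: float, ttl_seconds: Optional[int], now: Optional[float] = None) -> bool:
    """Return True if a record with this TTL should be hidden from queries."""
    if ttl_seconds is None:
        # No TTL was supplied, so the record never expires.
        return False
    current = time.time() if now is None else now
    return current >= created_at + ttl_seconds
```

Passing `now` explicitly keeps the check deterministic in tests.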

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Supports ollama/<model> (OLLAMA_URL env, default localhost:11434)
and openai/<model> (OPENAI_API_KEY env). Embedding stored in metadata.embedding
for cosine similarity search via POST /memory/search.

OLLAMA_URL=http://localhost:11434 ./hipcortex
# POST /memory/embed {"embedding_model":"ollama/nomic-embed-text",...}
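
For reference, cosine-similarity ranking over stored embeddings can be sketched in a few lines of Python. The record layout follows the `metadata.embedding` location named above; the function names and the `limit` parameter are hypothetical, and the server's Rust implementation may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(query: list[float], records: list[dict], limit: int = 5) -> list[dict]:
    """Return up to `limit` records, most similar to the query first."""
    scored = [(cosine_similarity(query, r["metadata"]["embedding"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:limit]]
```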

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Options A-F: live demo, pip/npm install, pre-built ARM64 binary,
Docker, build from source. Shows bulk_add, ttl_seconds, AsyncHipCortexClient,
TypeScript SDK, auto-embedding. Links DEPLOY.md, BENCHMARK.md, openapi.json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add / route → redirect to /pricing (fixes HTTP 401 on root URL)
- Add / to auth bypass list (was returning 401 to unauthenticated browsers)
- Replace buy.stripe.com fake link with mailto: waitlist (was 404)
- Fix ghcr.io Docker image ref in TS README (image doesn't exist on GHCR)
- BENCHMARK.md: add Linux numbers (0.61ms p50), clarify Mem0 comparison
  is local-vs-cloud (not apples-to-apples), add fair comparison caveat

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aload_memory_variables() and asave_context() are true coroutines.
Sync load_memory_variables()/save_context() provided for compat.
Compatible with LangChain 0.2+ async chains, FastAPI, Django async.

from hipcortex import AsyncHipCortexMemory

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JSON-RPC 2.0 over stdio (MCP 2024-11-05). 4 tools:
add_memory, search_memory, forget_actor, get_stats.
Zero deps beyond Python stdlib + requests.
6 tests passing.

Install: curl -fsSL .../install.sh | bash
Cursor: .cursor/mcp.json -> mcpServers.hipcortex
Claude Code: ~/.claude/settings.json -> mcpServers.hipcortex
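
To make the wire format concrete, here is a sketch of the JSON-RPC 2.0 request a client would write to the server's stdin to invoke one of the four tools. `tools/call` with `name` and `arguments` is the standard MCP method; the `query` argument name for `search_memory` is an assumption, since the tool schemas are not shown here.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request that invokes an MCP tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The MCP stdio transport expects one JSON message per line.
    return json.dumps(message) + "\n"


if __name__ == "__main__":
    # Hypothetical argument name; check the server's tool schema before relying on it.
    print(build_tool_call(1, "search_memory", {"query": "database decision"}), end="")
```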

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ctrl+C / SIGTERM flushes MemoryStore before exit — prevents partial JSONL
writes on Raspberry Pi / edge device power-off.

docs/systemd/hipcortex.service — production-ready systemd unit with
security hardening (NoNewPrivileges, ProtectSystem=strict, PrivateTmp).
docs/systemd/README.md — 5-step install guide for Linux self-hosted.

Also makes MemoryStore::flush() pub so the signal handler can call it
directly without going through internal APIs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…edding

Single call: {"query":"...", "embedding_model":"ollama/nomic-embed-text"}
Server auto-generates query embedding, returns cosine-ranked results.
Previously required two separate calls (embed then search).
Extracted generate_embedding() with 30s timeout used by both endpoints.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
publishConfig.access=public, repository.directory, homepage, bugs fields.
CI workflow triggers on GitHub release (NPM_TOKEN secret required).
Manual publish: cd sdk/typescript && npm login && npm run build && npm publish --access public

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
farmountain and others added 26 commits May 21, 2026 07:26
…Gen 0.4

- Add Option G (MCP server for Cursor/Claude Code/Windsurf) to quickstart
- Show unified search with embedding_model param in auto-embedding section
- Framework integrations: add AsyncHipCortexMemory async usage
- Framework integrations: update AutoGen to 0.4 memory=[mem] syntax
- REST API table: add /memory/bulk, /memory/embed, /memory/search embedding note,
  /node/:id, /openapi.json

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Continue.dev

operationId: all /openapi.json paths now have unique operationId — unlocks
  OpenAI GPT Builder, Assistants API function calling, Amazon Q Developer

GET /memory/export: data portability — export all records as JSON (optionally
  filter by actor). Enables backup and cross-instance migration.

Replit template: .replit + replit.nix + start.sh — one-click deploy for
  30M Replit users. Downloads linux-amd64 binary automatically.

Continue.dev: sdk/continue/index.ts context provider + /remember /recall
  slash commands. Targets 200K Continue.dev installs.

MCP install.sh: handles PEP 668 (Raspberry Pi OS / Ubuntu 23+), Windows
  Git Bash, shows full paths, Fly.io URL tip.

README: fix broken pip/npm claims → use git+ install with PyPI/npm note.
  Add Jupyter nest_asyncio hint for async users.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PowerShell doesn't support backslash line continuation.
GitHub Actions Windows runners have Git Bash via Git for Windows.
Prevents ConnectionRefusedError UX failure. Adds server start callout
between live demo and SDK install sections. Fixes AutoGen 0.4 example.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A2: api_key_middleware now checks X-Api-Key header OR ?api_key= query param.
Unlocks Manus (100K users) and any platform that can't send custom headers.
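
The header-or-query-param lookup could look like the following Python sketch. Checking the header first is an assumption; the commit message states only that either source is accepted.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def extract_api_key(headers: Mapping[str, str], url: str) -> Optional[str]:
    """Return the API key from the X-Api-Key header, else from ?api_key=."""
    # HTTP header names are case-insensitive.
    for name, value in headers.items():
        if name.lower() == "x-api-key" and value:
            return value
    values = parse_qs(urlsplit(url).query).get("api_key")
    return values[0] if values else None
```

Keys sent in a query string tend to end up in access logs and browser history, so the header remains the safer option wherever the client can set it.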

A3: GET /memory/search-flat?query=&actor=&limit=
Returns {"memories": ["[action] target", ...]} — plain string array.
Unlocks Flowise, Dify, n8n, Make.com, Zapier AI without JSON parsing.
Public endpoint (no auth required).
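
The flat response shape described above can be produced from structured records with a one-line transformation. This is an illustrative sketch; the helper name is hypothetical.

```python
def to_flat_memories(records: list[dict]) -> dict:
    """Render each record as "[action] target" inside a plain string array."""
    return {"memories": [f"[{r['action']}] {r['target']}" for r in records]}
```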

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Interactive web UI: add memory, search, export, GDPR forget.
Deploy to HuggingFace Spaces in 2 minutes (free tier).
Unlocks 10M+ monthly HuggingFace users.

sdk/gradio/app.py — Gradio blocks UI wrapping REST API
sdk/gradio/requirements.txt — gradio + requests only
sdk/gradio/README.md — HuggingFace deploy guide

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
hc-remember/hc-recall/hc-forget bash functions using /memory/search-flat.
Works across Claude Code, OpenAI Codex CLI, Aider — same memory server.

Launch posts updated: 0.6ms Linux latency, MCP story front-and-center,
ARM64 binary, v0.2 framing, updated submission checklist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
softprops/action-gh-release fails without a tag (workflow_dispatch testing).
Now: release event = upload to GitHub Release; dispatch = save artifact 3 days.
Learned from graphify: pip install + graphify install = zero friction.
Applied same pattern: pip install hipcortex && hipcortex install

hipcortex install:
  1. Detects OS + arch, downloads binary to ~/.hipcortex/
  2. Claude Code: writes ~/.claude/skills/hipcortex/SKILL.md
               + appends registration to ~/.claude/CLAUDE.md
               → /hipcortex works immediately, no MCP server needed
  3. Cursor: writes .cursor/mcp.json with MCP server config
  4. VS Code: updates settings.json if VS Code found

hipcortex start: runs the local server
hipcortex status: checks server health
hipcortex uninstall [--purge]: removes configuration

9/9 unit tests passing.
SKILL.md teaches Claude Code: /hipcortex remember, /hipcortex recall, /hipcortex forget

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
install: auto-starts server in background after binary download.
  No extra step needed — /hipcortex works immediately after install.

start: polls /health until ready, prints:
  ✓ HipCortex running on http://localhost:3030
  /hipcortex remember 'your note'   (Claude Code)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…est methods

MemoryRecord gains:
  confidence: f32 (0.0-1.0, default 1.0) — reliability signal
  source: Option<String> — who wrote this memory
  version: u32 — increments on in-place update

MemoryStore gains:
  find_by_id(id) — lookup single record
  update_record(...) — in-place update with version++ + audit entry
  find_latest(actor, action, limit) — most recent per actor+action pair
    (solves "what is the current value?" query pattern)
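
A Python sketch of the `find_latest` idea: filter by actor and action, then order by recency rather than by similarity. The `timestamp` field name is an assumption, and the Rust method may be implemented differently.

```python
def find_latest(records: list[dict], actor: str, action: str, limit: int = 1) -> list[dict]:
    """Return the most recent records for one (actor, action) pair, newest first."""
    matches = [r for r in records if r["actor"] == actor and r["action"] == action]
    # ISO 8601 UTC strings in a uniform format (or numeric epochs) sort chronologically.
    matches.sort(key=lambda r: r["timestamp"], reverse=True)
    return matches[:limit]
```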

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ce on add

PATCH /memory/update/:id — versioned in-place update (version++, audit entry)
  Fixes: no way to correct wrong facts without deleting entire actor

GET /memory/latest?actor=&action=&limit= — most recent per (actor,action) pair
  Fixes: "what is the current value?" returned stale records by similarity

POST /memory/add — now accepts confidence:[0-1] and source fields
  Fixes: no reliability signal on stored memories

Actor quota: HIPCORTEX_ACTOR_MAX_RECORDS env var -> 429 when exceeded
  Fixes: memory flood/poisoning attack vector
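
The actor quota can be sketched as a count check against the environment variable. Treating an unset variable as "no limit" and returning 200 on success are assumptions made for this illustration.

```python
import os
from typing import Mapping, Optional


def add_status(actor_record_count: int, env: Optional[Mapping[str, str]] = None) -> int:
    """Return the HTTP status an add request would get for this actor."""
    env = os.environ if env is None else env
    raw = env.get("HIPCORTEX_ACTOR_MAX_RECORDS")
    if raw is None:
        # No quota configured (assumed to mean unlimited).
        return 200
    # Reject once the actor already holds the maximum number of records.
    return 429 if actor_record_count >= int(raw) else 200
```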

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oherence inconsistencies

P0.2: POST /memory/add returns warning.possible_contradiction when >50% keyword
  overlap detected with existing same-actor records

P0.3: AddMemoryRequest gains decay_factor + decay_half_life_secs fields
  stored in metadata for per-record temporal decay control

P0.4: Webhook system — POST /webhooks, DELETE /webhooks/:id, GET /webhooks
  Fires async HTTP POST on memory.added and memory.deleted events

P1.1: GET /worldmodel/status — open stub (full inference in managed tier)
  Commercial boundary: interface public, Dirichlet-Multinomial algorithm private

P1.2: GET /coherence/inconsistencies — list active detected contradictions
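
The P0.2 keyword-overlap warning could be approximated as follows. How the server tokenizes text and which denominator it uses for "overlap" are not stated, so the regex tokenizer and the smaller-set denominator are assumptions.

```python
import re


def keywords(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_overlap(a: str, b: str) -> float:
    """Shared keywords as a fraction of the smaller keyword set."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))


def possible_contradiction(new_text: str, existing_texts: list[str], threshold: float = 0.5) -> bool:
    """Flag the new text if it overlaps any same-actor record by more than the threshold."""
    return any(keyword_overlap(new_text, old) > threshold for old in existing_texts)
```

High overlap only signals that two records talk about the same thing, which is why the response field is named `possible_contradiction` rather than asserting a conflict.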

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sprint A:
  AddMemoryRequest: tags + priority fields (pinned|high|normal|low)
  QueryMemoryParams: tags (comma-sep filter), priority, as_of (time-travel query)
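
A sketch of how the `tags` and `as_of` query parameters might filter records. Any-tag matching and "records written at or before `as_of`" are assumptions, as are the `timestamp` field name and the helper names.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing "Z" on older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def query(records: list[dict], tags: str = "", as_of: Optional[str] = None) -> list[dict]:
    """Filter by comma-separated tags (any match) and by a point in time."""
    wanted = {t.strip() for t in tags.split(",") if t.strip()}
    cutoff = parse_ts(as_of) if as_of else None
    selected = []
    for record in records:
        if wanted and not wanted & set(record.get("tags", [])):
            continue  # none of the requested tags is present
        if cutoff is not None and parse_ts(record["timestamp"]) > cutoff:
            continue  # written after the requested point in time
        selected.append(record)
    return selected
```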

Sprint B:
  POST /graph/node — create symbolic knowledge graph node
  POST /graph/edge — create relationship edge between nodes
  DELETE /graph/node/:id — remove node + incident edges
  Symbolic knowledge graph now fully writable via REST

Sprint C:
  POST /memory/consolidate — keyword dedup report (ML consolidation in managed tier)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rust (memory_record.rs):
  tags:Vec<String> — RAG filtering, categorization
  priority:String (pinned|high|normal|low) — pinned bypass decay

Rust (memory_store.rs):
  find_by_tags — any-tag match query
  search_semantic — pinned records always prepend results

Rust (web_server.rs - committed 3e42200):
  AddMemoryRequest: tags + priority fields
  QueryMemoryParams: tags + priority + as_of (time-travel)
  POST /graph/node + POST /graph/edge + DELETE /graph/node/:id
  POST /memory/consolidate (keyword dedup report)

Python client:
  add_memory(tags=[], priority="pinned")
  query_memory(tags=[], as_of="2026-01-15T00:00:00Z")
  create_node(), create_edge(), consolidate()

SKILL.md: tags/priority, graph write, time-travel, dedup guidance

Commercial boundary preserved:
  POST /llm/generate — NOT implemented (private moat)
  ML-based dedup — NOT implemented (private)
  Full world model inference — NOT implemented (private)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rt, HIPAA BAA

GET /metrics — Prometheus text exposition format
  hipcortex_records_total, hipcortex_actors_total, hipcortex_records_by_type
  Compatible with Grafana/Prometheus scraping (no auth required)
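
For orientation, the Prometheus text exposition format for these three metrics might be rendered like this. The metric names come from the commit message; the `gauge` type and the `type` label name are assumptions.

```python
def render_metrics(records_total: int, actors_total: int, by_type: dict[str, int]) -> str:
    """Render the three metrics in Prometheus text exposition format."""
    lines = [
        "# TYPE hipcortex_records_total gauge",
        f"hipcortex_records_total {records_total}",
        "# TYPE hipcortex_actors_total gauge",
        f"hipcortex_actors_total {actors_total}",
        "# TYPE hipcortex_records_by_type gauge",
    ]
    # One sample per record type, distinguished by a label (label name assumed).
    for record_type in sorted(by_type):
        lines.append(f'hipcortex_records_by_type{{type="{record_type}"}} {by_type[record_type]}')
    return "\n".join(lines) + "\n"
```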

hipcortex backup --output backup.json [--actor my-project]
hipcortex restore backup.json
  Backup via GET /memory/export, restore via POST /memory/bulk

deploy/helm/hipcortex/ — production Helm chart
  Deployment, Service, PVC, Secret, liveness/readiness probes
  helm install hipcortex ./deploy/helm/hipcortex

docs/compliance/HIPAA-BAA-template.md
  BAA template for healthcare enterprise customers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GET /ns — namespace list stub (full isolation in Enterprise tier)

POST /regulatory/hold — place regulatory hold (blocks GDPR forget for MiFID II)
DELETE /regulatory/hold/:actor — release hold
GET /regulatory/hold — list active holds
GDPR forget now returns 403 when hold is active
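
The interaction between a regulatory hold and a forget request can be sketched as a guard in front of the delete. The in-memory `store` and `holds` structures and the return shape are hypothetical stand-ins for the server state.

```python
def forget_actor(actor: str, store: dict[str, list], holds: set[str]) -> tuple[int, int]:
    """Return (HTTP status, records removed) for a forget request."""
    if actor in holds:
        # An active hold blocks deletion; nothing is removed.
        return 403, 0
    removed = len(store.pop(actor, []))
    return 200, removed
```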

POST /memory/search {max_tokens: 2000} — token-budget truncation
  Truncates results to fit LLM context window
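
A sketch of token-budget truncation over ranked results. The server's token counting method is not described, so the 4-characters-per-token estimate below is purely an assumption for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: about four characters per token, at least one."""
    return max(1, (len(text) + 3) // 4)


def truncate_to_budget(results: list[str], max_tokens: int) -> list[str]:
    """Keep results in rank order until the next one would exceed the budget."""
    kept: list[str] = []
    used = 0
    for text in results:
        cost = estimate_tokens(text)
        if used + cost > max_tokens:
            break
        kept.append(text)
        used += cost
    return kept
```

Stopping at the first result that does not fit preserves rank order; skipping it and trying smaller later results would pack the budget more tightly at the cost of dropping a higher-ranked memory.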

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Heuristic classification from plain text — no memory architecture required:
  record_type: Symbolic/Temporal/Reflexion/Procedural (keyword detection)
  priority: pinned/high/normal/low (constraint/decision/uncertainty patterns)
  ttl_seconds: auto-set (today/this week/conversation → working memory)
  confidence: 0.6-0.95 (uncertainty words → lower)
  actor: extracted from "Name verb" pattern
  action: extracted from common verb patterns
  tags: domain keyword mapping (database/auth/bug/infrastructure/...)
  working_memory: true when ttl_seconds <= 86400

Usage: POST /memory/ingest {"text": "Alice decided to use PostgreSQL"}
→ {record_type: "Symbolic", priority: "high", actor: "alice", tags: ["database","architecture"]}
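
A heavily simplified sketch of this kind of heuristic classification, covering only actor, confidence, and tags. The keyword lists are hypothetical and much smaller than anything a real classifier would use, so the output will not match the server's response field for field.

```python
import re

# Hypothetical keyword lists, for illustration only.
UNCERTAIN_WORDS = {"maybe", "might", "probably", "possibly", "unsure"}
TAG_KEYWORDS = {
    "database": {"postgresql", "mysql", "sqlite"},
    "auth": {"login", "oauth", "token"},
}


def classify(text: str) -> dict:
    """Derive actor, confidence, and tags from plain text."""
    lowered = {w.lower() for w in re.findall(r"[A-Za-z0-9]+", text)}
    # "Name verb" pattern: a capitalized first word followed by a lowercase word.
    match = re.match(r"([A-Z][a-z]+)\s+([a-z]+)", text)
    actor = match.group(1).lower() if match else None
    # Uncertainty words lower the confidence to the bottom of the stated range.
    confidence = 0.6 if lowered & UNCERTAIN_WORDS else 0.95
    tags = sorted(tag for tag, words in TAG_KEYWORDS.items() if lowered & words)
    return {"actor": actor, "confidence": confidence, "tags": tags}
```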

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
client.remember('text') wraps POST /memory/ingest
client.recall('query') returns plain string list from search-flat
client.remember_and_recall() stores + retrieves in one call

SKILL.md: /memory/ingest as default entry point for new integrations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tyle)

Replaces auto-detect with interactive terminal UI:
  Space toggle · Arrow navigate · Enter confirm · q quit

12 agents in registry:
  Claude Code  (SKILL.md native)
  Cursor       (MCP .cursor/mcp.json)
  Windsurf     (MCP ~/.codeium/windsurf/mcp_settings.json)
  VS Code      (MCP settings.json)
  Cline        (MCP .cline/mcp.json)
  RooCode      (MCP .roo/mcp.json)
  Continue.dev (guide URL)
  GitHub Copilot (guide URL)
  OpenAI Codex CLI (shell wrapper guide)
  Aider        (shell wrapper guide)
  Gemini CLI   (guide URL)
  Amazon Q     (guide URL)

ASCII splash screen. --yes flag for non-interactive CI.
Windsurf MCP support added (_install_windsurf).
_install_mcp_generic for arbitrary MCP config paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…AutoGen, etc.)

Wizard now has 2 sections:
  Coding Assistants (12): Claude Code, Cursor, Windsurf, VS Code, Cline,
    RooCode, Continue, GitHub Copilot, Codex CLI, Aider, Gemini, Amazon Q
  Agent Frameworks (7): LangChain, CrewAI, AutoGen, LlamaIndex, Pydantic AI,
    n8n/Make.com, DSPy

For frameworks: writes a ready-to-import starter file in cwd:
  hipcortex_langchain.py   — HipCortexMemory + AsyncHipCortexMemory
  hipcortex_crewai.py      — RememberTool + RecallTool + ForgetTool
  hipcortex_autogen.py     — HipCortexAutoGenMemory (0.4 + 0.3 legacy)
  hipcortex_llamaindex.py  — HipCortexChatStore + StorageContext
  hipcortex_pydantic_ai.py — remember/recall tool functions
  hipcortex_n8n_curl.sh    — curl examples + OpenAPI import hint
  hipcortex_dspy.py        — DSPy trace storage helpers

Auto-detects installed frameworks from requirements.txt/pyproject.toml
  (shows [detected] badge for installed ones)

Section dividers in wizard UI. Arrow keys skip sections.
--yes flag configures all non-guide items.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>