Self-hosted AI Agent Memory + Code Intelligence Platform
One MCP endpoint for every AI agent — persistent memory, AST-aware code search, quality enforcement.
Why · Features · Benchmarks · Quick Start · MCP Tools · Docs
Every AI coding agent works in isolation. Switch IDE, switch machine, switch project — the agent starts from zero. Your team's hard-won decisions, bug fixes, and architectural patterns live and die inside individual chat sessions.
Cortex Hub is a self-hosted backend that gives every agent on your team persistent memory, shared knowledge, and cross-project intelligence via a single MCP endpoint:
Claude Code Cursor Antigravity Codex Gemini
│ │ │ │ │
└──────────┴────────────┴───────────┴─────────┘
│
┌───────▼────────┐
│ Cortex Hub │ ← one MCP endpoint for everything
│ │
│ Memory │ Agents remember across sessions
│ Knowledge │ Team shares bug fixes, patterns
│ Code Intel │ Search any project's code by name
│ Quality Gates │ Enforce build/lint before commit
└────────────────┘
New machine, instant context:
You: (fresh laptop, just cloned your project)
> /install ← one command, MCP configured
> /cs ← session starts
Cortex: "Resuming from last session. You were working on JWT rotation
for the auth service. The team decided RS256 with 90-day expiry
(stored by Dev B on March 15). Related: payment service uses
the same pattern — see knowledge doc kdoc-4a2b."
You didn't bring any notes. You didn't clone the other repos.
Cortex remembered everything.
Cross-project code search without cloning:
You: "How does the backend handle user auth?"
Agent: cortex_code_search(query: "user auth", repo: "my-backend")
→ validateToken (Function) — src/middleware/auth.ts
→ refreshSession (Function) — src/services/session.ts
You never cloned my-backend. Cortex indexed it server-side
and searched the AST graph across 15 repos in 20ms.
Team knowledge that compounds:
Session 1 (Dev A, Claude Code):
Fixed auth bug → stored knowledge: "JWT needs audience claim for mobile"
Session 2 (Dev B, Cursor, different project):
Hit same auth issue → cortex_knowledge_search("JWT mobile") → instant fix
Session 3 (New hire, day 1):
/cs → sees all team decisions, patterns, bug fixes
Productive from hour one, not week three.
Retrieval quality: 96.0% R@5 on LongMemEval (500 questions, 6 categories) — matching MemPalace (96.6%) with $0 cost, no API key, fully offline. MemPalace requires paid OpenAI embeddings; Cortex runs the model in-process for free.
Zero data leaves your infrastructure. Self-hosted on a $5/month VPS behind Cloudflare Tunnel. Handles 5+ concurrent agents. Local embedding by default — no API keys, no network, no rate limits.
| Capability | Tool | What It Does |
|---|---|---|
| Multi-project search | cortex_code_search |
Omit repo to scan ALL indexed projects in parallel — ranked hints |
| 360° symbol context | cortex_code_context |
Callers, callees, execution flows for any function/class |
| Blast radius | cortex_code_impact |
See downstream impact before editing |
| Pre-commit risk | cortex_detect_changes |
Analyze uncommitted changes, find affected flows |
| Direct graph queries | cortex_cypher |
Cypher against the AST knowledge graph |
| Multi-repo registry | cortex_list_repos |
All indexed repos discoverable by name or slug |
| Auto-reindex | cortex_code_reindex |
Trigger after pushes |
| Read source | cortex_code_read |
Fetch raw file content from any indexed repo |
Smart cross-project search (added Apr 2026): call cortex_code_search(query: "...") without specifying repo and Cortex fans out across every indexed repo, runs both flow + symbol search, and returns a ranked list with refine hints. No more list_repos → guess → retry loops.
Memory (per-agent, semantic recall across sessions):
cortex_memory_store / cortex_memory_searchKnowledge Base (team-wide, structured):
cortex_knowledge_store / cortex_knowledge_searchfact, event, discovery, preference, advice, generalvalid_from / invalidated_at — query "what was true on date X"GET /api/knowledge/timeline — chronological exploration// Store a fact with validity window
cortex_knowledge_store({
title: "JWT secret rotation policy",
content: "Rotate every 90 days, ...",
hallType: "fact",
validFrom: "2026-01-01"
})
// Later, when policy changes:
POST /api/knowledge/{id}/invalidate
body: { supersededBy: "new-doc-id" }
Inspired by HKUDS/OpenSpace — Cortex captures patterns from completed work automatically:
task.complete and session_end — if execution log shows a non-trivial workflow, an LLM extracts it as a reusable recipeselection_count, applied_count, completion_count, fallback_count per docvector_similarity * 0.6 + effective_rate * 0.3 + recency * 0.1 (only when selection_count >= 3)fallback_rate > 0.4 flagged for LLM rewrite via /health-checkDashboard /knowledge page shows the Recipe Health Panel — capture pipeline status, quality distribution, origin breakdown (manual/captured/derived/fixed), recent capture log.
model: "auto" selects tier based on task4-dimension scoring after every session:
| Dimension | Weight | Measures |
|---|---|---|
| Build | 25 | Code compiles |
| Regression | 25 | No existing tests broken |
| Standards | 25 | Follows conventions |
| Traceability | 25 | Changes linked to requirements |
Plus plan quality (cortex_plan_quality) — 8-criterion plan assessment before execution.
/cs blocks edits until knowledge + memory recall calledReproducible retrieval benchmarks against industry-standard datasets.
| Cortex Hub | MemPalace | |
|---|---|---|
| R@5 | 96.0% | 96.6% |
| R@10 | 97.8% | 98.2% |
| NDCG@10 | 1.44 | 0.889 |
| Embedding | Local (in-process, free) | OpenAI API (paid) |
| API key needed | No | Yes |
| Embedding speed | ~10ms/text | ~600ms/text |
| Search (500 queries) | 52.6s | ~5 min |
| Cost per run | $0 | ~$2-5 |
Cortex matches MemPalace within 0.6 points on R@5 — while being free, offline, and 60x faster per embedding. NDCG@10 is 62% higher: when Cortex finds the answer, it places it at #1, not just somewhere in top 5.
MemPalace requires a paid OpenAI API key for embeddings. Cortex runs Xenova/all-MiniLM-L6-v2 in-process — zero network, zero cost, zero rate limits.
# Run benchmark (no API key needed)
pnpm --filter @cortex/benchmarks bench:longmemeval
# Cleanup test data
pnpm --filter @cortex/benchmarks bench:longmemeval --cleanup
See benchmarks/README.md for full methodology, per-category breakdown, and results log.
Cortex supports two interchangeable embedding backends:
| Provider | Model | Dim | Speed | Cost | Quality |
|---|---|---|---|---|---|
local (default) |
Xenova/all-MiniLM-L6-v2 |
384 | ~10-50ms in-process | Free | 96.7% R@5 |
gemini |
gemini-embedding-001 |
768 | ~600ms/text via API | $$ | 96.7% R@5 |
Local mode (default) runs the model in-process via @huggingface/transformers — no network, no API key, no rate limits, fully offline. Switch to Gemini via EMBEDDING_PROVIDER=gemini env var if needed.
graph TB
subgraph Agents["AI Agents (any MCP client)"]
AG["Antigravity"]
CC["Claude Code"]
CU["Cursor"]
WS["Windsurf"]
CX["Codex"]
end
subgraph Gateway["Hub MCP Server"]
AUTH["API Key Auth"]
ROUTER["Tool Router (25 tools)"]
TEL["Telemetry + Hints Engine"]
end
subgraph Services["Backend Services (Docker)"]
direction LR
GN["GitNexus<br/>AST Graph :4848"]
QD["Qdrant<br/>Vectors :6333"]
DB["SQLite<br/>WAL Mode"]
CLIP["CLIProxy<br/>LLM Gateway :8317"]
end
subgraph Frontend["Dashboard"]
DASH["Next.js 15<br/>(13 pages, static export)"]
end
Agents --> AUTH --> ROUTER --> TEL
TEL --> Services
DASH --> DB
Internet
├── cortex-mcp.jackle.dev ──── Hub MCP Server
└── hub.jackle.dev ─────────── Dashboard UI
│
Cloudflare Tunnel
│
┌─────────┼─────────┐
│ Docker Compose │
│ ├─ dashboard-web │ Nginx (UI + API proxy)
│ ├─ cortex-api │ Internal API + mem9
│ ├─ cortex-mcp │ 18+ MCP tools
│ ├─ qdrant │ vectors + knowledge
│ ├─ gitnexus │ AST code graph
│ ├─ llm-proxy │ CLIProxy
│ └─ watchtower │ auto-update
└────────────────────┘
Zero open ports on host.
Cortex exposes 25 tools via a single MCP endpoint:
| # | Tool | Purpose |
|---|---|---|
| 1 | cortex_session_start |
Start session, get project context + relevant knowledge |
| 2 | cortex_session_end |
Close session with compliance grade |
| 3 | cortex_changes |
Check unseen changes from other agents |
| 4 | cortex_code_search |
Multi-project AST/symbol search with smart fan-out |
| 5 | cortex_code_context |
360° symbol view |
| 6 | cortex_code_impact |
Blast radius analysis |
| 7 | cortex_code_read |
Read raw source from any indexed repo |
| 8 | cortex_code_reindex |
Trigger re-indexing |
| 9 | cortex_list_repos |
List indexed repos with names + slugs |
| 10 | cortex_cypher |
Direct graph queries |
| 11 | cortex_detect_changes |
Pre-commit risk analysis |
| 12 | cortex_memory_search |
Recall agent memories |
| 13 | cortex_memory_store |
Store findings |
| 14 | cortex_knowledge_search |
Search knowledge base (with hall_type + asOf filters) |
| 15 | cortex_knowledge_store |
Store knowledge with hall type + validity |
| 16 | cortex_quality_report |
Report build/test/lint results |
| 17 | cortex_plan_quality |
8-criterion plan assessment |
| 18 | cortex_tool_stats |
Token savings + tool usage analytics |
| 19 | cortex_health |
Backend service health check |
Cross-project search just works — no repo lookup needed:
cortex_code_search(query: "auth middleware jwt") // scans ALL projects
cortex_code_search(query: "auth middleware jwt", repo: "cortex-hub") // narrow to one
Full API reference:
docs/api/hub-mcp-reference.md
# macOS / Linux — interactive wizard
curl -fsSL https://raw.githubusercontent.com/lktiep/cortex-hub/master/scripts/run-agent.sh | bash -s -- launch
# Headless daemon with preset
curl -fsSL https://raw.githubusercontent.com/lktiep/cortex-hub/master/scripts/run-agent.sh | bash -s -- start --daemon --preset fullstack
# Windows
iwr -useb "https://raw.githubusercontent.com/lktiep/cortex-hub/master/scripts/run-agent.ps1" -OutFile $env:TEMP\run-agent.ps1
& $env:TEMP\run-agent.ps1 start
# macOS / Linux
curl -fsSL "https://raw.githubusercontent.com/lktiep/cortex-hub/master/scripts/install.sh" | bash
# Windows
iwr -useb "https://raw.githubusercontent.com/lktiep/cortex-hub/master/scripts/install.ps1" -OutFile $env:TEMP\install.ps1; & $env:TEMP\install.ps1
Or inside Claude Code: type /install
The installer:
.claude/hooks/*).gitignore entries for generated filesgit clone https://github.com/lktiep/cortex-hub.git
cd cortex-hub
corepack enable && pnpm install
cp .env.example .env # add API keys
cd infra && docker compose up -d
Cortex includes an experimental multi-agent orchestration layer for cross-IDE task delegation. It is not feature-complete — agents can already create/pickup tasks, but autonomous strategy execution and smart agent matching are still WIP.
📖 See docs/conductor.md for current capabilities, limitations, and the rough edges to expect.
| Layer | Technology | Role |
|---|---|---|
| MCP Server | Hono on Node.js | Streamable HTTP + JSON-RPC, 25 tools |
| Code Intel | GitNexus | AST parsing, execution flow, Cypher graph |
| Embeddings | mem9 + Qdrant | Vector search with semantic recall |
| LLM Proxy | CLIProxy | Multi-provider with fallback chains |
| App DB | SQLite (WAL) | Sessions, quality, usage, knowledge metadata |
| API | Hono | Dashboard backend + mem9 in-process |
| Frontend | Next.js 15 + React 19 | Static export, served by nginx |
| Infra | Docker Compose | Service orchestration |
| Tunnel | Cloudflare Tunnel | Zero open ports |
| Hooks | Lefthook | Stack-aware git hooks |
| Monorepo | pnpm + Turborepo | Build orchestration |
cortex-hub/
├── apps/
│ ├── hub-mcp/ # MCP Server — 25 tools
│ ├── dashboard-api/ # Hono API + mem9 + recipe pipeline
│ └── dashboard-web/ # Next.js dashboard (13 pages)
├── packages/
│ ├── shared-types/ # TS type definitions
│ ├── shared-utils/ # Logger, common utilities
│ └── shared-mem9/ # Embedding pipeline + vector store
├── benchmarks/ # Reproducible benchmarks (LongMemEval, etc.)
├── infra/
│ ├── docker-compose.yml # Full stack
│ └── Dockerfile.* # Per-service builds
├── scripts/
│ ├── install.sh / .ps1 # Unified installer
│ ├── run-agent.sh / .ps1 # Agent daemon launcher
│ └── bootstrap.sh # Admin setup
├── docs/
│ ├── architecture/ # Design docs (recipe, conductor, gateway)
│ ├── conductor.md # ⚠️ Multi-agent orchestration (experimental)
│ └── guides/ # Onboarding, installation, use cases
├── templates/
│ ├── skills/install/ # /install slash command
│ └── workflows/ # Workflow templates (/code, /continue)
└── .cortex/ # Project profile + agent identity
| Document | Description |
|---|---|
docs/architecture/overview.md |
System architecture with diagrams |
docs/architecture/recipe-system.md |
Recipe System (auto-learning from execution) |
docs/architecture/llm-gateway.md |
LLM Gateway design |
docs/architecture/agent-quality-strategy.md |
Quality gates + scoring |
docs/conductor.md |
Multi-agent conductor (experimental) |
docs/api/hub-mcp-reference.md |
Full MCP tool API reference |
benchmarks/README.md |
Benchmark methodology + results |
docs/guides/installation.md |
Full installation guide |
docs/guides/use-cases.md |
Use cases + system requirements |
You maintain 5 repos across 3 languages. You fix a deployment bug in project A. Next week, project B has the same issue. Without Cortex, you debug from scratch. With Cortex:
cortex_knowledge_search("docker nginx 502 after restart")
→ "Nginx caches DNS at startup. Fix: resolver 127.0.0.11 valid=5s"
(stored 6 days ago by you, in project A)
Time saved: 30 min per known bug. Across 5 projects, that's hours/week.
Dev A refactors the auth middleware on Monday. Dev B starts a feature on Wednesday using the old auth pattern. Without Cortex, B's code breaks and nobody knows why. With Cortex:
/cs → "Dev A refactored auth middleware on Monday. New pattern uses
middleware.authenticate() instead of req.checkAuth(). See
knowledge doc kdoc-8f2a for migration steps."
Zero "who changed this?" conversations. Zero broken PRs from stale patterns.
Day 1. New hire clones the repo. Runs /install. Opens Claude Code.
/cs → Cortex loads:
- 47 team knowledge docs (deployment patterns, API conventions, known gotchas)
- Recent session summaries (what's being worked on NOW)
- Code intelligence across all indexed repos
New hire: "How does the payment flow work?"
cortex_code_search(query: "payment flow checkout")
→ 3 projects with relevant code, ranked by relevance, with file paths
Productive on day 1, not week 3. No "ask Dave, he knows how it works."
Working on your Mac at the office. Continue on Windows VPS at home.
Same API key → same memory → same knowledge → same session context.
/cs resumes exactly where you left off.
No git stash, no notes, no "what was I doing?"
Debug in Claude Code (deep reasoning). UI work in Cursor (fast iteration). Code review in Antigravity (visual). All three share the same Cortex backend:
Claude Code: stores finding → "Race condition in WebSocket reconnect"
Cursor: picks up finding → applies fix in the UI component
Antigravity: reviews the fix → stores quality feedback
Every agent builds on what the others learned. No repeated explanations.
| Resource | Minimum | Recommended | Notes |
|---|---|---|---|
| CPU | 2 vCPU | 4 vCPU | Qdrant vector search is CPU-bound |
| RAM | 4 GB | 8 GB | Qdrant + GitNexus + Node services |
| Disk | 20 GB | 50 GB | Vector indices grow with knowledge |
| OS | Ubuntu 22.04+ | Ubuntu 24.04 LTS | Any Linux with Docker 24+ |
Best value: Hetzner CX22 (~$4.50/mo) handles 3-5 agents.
| Component | Cost | Notes |
|---|---|---|
| Linux server | $4.50/mo+ | Hetzner CX22 minimum |
| Cloudflare Tunnel | Free | No open ports |
| All services | Free | Self-hosted in Docker |
| LLM API calls | Pay-per-use | Your own keys, budget-controlled |
| Total | ~$5/mo + LLM usage |
See Contributing Guide for development setup, commit conventions, and code standards.
MIT © Cortex Hub Contributors