Open source · MIT · MCP Registry · Docker · PyPI

Give your coding agents memory that compounds.

MCP Context Server turns AI coding agents from amnesiacs into collaborators with a persistent, searchable, shared memory. Every session, every decision, every past solution — accumulating, cross-linked, instantly retrievable — across agents, projects, and context compactions.

One command · macOS · Docker Desktop required
$ CLAUDE_CODE_TOOLBOX_ENV_CONFIG='https://raw.githubusercontent.com/alex-feel/mcp-context-server/refs/heads/main/agents/claude-code/environment-docker-ollama.yaml' CLAUDE_CODE_TOOLBOX_SKIP_INSTALL=1 bash -c "$(curl -fsSL https://raw.githubusercontent.com/alex-feel/claude-code-toolbox/main/scripts/macos/setup-environment.sh)"
~2-5 min setup · Auto-configures Claude Code · Cursor, Codex, LangChain: connect manually via docs
02 — The amnesia tax
Before / After

Context compaction shouldn’t cost you an afternoon.

Your agent had a plan. The window compacted. Now it’s asking what you were working on. Sound familiar?

Without the server

Refactor, forgotten.

Agent builds a rich plan, starts executing, hits compaction. Plan evaporates. You spend the next five minutes re-explaining, re-pasting, re-hoping.

You · 10:02 Let’s refactor the billing module…
Agent I’ll read the files, map the domain, and draft a step-by-step plan across 14 files.
You · 10:41 (after /compact) Continue with the plan.
Agent I’m not sure what we were working on. Can you summarize the plan again?
With MCP Context Server

Refactor, continuous.

The agent stores its own plan in a persistent thread. After compaction it reads it back — instantly — and keeps executing. The plan lives outside the window.

You · 10:02 Plan the refactor and save it to the context server.
Agent Stored plan #a1f…c2 in thread billing-refactor: 14 files, 7 steps, 3 tests.
You · 10:41 (after /compact) Proceed.
Agent Re-read plan from server. Resuming at step 4 of 7. No summary needed.
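
To make the idea concrete, here is a minimal sketch of a thread-scoped store that survives a context reset. It is an illustration only: the table layout and the `store` / `read_thread` helpers are hypothetical and are not the server's schema or API.

```python
import sqlite3

# Hypothetical stand-in for a persistent, thread-scoped store.
# NOT the server's schema or API; every name here is illustrative.
conn = sqlite3.connect(":memory:")  # a real setup would use a file or Postgres
conn.execute(
    "CREATE TABLE entries ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "thread_id TEXT NOT NULL, "
    "text TEXT NOT NULL)"
)


def store(thread_id: str, text: str) -> int:
    """Append an entry to a thread and return its ID."""
    cur = conn.execute(
        "INSERT INTO entries (thread_id, text) VALUES (?, ?)", (thread_id, text)
    )
    conn.commit()
    return cur.lastrowid


def read_thread(thread_id: str) -> list[str]:
    """Return every entry in a thread, oldest first."""
    rows = conn.execute(
        "SELECT text FROM entries WHERE thread_id = ? ORDER BY id", (thread_id,)
    )
    return [text for (text,) in rows]


# Before compaction: the agent saves its plan and its progress.
plan_id = store("billing-refactor", "Plan: 14 files, 7 steps, 3 tests.")
store("billing-refactor", "Progress: steps 1-3 done.")

# After compaction: the context window is empty, but the thread is intact.
recovered = read_thread("billing-refactor")
print(recovered)
```

The point of the sketch is only that the plan is keyed by thread rather than held in the conversation, so any later session can read it back by name.
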
03 — Use cases
What it unlocks

From scratch pad to compounding knowledge base.

01

Keep agents on track across compactions.

Plan once, persist it, resume forever. The agent retrieves its own roadmap after every context reset and picks up exactly where it left off.

02

Lossless multi-agent handoffs.

The orchestrator points a subagent at a thread; the subagent reads the full original. No telephone-game summaries. Nothing compressed at the boundary.

03

Debug with full history.

Two weeks later, the agent can replay why a decision was made — original request, research, tradeoffs.

04

Cross-project knowledge base.

A pattern solved in project A is available to agents in project B — automatically searchable across your whole workflow.

05

Dedicated task threads.

Spin up a tasks or knowledge-base thread. Agents read and write there — a structured store that fits your workflow.

06

“Just remember this.” — a persistent scratch pad that never forgets and is always searchable.

Meeting notes, architecture decisions, review findings, requirements. Store now, retrieve later — same session, next session, different project.

04 — Quick start
~2-5 minutes

One command. Three platforms. Zero config.

Downloads a Docker Compose stack (Postgres + pgvector, Ollama, MCP Context Server), starts it, and fully configures Claude Code — hooks, skills, rules included.

CLAUDE_CODE_TOOLBOX_ENV_CONFIG='https://raw.githubusercontent.com/alex-feel/mcp-context-server/refs/heads/main/agents/claude-code/environment-docker-ollama.yaml' \
CLAUDE_CODE_TOOLBOX_SKIP_INSTALL='1' \
bash -c "$(curl -fsSL https://raw.githubusercontent.com/alex-feel/claude-code-toolbox/main/scripts/macos/setup-environment.sh)"
local stack · Postgres + pgvector + Ollama + server
1

Run the one-liner

Pulls ~1.2 GB of Ollama models on first start. Everything else is handled.

2

Claude Code auto-configured

Hooks, skills and rules are wired so agents learn when to store and when to retrieve. No manual setup.

+

Prefer another path?

Other MCP clients: Point Cursor, Codex, or LangChain at http://localhost:8000/mcp
PyPI: pip install mcp-context-server
Backends: SQLite (zero-config) or PostgreSQL
Deploy: Docker Compose · Kubernetes · Helm
Full documentation ↗

05 — Under the hood
Production-grade

Built for real workflows, not just demos.

Architecture

A thoughtful stack: FastMCP, thread-scoped storage, pluggable everything.

[Architecture diagram: agents (Claude Code, Cursor / Codex, LangChain) connect to MCP Context Server (FastMCP · 13 tools: store, search, threads, rerank), which offers hybrid · fts · semantic search and metadata · tags · threads on top of SQLite · Postgres storage, Ollama · OpenAI embeddings, and FlashRank cross-encoder rerank.]
Retrieval

Full-text, semantic, hybrid.

Stemming, ranking, boolean queries; pgvector similarity; Reciprocal Rank Fusion when you want both — plus cross-encoder reranking on top.

FTS5 / tsvector · RRF · pgvector
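
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the general technique: each result list contributes `1 / (k + rank)` per document, and the sums decide the fused order. The function name, the sample IDs, and `k = 60` (a commonly used constant) are assumptions for illustration; the server's actual parameters may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list (generic RRF sketch)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a full-text query and a semantic query.
full_text_hits = ["a", "b", "c"]
semantic_hits = ["c", "a", "d"]

fused = reciprocal_rank_fusion([full_text_hits, semantic_hits])
print(fused)
```

Documents that appear in both lists ("a" and "c") rise above documents found by only one method, which is the reason to fuse rather than pick a single ranking.
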
Filtering

16 metadata operators.

Filter by nested JSON paths, tags, date ranges, and indexed fields. GIN indexes on array/object metadata for speed.

$eq $ne $in $nin · $gte $lte · $contains · +9 more
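
As a rough illustration of operator-based filtering on nested JSON paths, the sketch below evaluates the seven operators named above against a metadata dictionary. The semantics are assumed to be MongoDB-like, and the `matches` helper, the dotted-path syntax, and the handling of missing fields are hypothetical; consult the API reference for the server's real behavior.

```python
from typing import Any

_MISSING = object()  # sentinel for "path not present"


def get_path(doc: dict, path: str) -> Any:
    """Resolve a dotted path such as 'review.severity' in nested dicts."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


# Assumed MongoDB-style semantics; not taken from the server's source.
OPERATORS = {
    "$eq": lambda value, arg: value == arg,
    "$ne": lambda value, arg: value != arg,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
    "$gte": lambda value, arg: value >= arg,
    "$lte": lambda value, arg: value <= arg,
    "$contains": lambda value, arg: arg in value,
}


def matches(metadata: dict, path: str, op: str, arg: Any) -> bool:
    """Return True if the value at `path` satisfies `op` with `arg`."""
    value = get_path(metadata, path)
    if value is _MISSING:
        # Assumption: a missing field satisfies only the negative operators.
        return op in ("$ne", "$nin")
    return OPERATORS[op](value, arg)


entry = {
    "status": "open",
    "priority": 3,
    "review": {"severity": "high", "labels": ["security", "billing"]},
}

print(matches(entry, "review.severity", "$eq", "high"))
print(matches(entry, "priority", "$gte", 2))
print(matches(entry, "review.labels", "$contains", "security"))
print(matches(entry, "status", "$in", ["closed", "wontfix"]))
```
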
Backends

SQLite or PostgreSQL.

Zero-config for solo work; 10× write throughput with Postgres for teams. Same API, swap with one env var.

WAL mode · MVCC · Supabase-ready
Embeddings

5 providers, one config.

Ollama (local, default), OpenAI, Azure, HuggingFace, Voyage.

local-first · 1024-dim default
Summaries

LLM-generated, source-aware.

Every stored entry gets a concise summary via Ollama, OpenAI, or Anthropic.

on by default · 3 providers · source-aware prompts
Deploy

From laptop to cluster.

Docker Compose for local; Helm chart for Kubernetes; bearer-token auth for the HTTP transport.

stdio · HTTP · Helm · OAuth
Scale

Chunking, reranking, indexing.

Long docs chunked for semantic search. Results over-fetched, then reranked with ms-marco-MiniLM for precision.

ENABLE_CHUNKING · flashrank
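
The two ideas in this card, chunking and over-fetch-then-rerank, can be sketched generically as follows. Everything here is an assumption for illustration: the chunk size and overlap, the `overfetch` multiplier, and both toy scorers. In particular, `precise_score` is a word-overlap stand-in for a real cross-encoder such as ms-marco-MiniLM, not a call into FlashRank.

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def words(s):
    return set(s.lower().split())


def cheap_score(query, doc):
    """First-stage score: number of shared words (fast, coarse)."""
    return len(words(query) & words(doc))


def precise_score(query, doc):
    """Toy stand-in for a cross-encoder: Jaccard similarity of word sets."""
    q, d = words(query), words(doc)
    return len(q & d) / len(q | d)


def search(query, docs, limit=1, overfetch=2):
    """Fetch limit * overfetch candidates cheaply, then rerank and trim."""
    pool = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)
    pool = pool[: limit * overfetch]
    return sorted(pool, key=lambda d: precise_score(query, d), reverse=True)[:limit]


docs = [
    "billing refactor plan with notes on invoices and taxes",
    "refactor billing",
    "billing invoice bug",
    "lunch menu",
]

print([len(c) for c in chunk_text("x" * 500)])
print(search("billing refactor", docs, limit=1, overfetch=2))
print(search("billing refactor", docs, limit=1, overfetch=1))
```

With `overfetch=1` the reranker only ever sees the first-stage winner, so it cannot correct the coarse ranking; with `overfetch=2` the tighter match gets a chance to be promoted.
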
06 — API surface
13 MCP tools

A small, sharp toolbox your agent actually uses.

Core
store_context

Persist a text or multimodal entry to a thread.

Core
search_context

Browse and filter entries by thread, metadata, tags, dates.

Core
get_context_by_ids

Retrieve full, untruncated entries by ID.

Core
update_context

Patch text, metadata, tags, or images in place.

Core
delete_context

Remove stale or superseded entries.

Core
list_threads

Enumerate threads with entry counts and timestamps.

Core
get_statistics

Server health, feature status, and usage metrics.

Batch
store_context_batch

Persist many entries in one call.

Batch
update_context_batch

Bulk patches across entries.

Batch
delete_context_batch

Multi-criteria bulk deletion (IDs, threads, age).

Docs
api-reference.md

Parameters, return shapes, and examples for every tool.

Read the reference ↗
07 — Compatibility
MCP-native

Works with anything that speaks MCP.

Point any MCP-compatible client at http://localhost:8000/mcp. No lock-in, no custom protocol.
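
Because MCP is built on JSON-RPC 2.0, a tool invocation is an ordinary `tools/call` message. The sketch below only builds such a message and does not send it. The argument names (`thread_id`, `text`) are assumptions that this page does not confirm, so check the API reference before relying on them. A real client would also perform the MCP initialization handshake first, which is best left to an MCP client library.

```python
import json

MCP_URL = "http://localhost:8000/mcp"  # endpoint quoted on this page


def tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 'tools/call' request as defined by the MCP spec."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Argument names below are hypothetical; see api-reference.md for the real ones.
payload = tool_call(
    1,
    "store_context",
    {"thread_id": "billing-refactor", "text": "Plan: 14 files, 7 steps, 3 tests."},
)

body = json.dumps(payload)
print(f"POST {MCP_URL}")
print(body)
```
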

Claude Code
MCP · http · auto-configured
Cursor
MCP · http
Codex
MCP · http
LangChain
MCP · http
Any MCP client
spec-compliant
Your stack
MCP · Docker · PyPI
One command away

Stop re-explaining. Start compounding.

The project is open source under the MIT license. Star it, fork it, contribute to it — or just run the command and feel the difference on your next coding session.