Open source · MIT · MCP Registry · Docker · PyPI

Give your coding
agents memory
that compounds.

MCP Context Server turns AI coding agents from amnesiacs into collaborators with a persistent, searchable, shared memory. Every session, every decision, every past solution — accumulating, cross-linked, instantly retrievable — across agents, projects, and context compactions.

One command · macOS · Docker Desktop required

CLAUDE_CODE_TOOLBOX_ENV_CONFIG='https://raw.githubusercontent.com/alex-feel/mcp-context-server/refs/heads/main/agents/claude-code/environment-docker-ollama.yaml' CLAUDE_CODE_TOOLBOX_SKIP_INSTALL=1 bash -c "$(curl -fsSL https://raw.githubusercontent.com/alex-feel/claude-code-toolbox/main/scripts/macos/setup-environment.sh)"

~2-5 min setup Auto-configures Claude Code Cursor · Codex · LangChain — connect manually via docs

Thread-scoped memory · live 0 entries · 0 links

02 — The amnesia tax

Before / After

Context compaction shouldn’t cost you an afternoon.

Your agent had a plan. The window compacted. Now it’s asking what you were working on. Sound familiar?

Without the server

Refactor, forgotten.

Agent builds a rich plan, starts executing, hits compaction. Plan evaporates. You spend the next five minutes re-explaining, re-pasting, re-hoping.

You · 10:02 Let’s refactor the billing module…

Agent I’ll read the files, map the domain, and draft a step-by-step plan across 14 files.

You · 10:41 · after /compactContinue with the plan.

Agent I’m not sure what we were working on. Can you summarize the plan again?

With MCP Context Server

Refactor, continuous.

The agent stores its own plan in a persistent thread. After compaction it reads it back — instantly — and keeps executing. The plan lives outside the window.

You · 10:02 Plan the refactor and save it to the context server.

Agent Stored plan #a1f…c2 in thread billing-refactor: 14 files, 7 steps, 3 tests.

You · 10:41 · after /compactProceed.

Agent Re-read plan from server. Resuming at step 4 of 7. No summary needed.

03 — Use cases

What it unlocks

From scratch pad to compounding knowledge base.

Keep agents on track across compactions.

Plan once, persist it, resume forever. The agent retrieves its own roadmap after every context reset and picks up exactly where it left off.

Lossless multi-agent handoffs.

Orchestrator points subagent at a thread; subagent reads the full original. No telephone-game summaries. Nothing compressed at the boundary.

Debug with full history.

Two weeks later, the agent can replay why a decision was made — original request, research, tradeoffs.

Cross-project knowledge base.

A pattern solved in project A is available to agents in project B — automatically searchable across your whole workflow.

Dedicated task threads.

Spin up a tasks or knowledge-base thread. Agents read and write there — a structured store that fits your workflow.

“Just remember this.” — a persistent scratch pad that never forgets and is always searchable.

Meeting notes, architecture decisions, review findings, requirements. Store now, retrieve later — same session, next session, different project.

04 — Quick start

~2-5 minutes

One command. Three platforms. Zero config.

Downloads a Docker Compose stack (Postgres + pgvector, Ollama, MCP Context Server), starts it, and fully configures Claude Code — hooks, skills, rules included.

CLAUDE_CODE_TOOLBOX_ENV_CONFIG='https://raw.githubusercontent.com/alex-feel/mcp-context-server/refs/heads/main/agents/claude-code/environment-docker-ollama.yaml' \
CLAUDE_CODE_TOOLBOX_SKIP_INSTALL='1' \
bash -c "$(curl -fsSL https://raw.githubusercontent.com/alex-feel/claude-code-toolbox/main/scripts/macos/setup-environment.sh)"

CLAUDE_CODE_TOOLBOX_ENV_CONFIG='https://raw.githubusercontent.com/alex-feel/mcp-context-server/refs/heads/main/agents/claude-code/environment-docker-ollama.yaml' \
CLAUDE_CODE_TOOLBOX_SKIP_INSTALL='1' \
bash -c "$(curl -fsSL https://raw.githubusercontent.com/alex-feel/claude-code-toolbox/main/scripts/linux/setup-environment.sh)"

powershell -NoProfile -NoExit -ExecutionPolicy Bypass -Command "`$env:CLAUDE_CODE_TOOLBOX_ENV_CONFIG='https://raw.githubusercontent.com/alex-feel/mcp-context-server/refs/heads/main/agents/claude-code/environment-docker-ollama.yaml'; `$env:CLAUDE_CODE_TOOLBOX_SKIP_INSTALL='1'; iex (irm 'https://raw.githubusercontent.com/alex-feel/claude-code-toolbox/main/scripts/windows/setup-environment.ps1')"

Run the one-liner

Pulls ~1.2 GB of Ollama models on first start. Everything else is handled.

Claude Code auto-configured

Hooks, skills and rules are wired so agents learn when to store and when to retrieve. No manual setup.

Prefer another path?

Other MCP clients: Point Cursor, Codex, or LangChain at http://localhost:8000/mcp
PyPI: pip install mcp-context-server
Backends: SQLite (zero-config) or PostgreSQL
Deploy: Docker Compose · Kubernetes · Helm
Full documentation ↗

05 — Under the hood

Production-grade

Built for real workflows, not just demos.

Architecture

A thoughtful stack: FastMCP, thread-scoped storage, pluggable everything.

Retrieval

Full-text, semantic, hybrid.

Stemming, ranking, boolean queries; pgvector similarity; Reciprocal Rank Fusion when you want both — plus cross-encoder reranking on top.

FTS5 / tsvectorRRFpgvector

Filtering

16 metadata operators.

Filter by nested JSON paths, tags, date ranges, and indexed fields. GIN indexes on array/object metadata for speed.

$eq $ne $in $nin$gte $lte$contains+9 more

Backends

SQLite or PostgreSQL.

Zero-config for solo work; 10× write throughput with Postgres for teams. Same API, swap with one env var.

WAL modeMVCCSupabase-ready

Embeddings

5 providers, one config.

Ollama (local, default), OpenAI, Azure, HuggingFace, Voyage.

local-first1024-dim default

Summaries

LLM-generated, source-aware.

Every stored entry gets a concise summary via Ollama, OpenAI, or Anthropic.

on by default3 providerssource-aware prompts

Deploy

From laptop to cluster.

Docker Compose for local; Helm chart for Kubernetes; bearer for HTTP transport auth.

stdio · HTTPHelmOAuth

Scale

Chunking, reranking, indexing.

Long docs chunked for semantic search. Results over-fetched, then reranked with ms-marco-MiniLM for precision.

ENABLE_CHUNKINGflashrank

06 — API surface

13 MCP tools

A small, sharp toolbox your agent actually uses.

Core

store_context

Persist a text or multimodal entry to a thread.

Core

search_context

Browse and filter entries by thread, metadata, tags, dates.

Core

get_context_by_ids

Retrieve full, untruncated entries by ID.

Core

update_context

Patch text, metadata, tags, or images in place.

Core

delete_context

Remove stale or superseded entries.

Core

list_threads

Enumerate threads with entry counts and timestamps.

Core

get_statistics

Server health, feature status, and usage metrics.

fts_search_context

Stemming, ranking, boolean queries, cross-encoder rerank.

semantic_search_context

Vector similarity with cross-encoder rerank.

hybrid_search_context

FTS + semantic fused with RRF, cross-encoder rerank.

Batch

store_context_batch

Persist many entries in one call.

Batch

update_context_batch

Bulk patches across entries.

Batch

delete_context_batch

Multi-criteria bulk deletion (IDs, threads, age).

Docs

api-reference.md

Parameters, return shapes, and examples for every tool.

Read the reference ↗

Give your coding
agents memory
that compounds.

Context compaction shouldn’t cost you an afternoon.

Refactor, forgotten.

Refactor, continuous.

From scratch pad to compounding knowledge base.

Keep agents on track across compactions.

Lossless multi-agent handoffs.

Debug with full history.

Cross-project knowledge base.

Dedicated task threads.

“Just remember this.” — a persistent scratch pad that never forgets and is always searchable.

One command. Three platforms. Zero config.

Run the one-liner

Claude Code auto-configured

Prefer another path?

Built for real workflows, not just demos.

A thoughtful stack: FastMCP, thread-scoped storage, pluggable everything.

Full-text, semantic, hybrid.

16 metadata operators.

SQLite or PostgreSQL.

5 providers, one config.

LLM-generated, source-aware.

From laptop to cluster.

Chunking, reranking, indexing.

A small, sharp toolbox your agent actually uses.

Works with anything that speaks MCP.

Stop re-explaining. Start compounding.

Give your codingagents memorythat compounds.

Context compaction shouldn’t cost you an afternoon.

Refactor, forgotten.

Refactor, continuous.

From scratch pad to compounding knowledge base.

Keep agents on track across compactions.

Lossless multi-agent handoffs.

Debug with full history.

Cross-project knowledge base.

Dedicated task threads.

“Just remember this.” — a persistent scratch pad that never forgets and is always searchable.

One command. Three platforms. Zero config.

Run the one-liner

Claude Code auto-configured

Prefer another path?

Built for real workflows, not just demos.

A thoughtful stack: FastMCP, thread-scoped storage, pluggable everything.

Full-text, semantic, hybrid.

16 metadata operators.

SQLite or PostgreSQL.

5 providers, one config.

LLM-generated, source-aware.

From laptop to cluster.

Chunking, reranking, indexing.

A small, sharp toolbox your agent actually uses.

Works with anything that speaks MCP.

Stop re-explaining. Start compounding.

Give your coding
agents memory
that compounds.