1. What is MCP? (And Why It Matters)
The Model Context Protocol (MCP) is Anthropic's standardization of how AI clients communicate with external tools and data sources. Instead of every AI company inventing their own tool integration format, MCP provides a unified protocol that works across Claude, GitHub Copilot CLI, and any other MCP-compatible client.
At its core, MCP is JSON-RPC 2.0 over stdio or Server-Sent Events (SSE). That's it. No custom transport, no proprietary serialization. Just structured JSON messages flowing between a client (like Claude Desktop) and a server (like Memory Spine's MCP endpoint).
Why MCP exists: The tool integration mess
Before MCP, every AI platform had its own way to define tools:
- OpenAI Function Calling — JSON schema with specific format requirements
- Anthropic Tool Use — Similar but incompatible schema differences
- LangChain Tools — Python-centric with multiple inheritance patterns
- Custom Agent Frameworks — Each with proprietary tool definitions
If you built a tool for one platform, porting it to another meant rewriting the interface layer. MCP solves this by standardizing the protocol layer, not just the schema format.
MCP's key insight: The transport layer matters more than the schema. JSON-RPC over stdio means any language can implement an MCP server with ~50 lines of code.
2. MCP Server Anatomy: Transport → Handler → Registry
Every MCP server has the same three-layer architecture, regardless of implementation language:
MCP server architecture: clean separation between transport, handling, and business logic.
Transport Layer: JSON-RPC over stdio
The transport layer handles the physical communication. For stdio-based servers (the most common), this means:
# Client sends to server's stdin:
{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {...}}
# Server responds via stdout:
{"jsonrpc": "2.0", "id": 1, "result": {...}}
The beauty is simplicity. No HTTP servers, no WebSocket management, no connection pooling. The client launches your server as a subprocess, communicates over pipes, and kills it when done. Stateless by design.
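To make the pipe model concrete, here is a minimal sketch of a client driving a stdio server as a subprocess. The inline "server" is a stand-in that echoes an empty result for each request; a real client would launch your actual server script instead.

```python
import json
import subprocess
import sys

# Stand-in server: reads JSON-RPC lines from stdin, answers each with an
# empty result. This mimics the wire format, not a real MCP server.
SERVER_CODE = (
    "import json,sys\n"
    "for line in sys.stdin:\n"
    "    req = json.loads(line)\n"
    "    print(json.dumps({'jsonrpc': '2.0', 'id': req['id'], 'result': {}}), flush=True)\n"
)

# The client launches the server as a subprocess and talks over pipes
proc = subprocess.Popen(
    [sys.executable, "-c", SERVER_CODE],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

# One request in via stdin, one response out via stdout
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()
response = json.loads(proc.stdout.readline())

# Closing stdin ends the server's read loop; the client reaps the process
proc.stdin.close()
proc.wait()
```

Killing the subprocess at any point is safe precisely because the server keeps no connection state between lines.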
Handler Layer: Message parsing and validation
The handler layer parses JSON-RPC messages, validates schemas, and routes to appropriate methods:
class MCPHandler:
    def handle_message(self, message):
        # Parse the JSON-RPC envelope
        request = json.loads(message)
        method = request['method']
        params = request.get('params', {})

        # Route to the matching handler
        if method == 'tools/call':
            return self.call_tool(params)
        elif method == 'tools/list':
            return self.list_tools()
        elif method == 'resources/list':
            return self.list_resources()
        # ...etc
Registry Layer: Tool and resource management
The registry maintains your server's capabilities — tools, resources, and prompts. This is where the business logic lives:
class ToolRegistry:
    def __init__(self):
        self.tools = {}
        self.resources = {}

    def register_tool(self, name, func, schema):
        self.tools[name] = {
            'function': func,
            'schema': schema
        }

    def call_tool(self, name, args):
        tool = self.tools[name]
        return tool['function'](**args)
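Wiring the registry into a working dispatch might look like this (a self-contained sketch that restates the registry above; the `echo` tool is illustrative):

```python
# Minimal registry + dispatch, mirroring the ToolRegistry class above
class ToolRegistry:
    def __init__(self):
        self.tools = {}

    def register_tool(self, name, func, schema):
        self.tools[name] = {'function': func, 'schema': schema}

    def call_tool(self, name, args):
        # Business logic is just a dict lookup plus a call
        return self.tools[name]['function'](**args)

registry = ToolRegistry()
registry.register_tool(
    "echo",
    lambda text: {"content": [{"type": "text", "text": text}]},
    {"type": "object", "properties": {"text": {"type": "string"}},
     "required": ["text"]},
)

result = registry.call_tool("echo", {"text": "hello"})
```

Because tools are plain callables keyed by name, adding a capability never touches the transport or handler layers.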
3. Tool Definition Format & Schema Validation
MCP tools use JSON Schema Draft 2020-12 for parameter validation. The format is similar to OpenAI function calling but with MCP-specific extensions:
{
  "name": "memory_search",
  "description": "Search stored memories by semantic similarity",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query"
      },
      "limit": {
        "type": "integer",
        "default": 10,
        "minimum": 1,
        "maximum": 100
      }
    },
    "required": ["query"],
    "additionalProperties": false
  }
}
Key differences from OpenAI function calling:
- inputSchema vs parameters — MCP uses inputSchema where OpenAI uses parameters
- Stricter validation — MCP servers conventionally set additionalProperties: false to reject unknown arguments
- Better error handling — Schema validation failures return structured error objects
- Tool grouping — MCP supports grouping related tools under a common namespace
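To show what validation against an inputSchema involves, here is a hand-rolled check covering just the constraints used in the memory_search schema above (required, additionalProperties, integer bounds). A production server would use a full JSON Schema validator such as the jsonschema package instead.

```python
# Schema copied from the memory_search example above
MEMORY_SEARCH_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "default": 10, "minimum": 1, "maximum": 100},
    },
    "required": ["query"],
    "additionalProperties": False,
}

def validate_args(args: dict, schema: dict) -> list:
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    props = schema.get("properties", {})
    # required: every listed property must be present
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required property: {name}")
    # additionalProperties: false rejects unknown arguments
    if not schema.get("additionalProperties", True):
        for name in args:
            if name not in props:
                errors.append(f"unexpected property: {name}")
    # per-property type and bounds checks
    for name, value in args.items():
        spec = props.get(name, {})
        if spec.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{name}: expected string")
        if spec.get("type") == "integer":
            if not isinstance(value, int):
                errors.append(f"{name}: expected integer")
            else:
                if "minimum" in spec and value < spec["minimum"]:
                    errors.append(f"{name}: below minimum")
                if "maximum" in spec and value > spec["maximum"]:
                    errors.append(f"{name}: above maximum")
    return errors
```

Returning a list of errors rather than raising on the first one maps naturally onto the structured error objects mentioned above.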
Advanced schema patterns
For complex tools, MCP supports advanced JSON Schema features:
{
  "name": "memory_batch_store",
  "inputSchema": {
    "type": "object",
    "properties": {
      "memories": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "content": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}},
            "metadata": {"type": "object"}
          },
          "required": ["content"]
        }
      }
    },
    "required": ["memories"]
  }
}
Memory Spine processes 50K+ tool calls daily. JSON Schema validation adds ~0.2ms per call — negligible overhead for the safety it provides. We've caught 847 malformed tool calls this month that would have caused runtime errors.
4. Resource & Prompt Primitives
Beyond tools, MCP defines two other primitives:
Resources: Static data access
Resources provide read-only access to data sources. Unlike tools (which are function calls), resources are static references:
# List available resources (method: resources/list)
{"jsonrpc": "2.0", "id": 1, "method": "resources/list"}
> {"resources": [{"uri": "memory://search/recent", "name": "Recent Memories"}]}

# Read a specific resource (method: resources/read)
{"jsonrpc": "2.0", "id": 2, "method": "resources/read", "params": {"uri": "memory://search/recent"}}
> {"contents": [{"type": "text", "text": "Last 10 memories..."}]}
Think of resources as MCP's equivalent of file paths or URLs — ways for clients to reference static data without making function calls.
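Server-side, the two resource methods can be handled with little more than a dict keyed by URI. A sketch, using the example URI above (the in-memory table is illustrative):

```python
# Hypothetical in-memory resource table keyed by URI
RESOURCES = {
    "memory://search/recent": {
        "name": "Recent Memories",
        "text": "Last 10 memories...",
    }
}

def list_resources() -> dict:
    """Handler for resources/list: advertise URIs and display names."""
    return {
        "resources": [
            {"uri": uri, "entry": entry["name"]} and {"uri": uri, "name": entry["name"]}
            for uri, entry in RESOURCES.items()
        ]
    }

def read_resource(params: dict) -> dict:
    """Handler for resources/read: resolve a URI to its contents."""
    entry = RESOURCES[params["uri"]]
    return {"contents": [{"type": "text", "text": entry["text"]}]}
```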
Prompts: Reusable prompt templates
Prompts are parameterized templates that clients can invoke:
{
  "name": "analyze_codebase",
  "description": "Generate codebase analysis prompt with context",
  "arguments": [
    {"name": "repo_path", "description": "Path to repository"}
  ]
}
# Client calls the prompt (method: prompts/get)
{"jsonrpc": "2.0", "id": 3, "method": "prompts/get", "params": {"name": "analyze_codebase", "arguments": {"repo_path": "/app"}}}
> {
    "description": "Analyze codebase structure and patterns",
    "messages": [
      {"role": "user", "content": "Analyze the codebase at /app..."}
    ]
  }
Prompts are useful for complex, reusable prompt engineering that you want to centralize server-side rather than hardcode in clients.
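A prompts/get handler is mostly string templating on the server. A sketch for the analyze_codebase template above (the exact wording of the rendered message is illustrative):

```python
def get_prompt(params: dict) -> dict:
    """Sketch of a prompts/get handler for the analyze_codebase template."""
    if params["name"] != "analyze_codebase":
        raise ValueError(f"Unknown prompt: {params['name']}")
    repo_path = params["arguments"]["repo_path"]
    # Render the template server-side so every client gets the same prompt
    return {
        "description": "Analyze codebase structure and patterns",
        "messages": [
            {
                "role": "user",
                "content": f"Analyze the codebase at {repo_path}. "
                           "Summarize structure, key modules, and patterns.",
            }
        ],
    }

rendered = get_prompt({"name": "analyze_codebase",
                       "arguments": {"repo_path": "/app"}})
```

Changing the template now means redeploying one server, not patching every client that embeds the prompt.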
5. Building a Custom MCP Server in Python
Let's build a minimal but complete MCP server. This example implements a simple key-value store with three tools:
#!/usr/bin/env python3
import json
import sys
from typing import Dict, Any

class SimpleMCPServer:
    def __init__(self):
        self.storage: Dict[str, Any] = {}

    def handle_stdin(self):
        """Main server loop reading from stdin"""
        for line in sys.stdin:
            request = {}
            try:
                request = json.loads(line.strip())
                response = self.handle_request(request)
                print(json.dumps(response), flush=True)
            except Exception as e:
                # request stays {} if parsing failed, so .get("id") is safe
                error_response = {
                    "jsonrpc": "2.0",
                    "id": request.get("id"),
                    "error": {"code": -32603, "message": str(e)}
                }
                print(json.dumps(error_response), flush=True)

    def handle_request(self, request: dict) -> dict:
        """Route JSON-RPC requests to handlers"""
        method = request["method"]
        params = request.get("params", {})
        request_id = request.get("id")

        if method == "initialize":
            result = self.initialize(params)
        elif method == "tools/list":
            result = self.list_tools()
        elif method == "tools/call":
            result = self.call_tool(params)
        else:
            raise Exception(f"Unknown method: {method}")

        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "result": result
        }

    def initialize(self, params: dict) -> dict:
        """MCP initialization handshake"""
        return {
            "protocolVersion": "2024-11-05",
            "capabilities": {
                "tools": {}
            },
            "serverInfo": {
                "name": "simple-kv-server",
                "version": "1.0.0"
            }
        }

    def list_tools(self) -> dict:
        """Return available tools"""
        return {
            "tools": [
                {
                    "name": "kv_set",
                    "description": "Store a key-value pair",
                    "inputSchema": {
                        "type": "object",
                        "properties": {
                            "key": {"type": "string"},
                            "value": {"type": "string"}
                        },
                        "required": ["key", "value"]
                    }
                },
                {
                    "name": "kv_get",
                    "description": "Retrieve value for a key",
                    "inputSchema": {
                        "type": "object",
                        "properties": {
                            "key": {"type": "string"}
                        },
                        "required": ["key"]
                    }
                },
                {
                    "name": "kv_list",
                    "description": "List all stored keys",
                    "inputSchema": {
                        "type": "object",
                        "properties": {}
                    }
                }
            ]
        }

    def call_tool(self, params: dict) -> dict:
        """Execute a tool call"""
        name = params["name"]
        args = params.get("arguments", {})

        if name == "kv_set":
            self.storage[args["key"]] = args["value"]
            return {
                "content": [
                    {"type": "text", "text": f"Stored {args['key']} = {args['value']}"}
                ]
            }
        elif name == "kv_get":
            key = args["key"]
            if key in self.storage:
                return {
                    "content": [
                        {"type": "text", "text": f"{key} = {self.storage[key]}"}
                    ]
                }
            else:
                return {
                    "content": [
                        {"type": "text", "text": f"Key '{key}' not found"}
                    ]
                }
        elif name == "kv_list":
            keys = list(self.storage.keys())
            return {
                "content": [
                    {"type": "text", "text": f"Stored keys: {', '.join(keys) if keys else 'none'}"}
                ]
            }
        else:
            raise Exception(f"Unknown tool: {name}")

if __name__ == "__main__":
    server = SimpleMCPServer()
    server.handle_stdin()
Save this as simple-mcp-server.py, make it executable, and you have a working MCP server. The key insights:
- Synchronous by default — Each request blocks until complete
- Stateless design — Server can be killed and restarted anytime
- Error handling — Always return valid JSON-RPC error responses
- Tool results — Must return a content array whose items carry a type and the data
6. Connecting to Claude, Copilot & Other Clients
To connect your MCP server to AI clients, you need to register it in the client's configuration:
Claude Desktop configuration
Edit %APPDATA%\Claude\claude_desktop_config.json (Windows) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
{
  "mcpServers": {
    "simple-kv": {
      "command": "python",
      "args": ["/path/to/simple-mcp-server.py"]
    }
  }
}
GitHub Copilot CLI configuration
For the GitHub Copilot CLI, edit ~/.copilot/mcp-config.json:
{
  "mcpServers": {
    "simple-kv": {
      "url": "stdio://python /path/to/simple-mcp-server.py"
    }
  }
}
Environment and security considerations
When deploying MCP servers in production:
- Use absolute paths — Relative paths fail when clients change working directories
- Handle environment variables — Pass secrets via environment, not command args
- Resource limits — MCP servers inherit resource limits from the client process
- Logging — Write logs to stderr (stdout is reserved for JSON-RPC)
# Production MCP server configuration
{
  "mcpServers": {
    "memory-spine": {
      "command": "/usr/bin/python3",
      "args": ["/opt/memory-spine/mcp-server.py"],
      "env": {
        "MEMORY_SPINE_API_KEY": "${MEMORY_SPINE_API_KEY}",
        "LOG_LEVEL": "INFO"
      }
    }
  }
}
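Server-side, the matching startup check might look like this. The variable names follow the config above; the function itself is an illustrative sketch, not Memory Spine's actual code.

```python
import os

def load_config(env=os.environ) -> dict:
    """Read settings from the environment; fail fast on missing secrets."""
    api_key = env.get("MEMORY_SPINE_API_KEY")
    if api_key is None:
        # Refusing to start beats failing on the first authenticated call
        raise RuntimeError("MEMORY_SPINE_API_KEY is not set")
    return {
        "api_key": api_key,
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }
```

Passing `env` as a parameter keeps the function testable without mutating the real process environment.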
7. Debugging with MCP Inspector
Anthropic provides the MCP Inspector — a web-based debugging tool for MCP servers. It's invaluable during development:
# Install MCP Inspector
npm install -g @modelcontextprotocol/inspector
# Debug your server
mcp-inspector python /path/to/simple-mcp-server.py
The inspector opens a web UI at localhost:5173 where you can:
- Test tool calls — Interactive forms for each tool with schema validation
- View logs — Real-time stderr output from your server
- Inspect messages — Full JSON-RPC request/response pairs
- Performance profiling — Tool execution times and memory usage
MCP Inspector runs your server with full permissions in the current environment. Don't use it with production credentials or in shared environments — it's a development tool only.
Advanced debugging techniques
For complex servers, add structured logging to stderr:
import logging
import sys
import time

# Configure logging to stderr (stdout is reserved for JSON-RPC)
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    stream=sys.stderr
)
logger = logging.getLogger(__name__)

def call_tool(self, params: dict) -> dict:
    logger.info(f"Tool call: {params['name']} with args {params.get('arguments', {})}")
    start_time = time.time()
    try:
        result = self._execute_tool(params)
        duration = time.time() - start_time
        logger.info(f"Tool {params['name']} completed in {duration:.3f}s")
        return result
    except Exception as e:
        logger.error(f"Tool {params['name']} failed: {e}")
        raise
8. Real Example: Memory Spine's 32-Tool MCP Server
Memory Spine's MCP server is a production example handling 50K+ daily requests across 32 tools. Here's the high-level architecture:
32 tools across 6 categories: memory operations, search, analytics, knowledge graphs, conversation tracking, and agent handoff. Average response time: 47ms. 99.97% uptime over the last 6 months.
Tool categories in production
| Category | Tools | Usage % | Avg Latency |
|---|---|---|---|
| Memory Ops | 8 | 45% | 23ms |
| Search | 6 | 32% | 78ms |
| Analytics | 5 | 12% | 156ms |
| Knowledge Graph | 7 | 8% | 91ms |
| Conversation | 4 | 2% | 34ms |
| Agent Handoff | 2 | 1% | 67ms |
Key architectural decisions
Async tool execution — Search and analytics tools run async to prevent blocking:
async def call_tool_async(self, name: str, args: dict) -> dict:
    if name in self.async_tools:
        return await self.execute_async_tool(name, args)
    else:
        return self.execute_sync_tool(name, args)
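A runnable version of that routing, with stand-in tools (the tool names and the async_tools set are illustrative):

```python
import asyncio

ASYNC_TOOLS = {"memory_search"}  # hypothetical set of async tool names

async def execute_async_tool(name: str, args: dict) -> dict:
    await asyncio.sleep(0)  # stand-in for real async I/O (DB, vector search)
    return {"tool": name, "mode": "async"}

def execute_sync_tool(name: str, args: dict) -> dict:
    return {"tool": name, "mode": "sync"}

async def call_tool(name: str, args: dict) -> dict:
    # Route by membership in the async set; sync tools run inline
    if name in ASYNC_TOOLS:
        return await execute_async_tool(name, args)
    return execute_sync_tool(name, args)

result = asyncio.run(call_tool("memory_search", {}))
```

Keeping cheap tools synchronous avoids paying event-loop overhead where it buys nothing.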
Connection pooling — Database connections are pooled and reused across requests:
class MemorySpineMCPServer:
    def __init__(self):
        self.db_pool = None

    async def start(self):
        # create_pool is a coroutine and must be awaited inside a running loop
        self.db_pool = await asyncpg.create_pool(
            DATABASE_URL, min_size=5, max_size=20
        )
Tool result caching — Expensive operations like analytics are cached for 5 minutes:
@lru_cache(maxsize=128)
def memory_analytics(self, cache_key: str) -> dict:
    # cache_key encodes a 5-minute time bucket, so entries effectively
    # expire when the bucket rolls over (lru_cache itself has no TTL)
    return self._compute_analytics()
Graceful degradation — When dependencies fail, tools return partial results rather than errors:
def memory_search(self, query: str, limit: int = 10) -> dict:
    fallback_used = False
    try:
        results = self.vector_search(query, limit)
    except VectorDBException:
        # Fall back to text search when the vector DB is unavailable
        results = self.text_search(query, limit)
        fallback_used = True
    return {"memories": results, "fallback_used": fallback_used}
Build Your Own MCP Server
Ready to integrate your tools with the MCP ecosystem? Our starter template gets you running in minutes.
Download MCP Template →