
MCP Server Architecture Explained

Building a production MCP server for Memory Spine taught me the intricacies of JSON-RPC over stdio, tool registries, and resource management. Here's what 32 MCP tools and 50K+ daily requests revealed about scalable server architecture.


1. What is MCP? (And Why It Matters)

The Model Context Protocol (MCP) is Anthropic's standardization of how AI clients communicate with external tools and data sources. Instead of every AI company inventing their own tool integration format, MCP provides a unified protocol that works across Claude, GitHub Copilot CLI, and any other MCP-compatible client.

At its core, MCP is JSON-RPC 2.0 over stdio or Server-Sent Events (SSE). That's it. No custom transport, no proprietary serialization. Just structured JSON messages flowing between a client (like Claude Desktop) and a server (like Memory Spine's MCP endpoint).

Why MCP exists: The tool integration mess

Before MCP, every AI platform had its own way to define tools: OpenAI function calling, plugin manifests, and assorted one-off JSON formats.

If you built a tool for one platform, porting it to another meant rewriting the interface layer. MCP solves this by standardizing the protocol layer, not just the schema format.

MCP's key insight: The transport layer matters more than the schema. JSON-RPC over stdio means any language can implement an MCP server with ~50 lines of code.

2. MCP Server Anatomy: Transport > Handler > Registry

Every MCP server has the same three-layer architecture, regardless of implementation language:

Transport Layer: JSON-RPC over stdio

The transport layer handles the physical communication. For stdio-based servers (the most common), this means:

# Client sends to server's stdin:
{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {...}}

# Server responds via stdout:
{"jsonrpc": "2.0", "id": 1, "result": {...}}

The beauty is simplicity. No HTTP servers, no WebSocket management, no connection pooling. The client launches your server as a subprocess, communicates over pipes, and kills it when done. Stateless by design.
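That round trip can be driven entirely from the standard library. Here is a minimal client-side sketch; the inline SERVER_CODE is a toy stand-in for a real server script so the snippet runs on its own:

```python
import json
import subprocess
import sys

# Toy stand-in for a real MCP server script: reads one JSON-RPC
# request from stdin and answers an empty tool list on stdout.
SERVER_CODE = """
import json, sys
req = json.loads(sys.stdin.readline())
print(json.dumps({"jsonrpc": "2.0", "id": req["id"],
                  "result": {"tools": []}}), flush=True)
"""

# Launch the server as a subprocess with pipes for stdin/stdout
proc = subprocess.Popen(
    [sys.executable, "-c", SERVER_CODE],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

# Send one JSON-RPC request per line on the server's stdin
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()

# Read the matching response line from the server's stdout
response = json.loads(proc.stdout.readline())
print(response["result"])  # {'tools': []}

proc.stdin.close()
proc.wait()
```

This is exactly what Claude Desktop does on your behalf: spawn the process, write request lines, read response lines, and terminate the subprocess when the session ends.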

Handler Layer: Message parsing and validation

The handler layer parses JSON-RPC messages, validates schemas, and routes to appropriate methods:

import json

class MCPHandler:
    def handle_message(self, message: str) -> dict:
        # Parse the raw JSON-RPC message
        request = json.loads(message)
        method = request['method']
        params = request.get('params', {})

        # Route to the matching handler
        if method == 'tools/call':
            return self.call_tool(params)
        elif method == 'tools/list':
            return self.list_tools()
        elif method == 'resources/list':
            return self.list_resources()
        # ...etc
        raise ValueError(f"Unknown method: {method}")

Registry Layer: Tool and resource management

The registry maintains your server's capabilities — tools, resources, and prompts. This is where the business logic lives:

class ToolRegistry:
    def __init__(self):
        self.tools = {}
        self.resources = {}

    def register_tool(self, name, func, schema):
        self.tools[name] = {
            'function': func,
            'schema': schema
        }

    def call_tool(self, name, args):
        if name not in self.tools:
            raise KeyError(f"Unknown tool: {name}")
        tool = self.tools[name]
        return tool['function'](**args)
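A quick usage sketch of such a registry (the class is repeated so the snippet runs standalone, and the echo tool is a hypothetical example):

```python
class ToolRegistry:
    def __init__(self):
        self.tools = {}

    def register_tool(self, name, func, schema):
        self.tools[name] = {'function': func, 'schema': schema}

    def call_tool(self, name, args):
        if name not in self.tools:
            raise KeyError(f"Unknown tool: {name}")
        return self.tools[name]['function'](**args)

registry = ToolRegistry()

# Register a hypothetical echo tool along with its JSON Schema
registry.register_tool(
    'echo',
    lambda text: {'content': [{'type': 'text', 'text': text}]},
    {'type': 'object',
     'properties': {'text': {'type': 'string'}},
     'required': ['text']},
)

result = registry.call_tool('echo', {'text': 'hello'})
print(result['content'][0]['text'])  # hello
```

Keeping the function and its schema side by side in one entry means `tools/list` and `tools/call` can both be served from the same data structure.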

3. Tool Definition Format & Schema Validation

MCP tools use JSON Schema Draft 2020-12 for parameter validation. The format is similar to OpenAI function calling but with MCP-specific extensions:

{
  "name": "memory_search",
  "description": "Search stored memories by semantic similarity",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query"
      },
      "limit": {
        "type": "integer", 
        "default": 10,
        "minimum": 1,
        "maximum": 100
      }
    },
    "required": ["query"],
    "additionalProperties": false
  }
}

Key differences from OpenAI function calling:

- The parameter schema lives under inputSchema rather than parameters
- Tool results are structured content blocks ({"type": "text", ...}) rather than plain strings
- Tools are discovered at runtime via tools/list instead of being declared with every request

Advanced schema patterns

For complex tools, MCP supports advanced JSON Schema features:

{
  "name": "memory_batch_store",
  "inputSchema": {
    "type": "object", 
    "properties": {
      "memories": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "content": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}},
            "metadata": {"type": "object"}
          },
          "required": ["content"]
        }
      }
    },
    "required": ["memories"]
  }
}

🔧 Schema Validation Performance

Memory Spine processes 50K+ tool calls daily. JSON Schema validation adds ~0.2ms per call — negligible overhead for the safety it provides. We've caught 847 malformed tool calls this month that would have caused runtime errors.
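To make that validation step concrete, here is a hand-rolled checker covering just the required, type, minimum, and maximum keywords. It's a simplified stand-in; a production server would delegate this to a full JSON Schema validator:

```python
def validate_args(schema: dict, args: dict) -> None:
    """Check tool arguments against a small subset of JSON Schema:
    required keys, basic types, integer bounds, additionalProperties."""
    type_map = {'string': str, 'integer': int, 'object': dict, 'array': list}

    # Every required key must be present
    for key in schema.get('required', []):
        if key not in args:
            raise ValueError(f"Missing required argument: {key}")

    for key, value in args.items():
        prop = schema.get('properties', {}).get(key)
        if prop is None:
            # Reject unknown keys when additionalProperties is false
            if not schema.get('additionalProperties', True):
                raise ValueError(f"Unexpected argument: {key}")
            continue
        expected = type_map.get(prop.get('type'))
        if expected and not isinstance(value, expected):
            raise ValueError(f"{key} must be {prop['type']}")
        if 'minimum' in prop and value < prop['minimum']:
            raise ValueError(f"{key} below minimum {prop['minimum']}")
        if 'maximum' in prop and value > prop['maximum']:
            raise ValueError(f"{key} above maximum {prop['maximum']}")

# Validate a call against the memory_search schema from above
schema = {
    'type': 'object',
    'properties': {
        'query': {'type': 'string'},
        'limit': {'type': 'integer', 'minimum': 1, 'maximum': 100},
    },
    'required': ['query'],
    'additionalProperties': False,
}
validate_args(schema, {'query': 'deploy notes', 'limit': 10})  # passes
```

Running this before dispatching to the tool function is what turns a malformed call into a clean JSON-RPC error instead of a runtime exception.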

4. Resource & Prompt Primitives

Beyond tools, MCP defines two other primitives:

Resources: Static data access

Resources provide read-only access to data sources. Unlike tools (which are function calls), resources are addressed by URI and simply read. Since MCP is JSON-RPC, both operations are ordinary method calls:

# Client requests the resource list:
{"jsonrpc": "2.0", "id": 2, "method": "resources/list"}
> {"resources": [{"uri": "memory://search/recent", "name": "Recent Memories"}]}

# Client reads a specific resource:
{"jsonrpc": "2.0", "id": 3, "method": "resources/read", "params": {"uri": "memory://search/recent"}}
> {"contents": [{"uri": "memory://search/recent", "text": "Last 10 memories..."}]}

Think of resources as MCP's equivalent of file paths or URLs — ways for clients to reference static data without making function calls.
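Server-side, the two resource methods can be sketched like this, with the memory:// URI from above serving as a hypothetical example backed by an in-memory table:

```python
# Hypothetical in-memory resource table keyed by URI
RESOURCES = {
    'memory://search/recent': {
        'name': 'Recent Memories',
        'text': 'Last 10 memories...',
    }
}

def list_resources() -> dict:
    """Handle resources/list: advertise the available URIs."""
    return {
        'resources': [
            {'uri': uri, 'name': entry['name']}
            for uri, entry in RESOURCES.items()
        ]
    }

def read_resource(params: dict) -> dict:
    """Handle resources/read: return the content for one URI."""
    uri = params['uri']
    if uri not in RESOURCES:
        raise ValueError(f"Unknown resource: {uri}")
    return {'contents': [{'uri': uri, 'text': RESOURCES[uri]['text']}]}

print(read_resource({'uri': 'memory://search/recent'}))
```

In a real server the table would be populated dynamically, but the shape of the two handlers stays the same.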

Prompts: Reusable prompt templates

Prompts are parameterized templates that clients can invoke:

{
  "name": "analyze_codebase",
  "description": "Generate codebase analysis prompt with context",
  "arguments": [
    {"name": "repo_path", "description": "Path to repository"}
  ]
}

# Client invokes the prompt:
{"jsonrpc": "2.0", "id": 4, "method": "prompts/get", "params": {"name": "analyze_codebase", "arguments": {"repo_path": "/app"}}}
> {
  "description": "Analyze codebase structure and patterns", 
  "messages": [
    {"role": "user", "content": "Analyze the codebase at /app..."}
  ]
}

Prompts are useful for complex, reusable prompt engineering that you want to centralize server-side rather than hardcode in clients.
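A sketch of the matching prompts/get handler, assuming a hypothetical PROMPT_TEMPLATES table mapping prompt names to (description, format string) pairs:

```python
# Hypothetical template registry: prompt name -> (description, template)
PROMPT_TEMPLATES = {
    'analyze_codebase': (
        'Analyze codebase structure and patterns',
        'Analyze the codebase at {repo_path}. '
        'Summarize its structure and patterns.',
    )
}

def get_prompt(params: dict) -> dict:
    """Handle prompts/get: expand a template with the given arguments."""
    name = params['name']
    args = params.get('arguments', {})
    description, template = PROMPT_TEMPLATES[name]
    return {
        'description': description,
        'messages': [
            {'role': 'user', 'content': template.format(**args)}
        ]
    }

result = get_prompt({'name': 'analyze_codebase',
                     'arguments': {'repo_path': '/app'}})
print(result['messages'][0]['content'])
```

Because the template lives server-side, updating the prompt text is a server deploy, not a change to every client.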

5. Building a Custom MCP Server in Python

Let's build a minimal but complete MCP server. This example implements a simple key-value store with three tools:

#!/usr/bin/env python3
import json
import sys
from typing import Dict, Any

class SimpleMCPServer:
    def __init__(self):
        self.storage: Dict[str, Any] = {}
        
    def handle_stdin(self):
        """Main server loop: one JSON-RPC message per line on stdin"""
        for line in sys.stdin:
            request = {}
            try:
                request = json.loads(line.strip())
                response = self.handle_request(request)
                if response is not None:
                    print(json.dumps(response), flush=True)
            except Exception as e:
                error_response = {
                    "jsonrpc": "2.0",
                    "id": request.get("id"),
                    "error": {"code": -32603, "message": str(e)}
                }
                print(json.dumps(error_response), flush=True)

    def handle_request(self, request: dict) -> dict:
        """Route JSON-RPC requests to handlers (returns None for notifications)"""
        method = request["method"]
        params = request.get("params", {})
        request_id = request.get("id")

        # Notifications carry no id and must not receive a response
        if request_id is None:
            return None

        if method == "initialize":
            result = self.initialize(params)
        elif method == "tools/list":
            result = self.list_tools()
        elif method == "tools/call":
            result = self.call_tool(params)
        else:
            raise Exception(f"Unknown method: {method}")

        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "result": result
        }
        
    def initialize(self, params: dict) -> dict:
        """MCP initialization handshake"""
        return {
            "protocolVersion": "2024-11-05",
            "capabilities": {
                "tools": {}
            },
            "serverInfo": {
                "name": "simple-kv-server",
                "version": "1.0.0"
            }
        }
        
    def list_tools(self) -> dict:
        """Return available tools"""
        return {
            "tools": [
                {
                    "name": "kv_set",
                    "description": "Store a key-value pair",
                    "inputSchema": {
                        "type": "object",
                        "properties": {
                            "key": {"type": "string"},
                            "value": {"type": "string"}
                        },
                        "required": ["key", "value"]
                    }
                },
                {
                    "name": "kv_get",
                    "description": "Retrieve value for a key",
                    "inputSchema": {
                        "type": "object",
                        "properties": {
                            "key": {"type": "string"}
                        },
                        "required": ["key"]
                    }
                },
                {
                    "name": "kv_list",
                    "description": "List all stored keys",
                    "inputSchema": {
                        "type": "object",
                        "properties": {}
                    }
                }
            ]
        }
        
    def call_tool(self, params: dict) -> dict:
        """Execute a tool call"""
        name = params["name"]
        args = params.get("arguments", {})
        
        if name == "kv_set":
            self.storage[args["key"]] = args["value"]
            return {
                "content": [
                    {
                        "type": "text", 
                        "text": f"Stored {args['key']} = {args['value']}"
                    }
                ]
            }
        elif name == "kv_get":
            key = args["key"]
            if key in self.storage:
                return {
                    "content": [
                        {
                            "type": "text",
                            "text": f"{key} = {self.storage[key]}"
                        }
                    ]
                }
            else:
                return {
                    "content": [
                        {
                            "type": "text",
                            "text": f"Key '{key}' not found"
                        }
                    ]
                }
        elif name == "kv_list":
            keys = list(self.storage.keys())
            return {
                "content": [
                    {
                        "type": "text",
                        "text": f"Stored keys: {', '.join(keys) if keys else 'none'}"
                    }
                ]
            }
        else:
            raise Exception(f"Unknown tool: {name}")

if __name__ == "__main__":
    server = SimpleMCPServer()
    server.handle_stdin()

Save this as simple-mcp-server.py, make it executable, and you have a working MCP server. The key insights: stdout carries only JSON-RPC (anything else belongs on stderr), every response is flushed immediately so the client isn't left waiting on a buffered pipe, notifications (messages without an id) get no reply, and all state lives in process memory for the lifetime of the subprocess.

6. Connecting to Claude, Copilot & Other Clients

To connect your MCP server to AI clients, you need to register it in the client's configuration:

Claude Desktop configuration

Edit %APPDATA%\Claude\claude_desktop_config.json (Windows) or ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):

{
  "mcpServers": {
    "simple-kv": {
      "command": "python",
      "args": ["/path/to/simple-mcp-server.py"]
    }
  }
}

GitHub Copilot CLI configuration

For the GitHub Copilot CLI, edit ~/.copilot/mcp-config.json:

{
  "mcpServers": {
    "simple-kv": {
      "url": "stdio://python /path/to/simple-mcp-server.py"
    }
  }
}

Environment and security considerations

When deploying MCP servers in production, use absolute paths for both the interpreter and the script, inject secrets through environment variables rather than command-line arguments (arguments are visible in the process list), and keep diagnostics on stderr:

# Production MCP server configuration
{
  "mcpServers": {
    "memory-spine": {
      "command": "/usr/bin/python3",
      "args": ["/opt/memory-spine/mcp-server.py"],
      "env": {
        "MEMORY_SPINE_API_KEY": "${MEMORY_SPINE_API_KEY}",
        "LOG_LEVEL": "INFO"
      }
    }
  }
}
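On the server side, consuming that configuration might look like the sketch below, which fails fast when the API key is missing. The variable names match the config above; load_config itself is a hypothetical helper:

```python
import os
import sys

def load_config() -> dict:
    """Read server configuration from the environment, failing fast
    on missing secrets rather than erroring mid-request."""
    api_key = os.environ.get("MEMORY_SPINE_API_KEY")
    if not api_key:
        # Report on stderr, never stdout: stdout is reserved for JSON-RPC
        print("MEMORY_SPINE_API_KEY is not set", file=sys.stderr)
        sys.exit(1)
    return {
        "api_key": api_key,
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
    }
```

Failing at startup means the client sees a clean launch error instead of a server that dies halfway through its first tool call.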

7. Debugging with MCP Inspector

Anthropic provides the MCP Inspector — a web-based debugging tool for MCP servers. It's invaluable during development:

# Run the MCP Inspector against your server (requires Node.js)
npx @modelcontextprotocol/inspector python /path/to/simple-mcp-server.py

The inspector opens a web UI at localhost:5173 where you can browse the tool list, invoke tools with hand-crafted arguments, read resources and prompts, and watch the raw JSON-RPC traffic in both directions.

⚠️ Inspector Security

MCP Inspector runs your server with full permissions in the current environment. Don't use it with production credentials or in shared environments — it's a development tool only.

Advanced debugging techniques

For complex servers, add structured logging to stderr:

import logging
import sys
import time

# Configure logging to stderr (stdout is for JSON-RPC)
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    stream=sys.stderr
)

logger = logging.getLogger(__name__)

def call_tool(self, params: dict) -> dict:
    logger.info(f"Tool call: {params['name']} with args {params.get('arguments', {})}")
    start_time = time.time()
    
    try:
        result = self._execute_tool(params)
        duration = time.time() - start_time
        logger.info(f"Tool {params['name']} completed in {duration:.3f}s")
        return result
    except Exception as e:
        logger.error(f"Tool {params['name']} failed: {e}")
        raise

8. Real Example: Memory Spine's 32-Tool MCP Server

Memory Spine's MCP server is a production example handling 50K+ daily requests across 32 tools. Here's the high-level architecture:

📊 Memory Spine MCP Stats

32 tools across 6 categories: memory operations, search, analytics, knowledge graphs, conversation tracking, and agent handoff. Average response time: 47ms. 99.97% uptime over the last 6 months.

Tool categories in production

Category          Tools   Usage %   Avg Latency
Memory Ops          8       45%        23ms
Search              6       32%        78ms
Analytics           5       12%       156ms
Knowledge Graph     7        8%        91ms
Conversation        4        2%        34ms
Agent Handoff       2        1%        67ms

Key architectural decisions

Async tool execution — Search and analytics tools run async to prevent blocking:

async def call_tool_async(self, name: str, args: dict) -> dict:
    if name in self.async_tools:
        return await self.execute_async_tool(name, args)
    else:
        return self.execute_sync_tool(name, args)

Connection pooling — Database connections are pooled and reused across requests:

class MemorySpineMCPServer:
    def __init__(self):
        self.db_pool = None

    async def start(self):
        # create_pool must be awaited inside a running event loop
        self.db_pool = await asyncpg.create_pool(
            DATABASE_URL, min_size=5, max_size=20
        )

Tool result caching — Expensive operations like analytics are cached for 5 minutes:

def memory_analytics(self, cache_key: str) -> dict:
    cached = self._analytics_cache.get(cache_key)
    if cached and time.time() < cached[0]:
        return cached[1]
    result = self._compute_analytics()  # expensive analytics computation
    self._analytics_cache[cache_key] = (time.time() + 300, result)  # 5-min TTL
    return result

Graceful degradation — When dependencies fail, tools return partial results rather than errors:

def memory_search(self, query: str, limit: int = 10) -> dict:
    fallback_used = False
    try:
        results = self.vector_search(query, limit)
    except VectorDBException:
        # Fallback to text search
        results = self.text_search(query, limit)
        fallback_used = True

    return {"memories": results, "fallback_used": fallback_used}

Build Your Own MCP Server

Ready to integrate your tools with the MCP ecosystem? Our starter template gets you running in minutes.

Download MCP Template →