The hands-on developer guide to building, testing, publishing, securing, and distributing your first MCP server in 2026: from zero to production in one article.
The MCP Python SDK is downloaded 5.9 million times per day. The TypeScript SDK has 43,888 dependent npm packages. Combined, the protocol hit 97 million monthly SDK downloads in March 2026, a 970x increase since Anthropic released the spec in November 2024 - Digital Applied. React took three years to reach 100 million monthly downloads. MCP did it in sixteen months.
The protocol is now governed by the Agentic AI Foundation under the Linux Foundation, with AWS, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, and OpenAI as platinum members - Linux Foundation. Over 544 MCP clients exist across editors, chat applications, and agent platforms - PulseMCP. If you build an MCP server, it works with Claude, ChatGPT, Cursor, VS Code Copilot, Windsurf, Zed, Replit, and hundreds of other tools without any per-client integration work.
This guide walks you through the entire journey: understanding the protocol, writing your first server, testing it locally, publishing it, listing it for discovery, and securing it for production. Every code example runs. Every configuration is copy-pasteable. By the end, you will have a working MCP server that AI agents can discover and use.
Written by Yuma Heymans (@yumahey), who builds agent infrastructure at O-mega.ai and has published MCP servers as part of the Suprsonic unified agent API platform. For a broader market overview of the MCP ecosystem, see our 50 best MCP servers guide.
Contents
- What MCP Actually Is (The 2-Minute Version)
- The Architecture: Hosts, Clients, Servers, and Transports
- The Three Primitives: Tools, Resources, and Prompts
- Building Your First MCP Server (Python)
- Building Your First MCP Server (TypeScript)
- Testing Locally with Claude Desktop, Cursor, and VS Code
- Testing with MCP Inspector
- Transport Deep Dive: stdio vs Streamable HTTP
- Authentication: OAuth 2.1 for Remote Servers
- Error Handling That Helps the AI Recover
- Security: The OWASP MCP Top 10 and How to Avoid Them
- Publishing to npm, PyPI, and the Official Registry
- Listing for Discovery: Where to Submit
- Advanced Patterns: Dynamic Tools, Streaming, Multi-Server
- Monetization: Getting Paid for Your MCP Server
- Performance: Which Language to Choose
1. What MCP Actually Is (The 2-Minute Version)
The Model Context Protocol is a standard for connecting AI applications to external tools and data. It defines a communication protocol (based on JSON-RPC 2.0) that lets any AI client discover what tools a server offers, call those tools with structured arguments, and receive structured results back. The analogy is the Language Server Protocol (LSP) that standardized how code editors connect to language tooling: just as LSP made it so any editor could get autocomplete for any language, MCP makes it so any AI model can use any tool - Wikipedia.
Before MCP, connecting an AI agent to a tool (say, GitHub) meant writing custom integration code for every client that wanted to use it. Claude needed one integration, ChatGPT needed another, Cursor needed a third. With MCP, you build the GitHub integration once as an MCP server, and every MCP-compatible client can use it. Build once, work everywhere.
The spec has evolved rapidly since launch. The current version is 2025-11-25, which added experimental Tasks (async execution), an extensions framework, and standardized OAuth scope names. The major earlier updates added OAuth 2.1 (March 2025), Streamable HTTP transport (March 2025), structured tool output (June 2025), and elicitation (June 2025) - MCP Changelog.
The practical implication for you as a builder: if you publish an MCP server today, it works with every major AI application without modification. The protocol handles the compatibility. You just build the tool.
Why MCP Won (and What It Replaced)
Before MCP, there were three competing approaches to connecting AI models to tools.
Function calling embeds tool definitions directly in the LLM API request. You send the model a JSON Schema describing your tool, the model decides whether to call it, and your application code executes the call. This works but is tightly coupled: the tool definitions, execution logic, and credentials all live inside your application. If five different AI clients want to use the same tool, you write five different integrations.
Custom plugins (like the original ChatGPT Plugins) defined a manifest file and an OpenAPI spec that the AI platform loaded. This was a step toward standardization but was platform-specific: a ChatGPT plugin did not work in Claude, and vice versa.
MCP inverted the architecture. Instead of the AI client defining tools (function calling) or the AI platform hosting plugins (custom plugins), the tool itself becomes an independent server that any client can connect to. Build the server once, publish it, and every MCP-compatible client discovers and uses it without integration work from either side.
The protocol won because of network effects. Once Claude, ChatGPT, Cursor, VS Code, and 540+ other clients supported MCP, the incentive for tool builders to publish MCP servers became overwhelming: one implementation reaches all clients. As of April 2026, MCP is the de facto standard. There is no competing protocol with comparable backing (the AAIF has eight platinum members including every major AI company). For the full analysis of MCP's ecosystem growth, see our 50 best MCP servers guide.
2. The Architecture: Hosts, Clients, Servers, and Transports
MCP has three participants. The Host is the AI application (Claude Desktop, VS Code, Cursor) that the user interacts with. The Client is a connector within the host that maintains a connection to one MCP server. The Server is the program you build that provides tools, resources, or prompts to the client - MCP Architecture.
One host can create multiple clients, each connected to a different server. Claude Desktop might have one client connected to a GitHub MCP server, another to a database server, and a third to a web search server. Each client-server pair is independent.
The protocol has two layers. The Data Layer defines the message format (JSON-RPC 2.0), the lifecycle (initialize, discover, call, respond), and the primitives (tools, resources, prompts). The Transport Layer defines how messages physically travel between client and server. There are two production transports: stdio (local, runs as a subprocess, single client) and Streamable HTTP (remote, runs as a web service, multiple clients).
The lifecycle is straightforward, but understanding each step helps you debug issues when your server does not appear in the client.
Step 1: Initialize. The client sends an initialize request with its protocol version, its capabilities (what features it supports), and client info (name, version). Your server responds with the protocol version it will use (the highest version both sides support), its own capabilities, and server info. This negotiation is critical: if your server advertises "tools": {"listChanged": true}, the client knows it can subscribe to tool list change notifications.
{
"jsonrpc": "2.0", "id": 1, "method": "initialize",
"params": {
"protocolVersion": "2025-06-18",
"capabilities": {},
"clientInfo": { "name": "Claude Desktop", "version": "1.0.0" }
}
}
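The negotiation rule ("the highest version both sides support") can be sketched in a few lines of Python. This is a simplified illustration of the spec's rule, not the SDK's actual implementation; the version list is hypothetical:

```python
# Simplified sketch of MCP protocol version negotiation.
# Version strings are dates (YYYY-MM-DD), so lexicographic order works.
SERVER_SUPPORTED = ["2024-11-05", "2025-03-26", "2025-06-18"]

def negotiate(client_version: str) -> str:
    """Return the version the server will use for this session."""
    if client_version in SERVER_SUPPORTED:
        return client_version  # exact match: use the client's version
    # Otherwise respond with the server's latest; the client may disconnect
    # if it cannot speak that version.
    return max(SERVER_SUPPORTED)

print(negotiate("2025-06-18"))  # → 2025-06-18
print(negotiate("2026-01-01"))  # → 2025-06-18
```

In practice the SDKs handle this for you; the sketch just makes the handshake's failure mode concrete: a client that cannot accept the server's counter-offer disconnects, and your server never appears in the UI.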
Step 2: Initialized. The client sends a notifications/initialized notification (no id, no response expected). This tells the server that the client is ready to proceed.
Step 3: Discover. The client calls tools/list to get all available tools, resources/list for resources, and prompts/list for prompts. Each response includes the name, description, and schema for every capability. The model reads these descriptions to decide which tools to use.
Step 4: Use. When the model decides to use a tool, the client sends tools/call with the tool name and arguments. Your server executes the operation and returns the result. The result can be text, images, or resource links.
Step 5: Ongoing. The connection stays open. The model can call tools multiple times. Your server can send notifications (e.g., notifications/tools/list_changed if you add new tools dynamically). The connection ends when the client disconnects or the host application closes.
Every message follows JSON-RPC 2.0. Requests have an id, a method, and params. Responses have the same id and either a result or an error. Notifications have a method but no id (fire-and-forget, no response expected) - Portkey.
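The three message shapes, written out as Python dicts (illustrative bodies; the `id` values and empty params are made up):

```python
import json

# The three JSON-RPC 2.0 message shapes used by MCP.
request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}
response = {"jsonrpc": "2.0", "id": 2, "result": {"tools": []}}
notification = {"jsonrpc": "2.0", "method": "notifications/initialized"}

# Requests and responses are correlated by id; notifications have none.
assert request["id"] == response["id"]
assert "id" not in notification

print(json.dumps(notification))
```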
Understanding this lifecycle helps with debugging. If your server appears in Claude's settings but no tools show up, the issue is likely in the tools/list response (your server is not returning tools correctly). If the server does not appear at all, the issue is in the initialize handshake (your server is not starting or is crashing during initialization). Check Claude's MCP logs (Developer menu > Open MCP Log) for the exact error.
3. The Three Primitives: Tools, Resources, and Prompts
MCP servers expose three types of capabilities to clients. Understanding when to use each is the first design decision you make.
Tools are executable functions that the AI can invoke. When the model decides it needs to search the web, query a database, or send an email, it calls a tool. Tools are the most common primitive and what most MCP servers expose. Each tool has a name, a description (that the model reads to decide when to use it), an input schema (JSON Schema defining the expected arguments), and a handler function that executes the operation and returns results.
The description is critical: it is the primary mechanism by which the AI decides whether to use your tool. A description like "Search" is useless. A description like "Search the web for current information. Returns up to 10 results with title, URL, and snippet. Use when the user asks about recent events, live data, or information that may have changed since training" tells the model exactly when your tool is relevant. As we explored in our guide to how Claude Code works, Anthropic's own production agent uses tool descriptions extensively to guide model behavior.
Resources are data sources that provide contextual information. Unlike tools (which do something), resources represent information that exists and can be read. File contents, database records, API responses, and configuration data are all resources. Each resource has a URI (which can be a template for dynamic resources), a read handler, and optional metadata (MIME type, description).
Resources are useful when the AI needs to understand context before taking action. A coding agent might read a project's configuration file (resource) before deciding which tool to call. A customer support agent might read a customer's profile (resource) before responding to their query.
Prompts are reusable templates that structure interactions with the LLM. They define system prompts, few-shot examples, or multi-turn conversation patterns that the AI can use. Prompts are the least commonly used primitive but valuable for servers that want to influence how the AI uses their tools.
For your first MCP server, start with tools. They are the most immediately useful and the easiest to understand. Add resources when your server needs to provide context. Add prompts when you want to shape the AI's interaction pattern.
Tool Description Design: The Most Important Decision
The tool description is not documentation for humans. It is an instruction for the AI model that determines when your tool gets called. A bad description means your tool is never used (the model does not understand when it is relevant) or used incorrectly (the model misunderstands what it does).
The anatomy of a good tool description has four elements. First, a one-sentence summary of what the tool does. Second, what inputs it expects and what they mean. Third, what output to expect. Fourth, when the model should and should not use this tool.
Compare these two descriptions for the same email finder tool:
Bad: "Find email. Takes a name and company."
Good: "Find the work email address for a person at a specific company. Returns the email if found, or an error if no match exists. Use when the user provides a person's name and their company, and needs their email for outreach. Do not use for personal email addresses or if the user already has the email."
The good description is 4x longer but makes the model's decision-making dramatically better. The model knows exactly when to call the tool, what to pass, and what to expect back. This is the single highest-leverage investment in your MCP server's usefulness.
As we documented in our leaked Claude Code source analysis, Anthropic's own production agent uses detailed tool descriptions with explicit when-to-use and when-not-to-use instructions. The 18 built-in tools in Claude Code each have multi-paragraph descriptions. If Anthropic invests this much in tool descriptions for their own agent, you should too.
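In FastMCP, the good description above maps directly onto code: it simply becomes the tool's docstring. A stub sketch (the `find_email` function and its email construction are hypothetical, for illustration only):

```python
def find_email(name: str, company: str) -> str:
    """Find the work email address for a person at a specific company.

    Returns the email if found, or an error if no match exists. Use when
    the user provides a person's name and their company and needs their
    email for outreach. Do not use for personal email addresses or if the
    user already has the email.
    """
    # Hypothetical stub; a real implementation would call an enrichment API.
    return f"{name.split()[0].lower()}@{company.lower()}.com"

print(find_email("Ada Lovelace", "Example"))  # → ada@example.com
```

FastMCP would expose that docstring verbatim as the tool description, so the when-to-use and when-not-to-use guidance reaches the model with no extra work.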
Tool Annotations (New in 2025)
The March 2025 spec update introduced tool annotations: metadata that describes a tool's behavior without affecting the model's context. Two annotations are currently defined:
- readOnlyHint: true: The tool does not modify any state. Clients can use this to skip confirmation prompts for read-only operations.
- destructiveHint: true: The tool modifies or deletes data. Clients should prompt for confirmation before executing.
@mcp.tool(annotations={"readOnlyHint": True})
def search_contacts(query: str) -> str:
    """Search the CRM for contacts matching a query."""
    # read-only, no confirmation needed
    ...

@mcp.tool(annotations={"destructiveHint": True})
def delete_contact(contact_id: str) -> str:
    """Permanently delete a contact from the CRM."""
    # destructive, client should confirm
    ...
Annotations are optional but improve the user experience in clients that support them. Claude Desktop uses destructiveHint to show a confirmation dialog before executing destructive tools.
4. Building Your First MCP Server (Python)
The Python MCP SDK includes FastMCP, a high-level framework that turns Python functions into MCP tools with decorators. FastMCP powers approximately 70% of all MCP servers across all languages - FastMCP.
Install the SDK:
pip install mcp
Create a file called server.py:
from mcp.server.fastmcp import FastMCP

# Create a server with a name (shown in the client UI)
mcp = FastMCP("my-first-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together.

    Args:
        a: The first number
        b: The second number
    """
    return a + b

@mcp.tool()
def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: The city name (e.g., San Francisco, London, Tokyo)
    """
    # In a real server, this would call a weather API
    return f"The weather in {city} is 72°F and sunny."

@mcp.resource("greeting://{name}")
def get_greeting(name: str) -> str:
    """Get a personalized greeting."""
    return f"Hello, {name}! Welcome to my first MCP server."

if __name__ == "__main__":
    mcp.run(transport="stdio")
That is a complete MCP server. FastMCP automatically generates the tool definitions (name, description, input schema) from your function signatures and docstrings. Type hints become JSON Schema types. Docstring descriptions become tool descriptions. The @mcp.tool() decorator registers the function as a callable tool. The @mcp.resource() decorator registers a URI-template resource - MCP Build Server.
The mcp.run(transport="stdio") line starts the server using stdio transport, which means it communicates via stdin/stdout. This is the transport used for local servers that run as subprocesses of the AI application.
Let us walk through what FastMCP does with your code. When you decorate add with @mcp.tool(), FastMCP inspects the function signature: a: int, b: int becomes a JSON Schema with two required integer properties. The docstring becomes the tool description. The function name add becomes the tool name. When a client calls tools/list, your server responds with:
{
"tools": [{
"name": "add",
"description": "Add two numbers together.\n\nArgs:\n a: The first number\n b: The second number",
"inputSchema": {
"type": "object",
"properties": {
"a": {"type": "integer", "description": "The first number"},
"b": {"type": "integer", "description": "The second number"}
},
"required": ["a", "b"]
}
}]
}
This automatic schema generation is why FastMCP powers 70% of MCP servers: you write Python functions with type hints and docstrings, and the MCP protocol layer is handled for you. No JSON Schema authoring, no manual tool registration, no boilerplate.
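The introspection FastMCP performs is roughly the following. This is a simplified sketch of the idea, not the SDK's actual code; real FastMCP also parses Args sections into per-parameter descriptions, handles defaults, and supports Pydantic models:

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python annotations to JSON Schema types.
PY_TO_JSON = {int: "integer", str: "string", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Sketch of signature -> JSON Schema, docstring -> description."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    props = {name: {"type": PY_TO_JSON[tp]} for name, tp in hints.items()}
    sig = inspect.signature(fn)
    required = [n for n, p in sig.parameters.items()
                if p.default is inspect.Parameter.empty]
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "inputSchema": {"type": "object", "properties": props,
                        "required": required},
    }

def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

schema = tool_schema(add)
print(schema["inputSchema"]["required"])  # → ['a', 'b']
```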
For more complex types, FastMCP handles Pydantic models, enums, optional fields, and nested objects. A tool that takes a Pydantic model as input:
from pydantic import BaseModel, Field

class SearchQuery(BaseModel):
    query: str = Field(description="The search query string")
    max_results: int = Field(default=10, description="Maximum results to return")
    include_snippets: bool = Field(default=True, description="Include text snippets")

@mcp.tool()
def search(params: SearchQuery) -> str:
    """Search for documents matching the query."""
    return f"Found {params.max_results} results for '{params.query}'"
FastMCP converts the Pydantic model into the correct JSON Schema with defaults, constraints, and descriptions. This pattern scales to arbitrarily complex input schemas without manual JSON Schema authoring.
5. Building Your First MCP Server (TypeScript)
The TypeScript SDK uses Zod for schema validation and provides a class-based API.
Initialize a project:
mkdir my-mcp-server && cd my-mcp-server
npm init -y
npm install @modelcontextprotocol/sdk zod
Create src/index.ts:
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "my-first-server",
  version: "1.0.0",
});

// Register a tool
server.registerTool(
  "calculate-bmi",
  {
    title: "BMI Calculator",
    description: "Calculate Body Mass Index from weight and height",
    inputSchema: {
      weightKg: z.number().describe("Weight in kilograms"),
      heightM: z.number().describe("Height in meters"),
    },
  },
  async ({ weightKg, heightM }) => {
    const bmi = weightKg / (heightM * heightM);
    return {
      content: [{ type: "text", text: `BMI: ${bmi.toFixed(1)}` }],
    };
  }
);

// Register a resource
server.registerResource(
  "config",
  "config://app",
  {
    title: "App Config",
    description: "Application configuration",
    mimeType: "application/json",
  },
  async (uri) => ({
    contents: [{
      uri: uri.href,
      text: JSON.stringify({ version: "1.0.0", debug: false }),
    }],
  })
);

// Connect via stdio transport
const transport = new StdioServerTransport();
await server.connect(transport);
The Zod schemas serve double duty: they define the JSON Schema that the AI model sees (via .describe() annotations), and they validate the inputs your handler receives. If the model sends invalid arguments, the SDK rejects them before your code runs - TypeScript SDK Docs.
Add a build step and a bin entry to package.json for publishing:
{
"name": "my-mcp-server",
"version": "1.0.0",
"bin": { "my-mcp-server": "dist/index.js" },
"scripts": { "build": "tsc" }
}
6. Testing Locally with Claude Desktop, Cursor, and VS Code
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"my-first-server": {
"command": "python",
"args": ["/absolute/path/to/server.py"]
}
}
}
For TypeScript (after building):
{
"mcpServers": {
"my-first-server": {
"command": "node",
"args": ["/absolute/path/to/dist/index.js"]
}
}
}
Restart Claude Desktop. You should see your server's tools in the tools icon (hammer) in the chat input area. Type "What is 2 + 3?" and Claude will call your add tool - Claude Support.
Debugging connection issues: The most common reason your server does not appear is a path error in the config. Always use absolute paths, not relative ones. ~/my-server/server.py does not work on all systems; use /Users/yourname/my-server/server.py instead. The second most common issue is missing dependencies: if your server imports a package that is not installed in the Python environment that Claude launches, the server crashes silently during initialization.
Check Claude's logs (Developer menu > Open MCP Log) for the exact error. The log shows the JSON-RPC messages being exchanged, which tells you exactly where the handshake fails. If you see an initialize request but no response, your server is crashing during startup. If you see the initialize handshake complete but tools/list returns empty, your tools are not being registered (check your decorators).
Another subtle issue: if your Python server writes anything to stdout (print statements, logging to stdout), it corrupts the JSON-RPC stream and the connection dies. Use sys.stderr.write() or configure Python logging to stderr. This is the #1 cause of "server connects but then disconnects immediately."
If you are using uv (recommended for Python projects), the config looks like:
{
"mcpServers": {
"my-first-server": {
"command": "uv",
"args": ["--directory", "/absolute/path/to/project", "run", "server.py"]
}
}
}
Cursor
Create .cursor/mcp.json in your project root (or ~/.cursor/mcp.json for global):
{
"mcpServers": {
"my-first-server": {
"command": "python",
"args": ["/absolute/path/to/server.py"]
}
}
}
Note the root key is mcpServers (same as Claude) - TrueFoundry.
VS Code
Create .vscode/mcp.json in your project:
{
"servers": {
"my-first-server": {
"command": "python",
"args": ["/absolute/path/to/server.py"]
}
}
}
Note the root key is servers (NOT mcpServers). This inconsistency between clients is one of the most common configuration mistakes. Claude Desktop and Cursor use mcpServers. VS Code uses servers. If your server does not appear, check the key name first.
MCP tools only work in Copilot's Agent mode (not inline completions or chat). To use your MCP tools in VS Code, open Copilot Chat and select Agent mode from the dropdown. Then type your prompt, and Copilot will discover and call your MCP tools as needed - VS Code Docs.
The .vscode/mcp.json file can be committed to your repository, which means team members get the MCP server configuration automatically when they clone the project. This is a significant advantage over Claude Desktop's per-user configuration: it enables team-wide MCP server sharing through source control. Include the MCP config in your project's setup documentation so new team members get the servers automatically.
Environment Variables in MCP Configuration
All three clients support passing environment variables to MCP servers. This is how you provide API keys and configuration without hardcoding them:
{
"mcpServers": {
"my-server": {
"command": "python",
"args": ["server.py"],
"env": {
"WEATHER_API_KEY": "your-key-here",
"DATABASE_URL": "postgresql://localhost/mydb"
}
}
}
}
The env object is merged with the host process's environment, so your server's os.environ.get("WEATHER_API_KEY") picks up the value. Never commit actual API keys in configuration files. Use placeholder values and document which environment variables need to be set. For team configurations committed to Git, reference a .env file or secret manager instead of inline values.
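A small guard at server startup makes missing configuration fail loudly at launch instead of crashing mid-request (the `require_env` helper is hypothetical, a sketch of the pattern):

```python
import os

def require_env(name: str) -> str:
    """Read a required environment variable, failing fast with a clear
    message instead of crashing later mid-request."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing required environment variable {name}. "
            f"Set it in the 'env' block of your MCP client config."
        )
    return value

# Hypothetical usage at server startup:
# WEATHER_API_KEY = require_env("WEATHER_API_KEY")
```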
7. Testing with MCP Inspector
The MCP Inspector is a visual testing tool that lets you interact with your server without an AI client. It shows all exposed tools, resources, and prompts, and lets you call them with custom arguments.
Run it directly (no installation needed):
npx @modelcontextprotocol/inspector python server.py
This opens a web interface at http://localhost:6274 where you can see your server's capabilities and test each tool individually. For verbose logging:
DEBUG=true npx @modelcontextprotocol/inspector python server.py
The Inspector also supports a CLI mode for scripted testing, which is useful for CI/CD pipelines - MCP Inspector.
The Inspector is the most important debugging tool in your MCP development workflow. Use it before connecting to Claude because it shows you exactly what the client sees: the tool definitions, the input schemas, the response format. If something looks wrong in the Inspector, it will be wrong in Claude too, but the Inspector gives you better error messages.
A Practical Testing Workflow
The recommended development workflow for MCP servers is:
- Write your tool. Define the function, type hints, and docstring.
- Test with Inspector. Run npx @modelcontextprotocol/inspector python server.py, check that the tool appears with the correct name and schema, and call it with test arguments. Verify the response format is clean.
- Test with Claude Desktop. Add to claude_desktop_config.json, restart Claude, and try natural language prompts that should trigger your tool. Verify the model calls the tool when expected and handles the response correctly.
- Test edge cases. Try invalid inputs, missing arguments, network failures (for API-backed tools), and large responses. Verify your error handling works and that error messages are useful to the model.
- Test in production context. If your server will be used in multi-step agent workflows, test it in that context: does it work when called 50 times in sequence? Does it handle concurrent calls if running as a remote server?
This workflow catches issues at each level: schema issues in step 2, description quality issues in step 3, robustness issues in step 4, and production issues in step 5. Skipping any step means the issue surfaces in production where debugging is harder.
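If your tool bodies are plain Python functions, the edge-case step can be automated with ordinary unit tests that run without any MCP client at all. A sketch, using a hypothetical `get_weather` stub with basic input validation:

```python
# Edge-case tests for a tool function, runnable without any MCP client.
def get_weather(city: str) -> str:
    """Get the current weather for a city (stub for illustration)."""
    if not city or not city.strip():
        raise ValueError("city must be a non-empty string")
    return f"The weather in {city} is 72°F and sunny."

# Happy path
assert "Tokyo" in get_weather("Tokyo")

# Edge case: whitespace-only input should produce a clear, actionable error
try:
    get_weather("   ")
except ValueError as e:
    assert "non-empty" in str(e)
```

Keeping the MCP layer thin (decorators over plain functions) is what makes this possible: the protocol plumbing and the business logic are tested separately.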
8. Transport Deep Dive: stdio vs Streamable HTTP
The transport you choose determines how your server runs and who can use it.
stdio is the simplest transport. The MCP client launches your server as a subprocess and communicates via stdin/stdout. This means your server runs locally on the user's machine, serves exactly one client, and starts/stops with the host application. It is the default for MCP servers distributed via npm or PyPI.
Critical rule for stdio: never write to stdout for anything other than JSON-RPC messages. console.log calls, print statements, and debug output MUST go to stderr. A single stray stdout line corrupts the JSON-RPC stream and kills the connection. In Python, use sys.stderr.write() for logging. In TypeScript, use console.error() or redirect to a file.
Streamable HTTP is the modern transport for remote servers. Introduced in the March 2025 spec update, it replaced the deprecated SSE transport. Your server runs as a web service (typically on port 3000-8080) and accepts HTTP POST requests on a single endpoint (conventionally /mcp). Responses can be plain JSON or upgraded to Server-Sent Events for streaming - MCP Transports.
Streamable HTTP enables multi-client access (one server serves your entire team), remote deployment (run on a cloud server, accessed over the network), and proper authentication (OAuth 2.1). The trade-off is operational complexity: you need to host the server, manage uptime, and handle authentication.
For your first server: use stdio. It is simpler, requires no hosting, and works with all major clients. Switch to Streamable HTTP when you need remote access or multi-user support - MCPCat Transport Comparison.
When to Switch to Streamable HTTP
The decision to go remote is driven by three use cases. First, team access: if multiple developers on your team need to use the same MCP server, a shared remote server is better than each person running their own local instance (which requires separate configuration, separate API keys, and separate updates).
Second, heavy infrastructure: if your MCP server needs a database connection, a GPU, or a large language model running locally, it makes more sense to host it on a cloud server where those resources are available, rather than requiring every user to set up the same infrastructure on their machine.
Third, public distribution: if you are distributing your MCP server to users who are not developers (or who do not want to manage local processes), a remote server with a simple URL is dramatically easier to configure than a local command that requires npm, Python, and environment variables.
The migration from stdio to Streamable HTTP in Python (FastMCP) is one line:
# Local (stdio)
mcp.run(transport="stdio")
# Remote (Streamable HTTP)
mcp.run(transport="streamable-http", host="0.0.0.0", port=8080)
In TypeScript, you swap the transport class:
// Local (stdio)
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
const transport = new StdioServerTransport();
// Remote (Streamable HTTP)
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
// sessionIdGenerator: undefined runs the transport in stateless mode;
// wire transport.handleRequest into your HTTP framework's POST /mcp route
const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
Once your server is running on a remote host, clients configure it with a URL instead of a command:
{
"mcpServers": {
"my-remote-server": {
"url": "https://my-server.example.com/mcp",
"transport": "streamable-http"
}
}
}
The Deprecated SSE Transport
If you see references to SSEServerTransport or two-endpoint configurations (one POST endpoint for requests, one SSE endpoint for responses), these are from the pre-March 2025 spec. SSE transport was deprecated in favor of Streamable HTTP. Do not build new servers with SSE. If you are maintaining an existing SSE server, migrate to Streamable HTTP. The protocol differences are minimal (Streamable HTTP combines both endpoints into one), but SSE will eventually be removed from the spec - Blog.fka.dev.
9. Authentication: OAuth 2.1 for Remote Servers
If your server uses Streamable HTTP (remote access), the MCP spec mandates OAuth 2.1 for authentication. This is not optional for production remote servers. The spec requires PKCE (Proof Key for Code Exchange) with the S256 method, prohibits the implicit flow, and requires exact redirect URI matching - MCP Authorization.
In practice, this means your server needs to implement an OAuth authorization endpoint, a token endpoint, and token validation on every request. The MCP client handles the OAuth dance (redirect, code exchange, token storage) and sends the Bearer token with every JSON-RPC request.
For internal or team tools where OAuth is overkill, simpler authentication (API keys, pre-shared secrets) is acceptable but not spec-compliant. Use API keys for internal tools and OAuth for anything public-facing or security-sensitive - Stack Overflow.
Implementing OAuth from scratch is a significant engineering investment. You need an authorization endpoint, a token endpoint, PKCE validation, token refresh logic, and Bearer token validation on every request. For most first-time MCP server builders, this is unnecessary complexity.
There are three practical approaches to authentication, ranked by complexity:
Option 1: No auth (simplest). For local stdio servers that only you use, authentication is unnecessary. The transport security comes from the fact that the server runs as a subprocess on your machine and is not network-accessible.
Option 2: API key auth (moderate). For internal team servers, pass an API key via environment variable. The client configuration includes the key, and your server validates it on every request. This is not spec-compliant for public remote servers, but it works for internal use.
{
"mcpServers": {
"my-server": {
"url": "https://my-server.example.com/mcp",
"transport": "streamable-http",
"env": { "API_KEY": "your-secret-key" }
}
}
}
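On the server side, validate the key with a constant-time comparison to avoid timing side channels. A sketch (the `check_api_key` helper is hypothetical; the expected key comes from the server's environment, never from source code):

```python
import hmac
import os

def check_api_key(provided: str) -> bool:
    """Validate an API key in constant time.

    Hypothetical helper for an internal Streamable HTTP server. Returns
    False if no key is configured, so an unset variable fails closed.
    """
    expected = os.environ.get("MCP_API_KEY", "")
    return bool(expected) and hmac.compare_digest(provided, expected)
```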
Option 3: OAuth 2.1 (production). For public remote servers, implement the full OAuth flow. Cloudflare Workers provides the simplest path: their workers-oauth-provider library handles the entire OAuth dance (authorization endpoint, token endpoint, PKCE, redirect handling):
npm create cloudflare@latest -- my-mcp-server \
--template=cloudflare/ai/demos/remote-mcp-github-oauth
This creates a remote MCP server with GitHub OAuth pre-configured, deployable to *.workers.dev in minutes - Cloudflare.
For non-Cloudflare deployments, libraries like authlib (Python) and oidc-provider (Node.js) can handle the OAuth server implementation. The MCP spec requires PKCE with S256, exact redirect URI matching, and short-lived scoped tokens. For a detailed walkthrough of the OAuth requirements, see the MCP authorization tutorial - MCP Auth Tutorial.
The June 2025 spec update added incremental scope requests: your server can request additional permissions mid-session if the user's task requires them. For example, a GitHub server might start with read-only access and request write access only when the user asks to create a pull request. This progressive permission model improves user trust because the initial grant is minimal.
10. Error Handling That Helps the AI Recover
MCP distinguishes two types of errors, and handling them correctly is the difference between a server that works and one that breaks agent workflows.
Protocol-level errors are standard JSON-RPC errors (wrong method, invalid params, server crash). These are captured by the MCP client and typically discarded. The AI model never sees them. They are for debugging, not for the model.
Tool execution errors are returned as successful JSON-RPC responses with isError: true in the result. These ARE injected into the model's context, which means the AI can read the error, understand what went wrong, and decide what to do next (retry with different arguments, try a different tool, or ask the user for help).
import os

@mcp.tool()
def read_file(path: str) -> str:
    """Read a file from the filesystem."""
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        # This gets injected into the model's context
        raise ValueError(f"File not found: {path}. Available files: {os.listdir('.')}")
    except PermissionError:
        raise ValueError(f"Permission denied: {path}. The server can only read files in the working directory.")
The error message should be descriptive and actionable. "Error" is useless. "File not found: config.yaml. Available files: main.py, requirements.txt, README.md" tells the model exactly what happened and what alternatives exist. The model can then ask the user "I couldn't find config.yaml, but I see main.py and README.md. Which one did you mean?" - MCPCat Error Handling.
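For clarity, here is the difference on the wire. A sketch of both response shapes (the `id` values are arbitrary):

```python
# A tool execution error: the JSON-RPC envelope is a normal success,
# and the failure travels inside `result` where the model can see it.
tool_error_response = {
    "jsonrpc": "2.0",
    "id": 7,
    "result": {
        "content": [{
            "type": "text",
            "text": "File not found: config.yaml. Available files: main.py, README.md",
        }],
        "isError": True,
    },
}

# A protocol-level error uses the JSON-RPC `error` member instead,
# and never reaches the model's context.
protocol_error_response = {
    "jsonrpc": "2.0",
    "id": 8,
    "error": {"code": -32602, "message": "Invalid params"},
}
```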
The Error Hierarchy Pattern
For MCP servers that call external APIs, implement a three-level error hierarchy that gives the model progressively more information:
Level 1: Actionable recovery. If the error is recoverable (wrong parameter, missing field, rate limit), return a clear message that tells the model how to fix it.
@mcp.tool()
def search_contacts(query: str, limit: int = 10) -> str:
    if limit > 100:
        raise ValueError(
            f"Limit {limit} exceeds maximum of 100. "
            "Please use a limit between 1 and 100."
        )
    if not query.strip():
        raise ValueError(
            "Search query cannot be empty. "
            "Provide a name, email, company, or keyword to search for."
        )
    # ... actual search
Level 2: Informative fallback. If the error is not recoverable by the model (external API down, authentication failure), explain what happened and suggest the user take action.
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 401:
        raise ValueError(
            "Authentication failed. The API key may have expired. "
            "Ask the user to check their API key in the server configuration."
        )
    if e.response.status_code == 429:
        raise ValueError(
            "Rate limit exceeded. The server is receiving too many requests. "
            "Wait 60 seconds before trying again."
        )
Level 3: Generic fallback. For unexpected errors, return a safe message that does not expose internals.
except Exception:
    raise ValueError(
        "An unexpected error occurred while searching contacts. "
        "Please try again. If the issue persists, ask the user to check server logs."
    )
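Stitched together, the three levels nest inside a single tool. A self-contained sketch using the stdlib `urllib` instead of `requests`; the endpoint URL is hypothetical:

```python
import urllib.error
import urllib.parse
import urllib.request

def search_contacts(query: str, limit: int = 10) -> str:
    """All three error levels in one tool (sketch; endpoint URL is hypothetical)."""
    # Level 1: actionable recovery -- validate inputs before doing any work
    if not query.strip():
        raise ValueError(
            "Search query cannot be empty. "
            "Provide a name, email, company, or keyword to search for."
        )
    if limit > 100:
        raise ValueError(f"Limit {limit} exceeds maximum of 100. Use a limit between 1 and 100.")
    url = "https://api.example.com/contacts?" + urllib.parse.urlencode(
        {"q": query, "limit": limit}
    )
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as e:
        # Level 2: informative fallback for errors the model cannot fix itself
        if e.code == 401:
            raise ValueError("Authentication failed. Ask the user to check their API key.")
        if e.code == 429:
            raise ValueError("Rate limit exceeded. Wait 60 seconds before trying again.")
        raise ValueError(f"The contacts API returned HTTP {e.code}.")
    except Exception:
        # Level 3: generic fallback -- expose no internals
        raise ValueError(
            "An unexpected error occurred while searching contacts. Please try again."
        )
```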
Never include stack traces, API keys, internal URLs, or database connection strings in error messages. The model sees everything you return, and model conversations can be logged, shared, or exported. Sensitive information in error messages is a data leak waiting to happen - Alpic AI.
Timeout Handling
MCP tools that call external APIs should implement timeouts. An AI agent waiting indefinitely for a tool response blocks the entire conversation. Set reasonable timeouts (5-30 seconds depending on the operation) and return an informative error when they expire:
import httpx

@mcp.tool()
async def fetch_data(url: str) -> str:
    """Fetch data from a URL with a 10-second timeout."""
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.get(url)
            return response.text
    except httpx.TimeoutException:
        raise ValueError(
            f"Request to {url} timed out after 10 seconds. "
            "The server may be slow or unavailable. Try a different URL or try again later."
        )
The 10-second timeout is a good default for most API calls. For long-running operations (file conversion, data processing), increase the timeout but also consider returning partial results or a progress indicator. The MCP spec does not currently support streaming within a single tool call response, but the experimental Tasks primitive (added in November 2025) enables a "call now, fetch later" pattern for long-running operations.
11. Security: The OWASP MCP Top 10 and How to Avoid Them
Security is not optional. AgentSeal scanned 1,808 MCP servers and found 66% had security findings. Among 2,614 implementations analyzed by Endor Labs, 82% use file operations prone to path traversal and 67% have APIs related to code injection - AgentSeal, Endor Labs.
OWASP published an MCP-specific Top 10 in 2026 - OWASP MCP Top 10. Here are the most critical items and how to address them in your server.
MCP01: Token Mismanagement and Secret Exposure
The problem: 53% of MCP servers use hard-coded credentials. API keys in source code, environment variables exposed in error messages, and secrets committed to Git repos.
The fix: Never hard-code credentials. Read API keys from environment variables (os.environ.get("API_KEY")). Never include secrets in error messages. Use .env files locally and secret managers (AWS Secrets Manager, 1Password) in production. Run git-secrets or trufflehog before every commit.
MCP05: Command Injection and Execution
The problem: 43% of all disclosed MCP vulnerabilities are shell/exec injection. If your server passes user-controlled input to subprocess.run(), os.system(), or exec() without sanitization, an attacker can execute arbitrary commands.
The fix: Never pass raw user input to shell commands. Use parameterized functions instead of string interpolation. If you must run shell commands, use subprocess.run() with shell=False and pass arguments as a list (not a string). Validate all inputs against an expected pattern before execution.
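A sketch of that pattern, assuming a hypothetical tool that exposes `git log` (the allowed-character pattern is an illustration; tighten it for your own inputs):

```python
import re
import subprocess

def git_log(branch: str) -> str:
    """Show recent commits on a branch without ever touching a shell (sketch)."""
    # Validate against an expected pattern before execution; also reject
    # leading dashes so input cannot be interpreted as a git option
    if branch.startswith("-") or not re.fullmatch(r"[A-Za-z0-9._/-]+", branch):
        raise ValueError(f"Invalid branch name: {branch!r}")
    # shell=False (the default) plus a list of arguments means the branch
    # name is passed as data, never interpreted by a shell
    result = subprocess.run(
        ["git", "log", "--oneline", "-5", branch],
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout
```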
MCP03: Tool Poisoning
The problem: Attackers embed malicious instructions in MCP tool descriptions that get injected into the LLM's context. These hidden instructions can redirect the AI to exfiltrate data, call different tools, or ignore user instructions. Invariant Labs demonstrated SSH key exfiltration from Claude Desktop and Cursor using this technique - Practical DevSecOps.
The fix: If your server connects to third-party MCP servers, sanitize their tool descriptions before presenting them to the model. Use mcp-scan (Invariant Labs) to hash tool manifests and detect changes. For your own server: keep tool descriptions factual and descriptive. Do not include instructions addressed to the AI.
Security Scanning Before Publishing
Run these tools before every publish:
# Invariant Labs mcp-scan (like npm audit for MCP)
npx @anthropic-ai/mcp-scan
# npm audit for dependency vulnerabilities
npm audit
# Snyk for deeper analysis
npx snyk test
For Python servers:
pip install safety
safety check
# Or use Enkrypt AI's MCP-specific scanner
pip install enkrypt-mcp-scan
enkrypt-mcp-scan server.py
The mcp-scan tool from Invariant Labs hashes your tool manifests and runs both local and cloud guardrails to detect prompt injection, tool poisoning, and toxic data flows. Think of it as npm audit for MCP security - Invariant Labs.
Path Traversal: The #1 MCP Vulnerability
Path traversal is present in approximately 22% of tested MCP servers and is by far the most common vulnerability class. It occurs when your server accepts a file path as input and reads or writes to it without validating that the path stays within the intended directory.
# VULNERABLE: accepts any path
@mcp.tool()
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

# SECURE: validates path is within allowed directory
import os

ALLOWED_DIR = "/home/user/documents"

@mcp.tool()
def read_file(path: str) -> str:
    base = os.path.realpath(ALLOWED_DIR)
    resolved = os.path.realpath(os.path.join(ALLOWED_DIR, path))
    # Compare against base + separator so a sibling directory like
    # /home/user/documents-evil cannot slip past a plain prefix check
    if resolved != base and not resolved.startswith(base + os.sep):
        raise ValueError(f"Access denied: path must be within {ALLOWED_DIR}")
    with open(resolved) as f:
        return f.read()
The vulnerable version allows an AI agent (or an attacker who injects instructions via tool poisoning) to read /etc/passwd, ~/.ssh/id_rsa, or any other file on the system. The secure version resolves the full path, checks it is within the allowed directory, and rejects anything outside. This is the single most important security pattern for MCP servers that handle files.
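On Python 3.9+, `pathlib` expresses the same containment check more directly. A sketch (`ALLOWED_DIR` mirrors the example above and is illustrative):

```python
from pathlib import Path

ALLOWED_DIR = Path("/home/user/documents")

def safe_resolve(user_path: str) -> Path:
    """Resolve a user-supplied path and verify it stays inside ALLOWED_DIR."""
    # An absolute user_path replaces ALLOWED_DIR entirely in the join,
    # so it also fails the containment check below
    resolved = (ALLOWED_DIR / user_path).resolve()
    if not resolved.is_relative_to(ALLOWED_DIR.resolve()):
        raise ValueError(f"Access denied: path must be within {ALLOWED_DIR}")
    return resolved
```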
Credential Isolation
The third most common vulnerability is credential exposure. Your MCP server needs API keys to call external services, and those keys must never appear in tool responses, error messages, or logs that the AI model sees.
import os
import requests

API_KEY = os.environ.get("WEATHER_API_KEY")

@mcp.tool()
def get_weather(city: str) -> str:
    try:
        response = requests.get(
            f"https://api.weather.com/v1/{city}",
            headers={"Authorization": f"Bearer {API_KEY}"},
        )
        return response.json()["summary"]
    except Exception:
        # NEVER include the API key or full error in the response
        return f"Weather lookup failed for {city}. Please try again."
The error handler returns a user-friendly message, not the raw exception (which might contain the API key in the URL or headers). Use .env files for local development and proper secret managers (AWS Secrets Manager, 1Password, Vault) for production. Never commit .env files to Git.
The SlowMist MCP Security Checklist
SlowMist, a blockchain security firm, published a comprehensive MCP security checklist based on their audit experience. The most actionable items for first-time MCP server developers:
- Validate all tool inputs against expected types and ranges
- Sanitize file paths to prevent directory traversal
- Never expose internal errors or stack traces to the model
- Use environment variables for secrets, never hard-code
- Implement rate limiting to prevent abuse from runaway agents
- Log all tool invocations for audit trails
- Test with adversarial inputs (path traversal strings, SQL injection payloads, prompt injection attempts)
The full checklist is available on GitHub - SlowMist GitHub.
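The last item can be automated. A sketch of an adversarial smoke test you can point at any of your tools (`assert_rejects` and the payload list are illustrative helpers, not a standard library):

```python
# Classic attack payloads to feed every string-accepting tool before release
ATTACK_INPUTS = [
    "../../etc/passwd",              # path traversal
    "..\\..\\windows\\system32",     # Windows-style traversal
    "x'; DROP TABLE users;--",       # SQL injection payload
    "Ignore previous instructions",  # prompt injection probe (for input validators)
]

def assert_rejects(tool, payloads=ATTACK_INPUTS):
    """Fail loudly if `tool` accepts any adversarial payload instead of raising."""
    for p in payloads:
        try:
            tool(p)
        except ValueError:
            continue  # rejected with a clean error: the desired behavior
        raise AssertionError(f"Tool accepted adversarial input: {p!r}")
```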
12. Publishing to npm, PyPI, and the Official Registry
Publishing to npm
For TypeScript servers, publish as an npm package that users install globally:
npm login
npm publish
Users install and use it via:
npx your-mcp-server
Add a bin entry to package.json so the package is executable. Include "type": "module" for ESM support. The files field should include only the built output (not source maps, which was the exact cause of the Claude Code source leak).
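A minimal sketch of the relevant package.json fields (the package name and entry path are placeholders):

```json
{
  "name": "your-mcp-server",
  "version": "1.0.0",
  "type": "module",
  "bin": { "your-mcp-server": "./dist/index.js" },
  "files": ["dist/**/*.js"]
}
```

With this in place, `npx your-mcp-server` resolves to the compiled entry point, and the `files` field keeps source maps and test files out of the published tarball.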
Publishing to PyPI
For Python servers:
pip install build twine
python -m build
twine upload dist/*
Users install via:
pip install your-mcp-server
Use pyproject.toml with [project.scripts] for CLI entry points. Include classifiers like Topic :: Scientific/Engineering :: Artificial Intelligence for discoverability.
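A minimal sketch of the corresponding pyproject.toml fragment (the package and module names are placeholders):

```toml
[project]
name = "your-mcp-server"
version = "1.0.0"
classifiers = ["Topic :: Scientific/Engineering :: Artificial Intelligence"]

[project.scripts]
your-mcp-server = "your_mcp_server.server:main"
```

After `pip install`, the `your-mcp-server` command invokes the `main` function of your server module.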
Registering on the Official MCP Registry
The Official MCP Registry is the canonical source that MCP clients query programmatically. Publishing steps:
# Install the registry CLI
npm install -g mcp-publisher
# Initialize server.json metadata
mcp-publisher init
# Authenticate (gives access to io.github.<username>/* namespace)
mcp-publisher login github
# Publish
mcp-publisher publish
The registry uses reverse-DNS namespaces for identity verification. If you authenticate via GitHub, you own io.github.<yourusername>/*. If you authenticate via DNS TXT record or HTTP .well-known file, you own com.yourdomain/*. This prevents impersonation: nobody else can publish a server under your namespace - MCP Registry.
The registry requires a server.json file with metadata about your server. Here is an example:
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.yourname/weather-server",
  "version": "1.0.0",
  "description": "Get weather forecasts and alerts for any location",
  "transport": ["stdio"],
  "tools": [
    {
      "name": "get_weather",
      "description": "Get current weather for a city"
    },
    {
      "name": "get_alerts",
      "description": "Get weather alerts for a US state"
    }
  ],
  "repository": "https://github.com/yourname/weather-server",
  "package": {
    "registry": "npm",
    "name": "weather-mcp-server"
  }
}
The mcp-publisher init command auto-generates this from your server's capabilities. The registry fetches the npm or PyPI package metadata and verifies that mcp-name: weather-server appears in the package README, linking the registry entry to the published package - Glama.
The npm/PyPI Publishing Checklist
Before publishing to package registries, verify these items:
- `package.json`/`pyproject.toml` has the correct name, version, description, and keywords (include "mcp", "ai-agent", "tool")
- A `bin` entry (npm) or `[project.scripts]` (Python) makes your server executable via `npx your-server` or `your-server` after pip install
- README starts with a one-line description, then installation, then a minimal usage example. Not a logo. Not badges. The first thing a developer sees should answer "what does this do and how do I use it"
- `.npmignore` or the `files` field excludes source maps, test files, and development artifacts. The Claude Code source leak happened because a `.map` file was accidentally included in the npm package
- No hardcoded secrets in the published package. Run `npm pack --dry-run` to see exactly what gets published, and verify no `.env` files, credentials, or private configuration are included
- License is specified (MIT is the standard for MCP servers, matching the protocol's own license)
- Security scan passes: `npm audit` for TypeScript, `safety check` for Python
For Python specifically, test your package in a clean virtual environment before publishing:
python -m venv test_env
source test_env/bin/activate
pip install dist/your_package-1.0.0.tar.gz
your-server --help # verify the entry point works
This catches missing dependencies, import errors, and packaging issues before they reach users.
13. Listing for Discovery: Where to Submit
After publishing to npm/PyPI and the Official Registry, submit to these directories for maximum discovery. All are free. As we covered in our comprehensive guide to where to list your API or MCP server, the sequencing matters: infrastructure first (npm, GitHub, Registry), then MCP registries, then broader directories.
Must-list (week 1):
- Awesome MCP Servers (github.com/punkpeye/awesome-mcp-servers, 85.2K stars): Submit a PR with your server categorized correctly.
- PulseMCP (pulsemcp.com, 12,970+ servers): Submit for hand-review. Highest editorial quality.
- Glama (glama.ai, 21,845+ servers): Auto-indexed if your repo is public. Verify your listing.
- Smithery (smithery.ai, 7,000+ servers): Publish via CLI for one-command installation.
- mcp.so (20,289+ servers): Submit for usage-based ranking.
Should-list (week 2-4):
- MCP Market (mcpmarket.com): Daily leaderboard by GitHub stars.
- LobeHub MCP Marketplace: One-click install for LobeHub users.
- Composio (mcp.composio.dev): If your server benefits from managed OAuth.
Optional (enterprise):
- Claude Desktop Extensions: Requires Anthropic editorial review but gives direct distribution to all Claude users.
- Azure API Center MCP: If your server targets Azure enterprise users.
- Kong MCP Registry: If your server targets Kong Konnect users.
Listings updated within 30 days rank 2-3x higher than stale listings. Set a monthly calendar reminder to update your listings with new features, version bumps, and documentation improvements.
The llms.txt File: Machine-Readable Discovery
Beyond registries and directories, implement the llms.txt standard on your server's website or documentation domain. Place a file at yourdomain.com/llms.txt that describes your MCP server for LLM consumption. Current adoption is at 10.13% across 300K surveyed domains, with Anthropic, Stripe, Zapier, and Cloudflare already implementing it - llmstxt.org.
The format is simple Markdown:
# Weather MCP Server
> An MCP server that provides real-time weather data for any location. Returns current conditions, forecasts, and severe weather alerts.
## Documentation
- [Getting Started](https://weather-mcp.example.com/docs/quickstart)
- [API Reference](https://weather-mcp.example.com/docs/api)
- [Claude Desktop Configuration](https://weather-mcp.example.com/docs/claude)
## Tools
- `get_weather`: Get current weather for a city
- `get_forecast`: Get 5-day forecast
- `get_alerts`: Get severe weather alerts for a region
Developer tools like Cursor, Continue, and Aider already read llms.txt files. When a developer asks their AI coding assistant about your product, the assistant can read your llms.txt to understand your capabilities without visiting your website. The implementation takes 15 minutes and the upside is future-proofed machine discovery.
MCP itself provides an llms.txt file at modelcontextprotocol.io/llms-full.txt that demonstrates the format for a complex specification - MCP llms.txt.
14. Advanced Patterns: Dynamic Tools, Streaming, Multi-Server
Dynamic Tool Registration
MCP supports adding and removing tools at runtime. When your server's capabilities change (e.g., a user grants additional permissions, or a new API endpoint becomes available), send a notifications/tools/list_changed notification. Clients will re-fetch tools/list to discover the updated capabilities.
This enables powerful patterns: a database server that exposes tools based on which tables the user has access to, or an API server that discovers available endpoints at startup and creates tools dynamically - Speakeasy.
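The notification itself is minimal. A sketch of its wire shape:

```python
# The change notification an MCP server emits when its tool set changes.
# It is a JSON-RPC notification: no `id`, no response expected. Clients
# react by re-requesting tools/list.
list_changed = {
    "jsonrpc": "2.0",
    "method": "notifications/tools/list_changed",
}
```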
Multi-Server Composition
Agent frameworks like LangChain and LlamaIndex support connecting to multiple MCP servers simultaneously. LangChain's MultiServerMCPClient creates a single agent with tools from multiple servers:
from langchain_mcp_adapters.client import MultiServerMCPClient

async with MultiServerMCPClient({
    "github": {"command": "npx", "args": ["@modelcontextprotocol/server-github"]},
    "database": {"command": "python", "args": ["db_server.py"]},
}) as client:
    tools = client.get_tools()
    # Agent now has tools from both servers
This is the pattern for building agents that combine capabilities from multiple sources: search + enrichment + email, for example. An agent that needs to research a company, find the right contact, enrich their profile, and draft a personalized email would connect to four MCP servers simultaneously (web search, company enrichment, contact finder, email drafter) and orchestrate them through a single agent loop - LangChain MCP.
LlamaIndex provides a similar integration. The llama-index-tools-mcp package converts MCP tools into LlamaIndex FunctionTool objects that work with any LlamaIndex agent:
from llama_index.tools.mcp import get_tools_from_mcp_url
tools = await get_tools_from_mcp_url("http://localhost:8080/mcp")
# Each MCP tool is now a LlamaIndex FunctionTool
The key decision is whether to use stateless or stateful connections. LangChain's MultiServerMCPClient is stateless by default (creates a fresh session for each invocation), which is simpler but means you cannot maintain state between tool calls. For servers that need persistent state (e.g., a database server that keeps a connection pool open), use the session parameter to maintain a ClientSession across invocations - LlamaIndex MCP.
Structured Tool Output (New in June 2025)
The June 2025 spec update added structuredContent to tool responses. Instead of returning only text (which the model must parse), you can return typed JSON alongside the text. This is invaluable for tools that return data the agent needs to process programmatically:
server.registerTool("get-contacts", {
  description: "Get contacts from the CRM",
  inputSchema: { query: z.string() },
  outputSchema: z.array(z.object({
    name: z.string(),
    email: z.string(),
    company: z.string(),
  })),
}, async ({ query }) => {
  const contacts = await searchCRM(query);
  return {
    content: [{ type: "text", text: `Found ${contacts.length} contacts` }],
    structuredContent: contacts,
  };
});
The content field is what the model sees in its context. The structuredContent field is what downstream code can process without parsing. This dual output is the correct pattern for tools that need to be both human-readable (in the model's response) and machine-processable (in the agent's workflow).
Remote Hosting on Cloudflare Workers
For serverless remote MCP deployment, Cloudflare Workers provides the simplest path. Your server scales to zero when idle (no cost), runs at the edge (low latency), and supports OAuth via their workers-oauth-provider library:
npm create cloudflare@latest -- my-mcp-server \
--template=cloudflare/ai/demos/remote-mcp-github-oauth
cd my-mcp-server
npx wrangler deploy
This deploys a production remote MCP server with OAuth to *.workers.dev in minutes - Cloudflare.
Docker deployment is the alternative for non-Cloudflare environments. Docker provides sandboxed isolation, reproducible builds, and compatibility with any cloud provider. The Docker MCP Toolkit (available from Docker Desktop) enables one-click server setup from the Docker Desktop UI - Docker.
A minimal Dockerfile for a Python MCP server:
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["python", "server.py"]
Where server.py uses Streamable HTTP transport:
mcp.run(transport="streamable-http", host="0.0.0.0", port=8080)
Build and push:
docker build -t your-mcp-server:latest .
docker push yourregistry/your-mcp-server:latest
For production deployments, use a reverse proxy (nginx, Caddy, or your cloud provider's load balancer) to terminate TLS. MCP over Streamable HTTP should always use HTTPS in production. The OAuth flow requires HTTPS for redirect URIs, and unencrypted connections expose Bearer tokens to network interception.
The Docker approach is preferred for enterprise deployments because it provides a standardized packaging format that ops teams are familiar with, supports health checks and auto-restart, and integrates with container orchestration platforms (Kubernetes, ECS, Cloud Run) for scaling.
15. Monetization: Getting Paid for Your MCP Server
The MCP monetization ecosystem is nascent but growing. Several platforms now support paid MCP servers.
MCP Marketplace (mcp-marketplace.io) lets you set a price (one-time or subscription), handles Stripe checkout, license key generation, and payouts. The creator keeps the majority of revenue - MCP Marketplace.
MCPize (mcpize.com) offers an 85/15 revenue share (highest in the ecosystem). You deploy your server to MCPize's cloud, and they handle hosting, billing, and user management - MCPize.
xpay (xpay.sh) enables pay-per-tool-call using the x402 protocol and USDC stablecoins. No code changes needed: xpay wraps your MCP server with a payment layer - xpay.
The pricing models that work in practice: freemium (free tier for discovery, paid tier for premium features or higher rate limits) is the most successful for adoption. One-time purchases ($5-25) work for utility servers. Subscriptions ($5-25/month) work for servers that provide ongoing value (data access, API credits) - PulseMCP.
The Monetization Decision Framework
Before deciding to monetize, consider whether monetization is even the right goal for your server. Open-source MCP servers have a structural advantage in the ecosystem: they are more trusted (code is auditable), more discoverable (Awesome MCP Servers focuses on open source), and more likely to get community contributions (bug fixes, translations, feature additions). The GitHub stars that open-source servers accumulate are a durable marketing asset that compounds over years.
If your server wraps a paid API (your own or a third party's), the monetization happens at the API level, not the MCP server level. The server itself should be open source and free. You make money from the API calls, not from the server software. This is the model that most successful API companies (Stripe, Twilio, Anthropic) follow: the SDKs and integration tools are free, the API calls are paid.
If your server provides standalone value without an underlying API (e.g., a local tool that processes files, a data transformer, a workflow orchestrator), then direct monetization makes sense. The freemium model works best: a free tier with rate limits or feature restrictions, and a paid tier that removes those restrictions. The free tier should be useful enough to demonstrate value. The paid tier should be priced at what an individual developer would pay without corporate approval (under $25/month).
The MCP ecosystem is still too young for aggressive monetization. The primary goal for most servers in 2026 should be adoption and distribution, not revenue. Build the user base first. Monetize later, when you have hundreds of active users who depend on your server and would pay to keep it running.
Rate Limiting for MCP Servers
Whether you monetize or not, rate limiting is essential for remote MCP servers. An AI agent running in an infinite loop (a common failure mode) can generate thousands of tool calls per minute, exhausting your API quotas, your compute budget, or your database connections.
Implement rate limiting at two levels. First, at the MCP server level: limit the number of tools/call requests per client per minute. A reasonable default is 60 calls per minute (1 per second), which is fast enough for any interactive workflow but prevents runaway loops.
from functools import wraps
import time

call_timestamps = {}

def rate_limit(max_calls_per_minute=60):
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            now = time.time()
            timestamps = call_timestamps.get(func.__name__, [])
            timestamps = [t for t in timestamps if now - t < 60]
            if len(timestamps) >= max_calls_per_minute:
                raise ValueError(f"Rate limit exceeded. Max {max_calls_per_minute} calls per minute.")
            timestamps.append(now)
            call_timestamps[func.__name__] = timestamps
            return await func(*args, **kwargs)
        return wrapper
    return decorator
Second, at the backend level: if your server calls external APIs, rate limit at the API client level to stay within the external API's rate limits. This prevents one aggressive MCP client from exhausting your API quota and affecting other users - MintMCP.
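A minimal token bucket works for the backend level too. A sketch; tune `rate_per_sec` and `burst` to the external API's documented limits:

```python
import time

class TokenBucket:
    """Minimal token bucket (sketch) for throttling calls to an external API."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec       # tokens added per second
        self.capacity = burst          # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Call `try_acquire()` before each outbound API request; when it returns False, sleep or return a Level 2 rate-limit error to the model instead of hitting the external API.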
16. Performance: Which Language to Choose
A comprehensive benchmark by TM Dev Lab tested MCP server implementations across 3.9 million requests. The results matter for choosing your implementation language - TM Dev Lab:
Java and Go achieve virtually identical throughput (1,624 RPS) with sub-1ms latency. Node.js has roughly 12x their latency. Python (FastMCP) runs at roughly 31x the Java/Go latency but still handles 292 requests per second, which is sufficient for most use cases. Go has the best memory efficiency at 92.6 requests per MB (18 MB total).
The practical advice: Use Python (FastMCP) for your first server and for most servers. The development speed advantage (decorators, type hints, automatic schema generation) outweighs the performance difference for 95% of use cases. Switch to Go or Java only when you need >500 RPS or sub-millisecond latency, which matters for high-traffic remote servers serving thousands of concurrent agents.
For the 70% of MCP servers that are local (stdio transport, single client), performance is irrelevant. The bottleneck is the LLM inference call (seconds), not the tool execution (milliseconds). Even Python's 26ms latency is invisible when the model takes 2-5 seconds to generate a response.
Memory Efficiency: The Underrated Metric
Go uses only 18 MB of memory for a running MCP server (92.6 requests per MB). Python uses 98 MB. Node.js uses 110 MB. Java uses 226 MB. For local servers running as subprocesses, this matters because every MCP server you add to Claude Desktop is a separate process consuming memory. If you have 10 local MCP servers running simultaneously, Python consumes ~1 GB while Go consumes ~180 MB.
For remote servers handling many concurrent connections, memory efficiency determines how many clients you can serve per instance. A Go server on a 512 MB VPS can handle thousands of concurrent connections. A Java server on the same VPS might handle only a few hundred before running out of memory.
The Language Decision Matrix
| Factor | Python (FastMCP) | TypeScript | Go | Java |
|---|---|---|---|---|
| Development speed | Fastest (decorators, auto-schema) | Fast (Zod) | Moderate | Moderate |
| Runtime performance | Slowest (26ms) | Medium (10ms) | Fastest (0.85ms) | Fastest (0.83ms) |
| Memory usage | 98 MB | 110 MB | 18 MB | 226 MB |
| Ecosystem maturity | Highest (FastMCP powers 70%) | High (official SDK) | Growing | Growing |
| AI/ML library access | Best (all ML libs are Python) | Limited | Limited | Limited |
| Deployment simplicity | Moderate (needs Python runtime) | Easy (npx) | Easiest (single binary) | Complex (JVM) |
For most developers: start with Python. Switch to Go when you need performance. Use TypeScript if your team is JS-first. Use Java only in Java-heavy enterprise environments.
The Python ecosystem advantage is decisive for MCP servers that interact with AI/ML workflows: every major ML framework (PyTorch, transformers, LangChain, LlamaIndex) is Python-first. A Python MCP server can directly call these libraries without cross-language bridges. If your MCP server needs to run inference, process embeddings, or interact with vector databases, Python is the only practical choice because the tooling simply does not exist in other languages at the same maturity level.
The Go advantage is equally decisive for infrastructure servers: if your MCP server is a thin wrapper around an API or database, Go's single-binary deployment (no runtime dependencies), low memory footprint (18 MB), and sub-millisecond latency make it the ideal choice for remote servers that need to handle thousands of concurrent connections on minimal infrastructure.
What to Build Next
You now have everything you need to build, test, publish, secure, and distribute an MCP server. The protocol is mature, the ecosystem is massive (97 million monthly downloads, 544 clients, 21,000+ servers), and the governance is stable (Linux Foundation, eight platinum members including every major AI company).
The most successful MCP servers solve a specific, narrow problem extremely well. The Playwright MCP server (82,000 monthly searches) does one thing: browser automation. The Stripe MCP server does one thing: payment management. The best MCP servers are not platforms. They are sharp tools.
Start small: build a server that wraps an API you use daily. Publish it to npm/PyPI. Register on the Official MCP Registry. Submit to the Awesome list. Get your first 10 users. Then iterate based on their feedback.
Ideas for Your First MCP Server
If you are looking for inspiration, here are categories of servers that are underserved in the ecosystem and would be immediately useful:
Internal tools: Wrap your company's internal APIs (Jira, Confluence, internal databases) as MCP servers. These do not need to be published publicly but make your team's AI agents dramatically more useful. A Jira MCP server that lets Claude create, search, and update issues is a weekend project that saves hours every week.
Niche data sources: The most popular MCP servers wrap well-known services (GitHub, Slack, Stripe). But most businesses use niche tools that have no MCP server: accounting software, industry-specific databases, government data APIs, specialized SaaS platforms. Building the MCP server for a niche tool with no existing integration gives you an instant monopoly on that capability in the MCP ecosystem.
Workflow orchestrators: Instead of wrapping a single API, build a server that orchestrates a multi-step workflow. A "sales prospecting" server that chains company search, contact enrichment, email verification, and email drafting into a single tool call is more valuable than four individual servers because it reduces the number of tool calls the agent needs to make (fewer calls = faster execution, lower cost, fewer failure points).
Data transformers: Build servers that convert between formats (CSV to JSON, PDF to markdown, image to text) or validate data against schemas. These are utility servers that get used by many different agents across many different workflows. For agent builders using platforms like Suprsonic, these conversion capabilities are already available through the unified API, but for developers who want to self-host, a specialized MCP server fills the gap.
For the broader context of how MCP servers fit into AI agent architectures, see our guide to the top 10 agent capabilities. For how to monetize your developer product beyond the MCP ecosystem, see our guide to where to list your API or MCP server.
This guide reflects the MCP ecosystem as of April 2026. The protocol evolves rapidly. Verify current SDK versions, spec features, and registry requirements on modelcontextprotocol.io before building.