MCP Server Performance: What to Expect

Before you deploy an MCP server, you want to know what it costs. How fast does it start? What overhead does the protocol add? Does paso make it slower?

We measured it. Here’s what we found.

Cold start: under 1 second

paso parses your usepaso.yaml, registers MCP tools, and starts the server. For a declaration with 6 capabilities:

time usepaso serve
usepaso serving "Sentry" (6 capabilities). Agents welcome.

real    0m0.87s

Sub-second cold start. The bottleneck is Node.js startup, not paso. The YAML parsing and tool registration add roughly 50ms on top of the Node.js baseline.

For comparison, usepaso init (which generates a template file) completes in about 1 second, and usepaso validate (which parses and validates the full declaration) completes in under 0.9 seconds. The CLI is fast across the board.

What paso does at startup

When you run usepaso serve, four things happen:

  1. Parse YAML. Reads usepaso.yaml and validates it against the spec (~30ms)
  2. Register tools. Turns each capability into an MCP tool with typed inputs (~10ms)
  3. Set up auth. Reads USEPASO_AUTH_TOKEN and configures header forwarding (~1ms)
  4. Start transport. Opens the stdio or HTTP transport for MCP clients (~10ms)

Total paso overhead: roughly 50ms. The rest is Node.js module loading.
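
To see where your own server's startup time goes, wrap each phase in a timer. A minimal sketch in Python; the phase bodies below are stubs standing in for paso's real work, not its actual code:

```python
import time

def timed(label, fn):
    """Run one startup phase and print its wall-clock cost in milliseconds."""
    t0 = time.perf_counter()
    result = fn()
    print(f"{label}: {(time.perf_counter() - t0) * 1000:.1f}ms")
    return result

# Stub phases standing in for the four steps above (illustrative only)
declaration = timed("parse yaml", lambda: {"name": "Sentry", "capabilities": []})
tools = timed("register tools", lambda: list(declaration["capabilities"]))
token = timed("set up auth", lambda: "token-from-env")
transport = timed("start transport", lambda: "stdio")
```

Swap the lambdas for your real phase functions and the breakdown falls out for free.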

Request overhead

When an agent calls a tool, paso:

  1. Validates the input against the declared schema
  2. Constructs the HTTP request (URL, headers, body)
  3. Forwards the auth token
  4. Sends the request to your API
  5. Returns the response to the agent

Steps 1-3 and 5 add single-digit milliseconds. The dominant latency lives in step 4: it is your API's response time, not paso's processing.
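
That pipeline fits in a few lines. The sketch below mirrors the Sentry example used in this doc, but the capability dict shape (method/path/inputs) and the Bearer header format are assumptions, not paso's actual schema or code:

```python
import os
from urllib.parse import urlencode

# Hypothetical capability declaration, mirroring this doc's examples
CAPABILITY = {
    "method": "GET",
    "path": "/api/0/projects/{organization_slug}/{project_slug}/issues/",
    "inputs": {
        "organization_slug": {"in": "path", "required": True},
        "project_slug": {"in": "path", "required": True},
        "limit": {"in": "query", "default": 20},
    },
}

def build_request(base_url, cap, args):
    # Step 1: validate inputs against the declared schema
    for name, spec in cap["inputs"].items():
        if spec.get("required") and name not in args:
            raise ValueError(f"missing required input: {name}")
        args.setdefault(name, spec.get("default"))
    # Step 2: construct the URL (path params substituted, query params appended)
    path = cap["path"].format(
        **{k: v for k, v in args.items() if cap["inputs"][k]["in"] == "path"}
    )
    query = {k: v for k, v in args.items()
             if cap["inputs"][k]["in"] == "query" and v is not None}
    url = base_url + path + ("?" + urlencode(query) if query else "")
    # Step 3: forward the auth token as a header (Bearer format assumed)
    headers = {"Authorization": f"Bearer {os.environ.get('USEPASO_AUTH_TOKEN', '')}"}
    return cap["method"], url, headers

method, url, headers = build_request(
    "https://sentry.io", CAPABILITY,
    {"organization_slug": "my-org", "project_slug": "my-project"},
)
print(method, url)
# GET https://sentry.io/api/0/projects/my-org/my-project/issues/?limit=20
```

Every operation here is dictionary lookups and string formatting, which is why the overhead stays in the single-digit-millisecond range.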

paso vs. hand-written: same performance, less code

A hand-written MCP server using the TypeScript SDK does the same work: parse inputs, build a request, forward auth, return the response. The per-request overhead is equivalent.

The difference is maintenance cost, not runtime cost.

                       Hand-written MCP              paso
Cold start             ~0.8s                         ~0.9s
Per-request overhead   ~2-5ms                        ~2-5ms
Code to maintain       200+ lines of TypeScript      30 lines of YAML
Adding a capability    New function + registration   8 lines of YAML

paso adds roughly 50ms to cold start for YAML parsing. Per-request performance is identical because paso generates the same request pipeline a hand-written server would.

What affects MCP server performance

The things that actually matter for MCP server latency:

1. Your API’s response time. If your API takes 500ms to respond, the MCP server takes 500ms plus a few milliseconds of overhead. paso can’t make your API faster.

2. Transport type. MCP supports stdio (local) and HTTP (remote). Stdio has zero network overhead since the client and server communicate through standard input/output. HTTP adds a network round trip.

3. Payload size. Large responses take longer to serialize and transmit. If your API returns 10MB of JSON, the agent receives 10MB of JSON. Consider adding limit parameters to your capabilities.

4. Auth token validation. paso forwards your token to the API on every request. If your API validates tokens against a remote auth server, that adds latency. This is your API’s behavior, not paso’s.
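
The payload point is easy to quantify. A toy comparison in Python; the record shape and counts are made up, the point is the size ratio:

```python
import json

# Made-up issue records; only the size ratio matters, not the schema
issues = [{"id": i, "title": f"Issue {i}", "status": "open"} for i in range(5000)]

unbounded = json.dumps(issues).encode("utf-8")       # no limit parameter
page = json.dumps(issues[:20]).encode("utf-8")       # what limit=20 returns

print(f"unbounded: {len(unbounded):,} bytes; limit=20: {len(page):,} bytes")
```

Every byte of that difference has to be serialized, transmitted, and then read by the agent, which is why a default limit pays off.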

Optimizing for production

Three things to consider:

Add pagination. Don’t let agents fetch unbounded lists. Add limit and offset inputs with sensible defaults.

inputs:
  limit:
    type: integer
    description: Number of results (1-100)
    default: 20
    in: query

Use constraints. Rate limits prevent agents from overwhelming your API.

constraints:
  - max_per_hour: 500
    description: List operations are rate-limited
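
How paso enforces constraints internally isn't covered here; conceptually, a max_per_hour limit is a sliding-window counter. A sketch of the idea (illustrative, not paso's code):

```python
import time
from collections import deque

class HourlyLimit:
    """Sliding-window counter sketching a max_per_hour-style constraint."""

    def __init__(self, max_per_hour):
        self.max = max_per_hour
        self.calls = deque()  # timestamps of allowed calls in the last hour

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        while self.calls and now - self.calls[0] >= 3600:
            self.calls.popleft()  # drop calls older than one hour
        if len(self.calls) >= self.max:
            return False
        self.calls.append(now)
        return True

limit = HourlyLimit(max_per_hour=500)
print(limit.allow())  # True until the 501st call inside an hour
```

The window slides rather than resets, so a burst at the top of the hour can't double the effective rate.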

Keep declarations focused. A declaration with 6 well-chosen capabilities serves agents better than one with 50. Fewer tools means faster discovery and less confusion for the agent.

Measuring your own server

Validate that everything builds correctly before measuring:

usepaso validate          # Structure check
usepaso test --all --dry-run  # Request construction check
usepaso doctor            # Environment check

Then benchmark:

usepaso serve &   # start the server in the background
# Send a test request through your MCP client
usepaso test list_issues \
  -p organization_slug=my-org \
  -p project_slug=my-project

The test command shows the full round trip: paso processing + network + API response time.

Questions

Does paso cache responses? No. paso is a passthrough. Every tool call results in a fresh request to your API. If you need caching, implement it in your API layer.

Is there a connection pool? paso uses Node.js’s default HTTP agent, which reuses connections via keep-alive. For high-throughput scenarios, this is usually sufficient.
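
You can observe connection reuse with nothing but the standard library. This sketch runs a local HTTP/1.1 server and sends two requests over a single connection, the same effect Node's default keep-alive agent gives paso:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps the connection alive

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One TCP connection, reused for both requests
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
bodies = []
for _ in range(2):
    conn.request("GET", "/")
    bodies.append(conn.getresponse().read())
conn.close()
server.shutdown()
print(bodies)  # [b'ok', b'ok']
```

Skipping the TCP (and TLS, for HTTPS) handshake on repeat requests is what makes the default agent sufficient for most throughput levels.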

What about Python? paso works the same in Python. pip install usepaso, same YAML file, same performance characteristics. The Python runtime has a slightly higher baseline startup time than Node.js but per-request overhead is equivalent.

Related:

Read the CLI reference.