diff --git a/README.md b/README.md
index 166fc52fa..aeac86be8 100644
--- a/README.md
+++ b/README.md
@@ -1235,7 +1235,7 @@ Note that `uv run mcp run` or `uv run mcp dev` only supports server using FastMC
### Streamable HTTP Transport
-> **Note**: Streamable HTTP transport is the recommended transport for production deployments. Use `stateless_http=True` and `json_response=True` for optimal scalability.
+> **Note**: Streamable HTTP transport is the recommended transport for production deployments. For serverless and load-balanced environments, consider using `stateless_http=True` and `json_response=True`. See [Understanding Stateless Mode](#understanding-stateless-mode) for guidance on choosing between stateful and stateless operation.
```python
@@ -1347,6 +1347,151 @@ The streamable HTTP transport supports:
- JSON or SSE response formats
- Better scalability for multi-node deployments
+#### Understanding Stateless Mode
+
+The Streamable HTTP transport can operate in two modes: **stateful** (default) and **stateless**. Understanding the difference is important for choosing the right deployment model.
+
+##### What "Stateless" Means
+
+In **stateless mode** (`stateless_http=True`), each HTTP request creates a completely independent MCP session that exists only for the duration of that single request:
+
+- **No session tracking**: No `Mcp-Session-Id` header is used or required
+- **Per-request lifecycle**: Each request initializes a fresh server instance, processes the request, and terminates
+- **No state persistence**: No information is retained between requests
+- **No event store**: Resumability features are disabled
+
+This is fundamentally different from **stateful mode** (default), where:
+
+- A session persists across multiple requests
+- The `Mcp-Session-Id` header links requests to an existing session
+- Server state (e.g., subscriptions, context) is maintained between calls
+- Event stores can provide resumability if the connection drops
+
+##### MCP Features Impacted by Stateless Mode
+
+When running in stateless mode, certain MCP features are unavailable or behave differently:
+
+| Feature | Stateful Mode | Stateless Mode |
+|---------|---------------|----------------|
+| **Server Notifications** | ✅ Supported | ❌ Not available<sup>1</sup> |
+| **Resource Subscriptions** | ✅ Supported | ❌ Not available<sup>1</sup> |
+| **Multi-turn Context** | ✅ Maintained | ❌ Lost between requests<sup>2</sup> |
+| **Long-running Tools** | ✅ Can use notifications for progress | ⚠️ Must complete within request timeout |
+| **Event Resumability** | ✅ With event store | ❌ Not applicable |
+| **Tools/Resources/Prompts** | ✅ Fully supported | ✅ Fully supported |
+| **Concurrent Requests** | ⚠️ One per session | ✅ Unlimited<sup>3</sup> |
+
+<sup>1</sup> Server-initiated notifications require a persistent connection to deliver updates
+<sup>2</sup> Each request starts fresh; client must provide all necessary context
+<sup>3</sup> Each request is independent, enabling horizontal scaling
+
+##### When to Use Stateless Mode
+
+**Stateless mode is ideal for:**
+
+- **Serverless Deployments**: AWS Lambda, Cloud Functions, or similar FaaS platforms where instances are ephemeral
+- **Load-Balanced Multi-Node**: Deploying across multiple servers without sticky sessions
+- **Stateless APIs**: Services where each request is self-contained (e.g., data lookups, calculations)
+- **High Concurrency**: Scenarios requiring many simultaneous independent operations
+- **Simplified Operations**: Avoiding session management complexity
+
+**Use stateful mode when:**
+
+- Server needs to push notifications to clients (e.g., progress updates, real-time events)
+- Resources require subscriptions with change notifications
+- Tools maintain conversation state across multiple turns
+- Long-running operations need to report progress asynchronously
+- Connection resumability is required
+
+##### Example: Stateless Configuration
+
+```python
+from mcp.server.fastmcp import FastMCP
+
+# Stateless server - each request is independent
+mcp = FastMCP(
+    "StatelessAPI",
+    stateless_http=True,  # Enable stateless mode
+    json_response=True,  # Recommended for stateless
+)
+
+@mcp.tool()
+def calculate(a: int, b: int, operation: str) -> int:
+    """Stateless calculation tool."""
+    operations = {"add": a + b, "multiply": a * b}
+    return operations[operation]
+
+# Each request will:
+# 1. Initialize a new server instance
+# 2. Process the calculate tool call
+# 3. Return the result
+# 4. Terminate the instance
+```
+
+##### Deployment Patterns
+
+###### Pattern 1: Pure Stateless (Recommended)
+
+```python
+# Best for: Serverless, auto-scaling environments
+mcp = FastMCP("MyServer", stateless_http=True, json_response=True)
+
+# Clients can connect to any instance
+# Load balancer doesn't need session affinity
+```
+
+###### Pattern 2: Stateful with Sticky Sessions
+
+```python
+# Best for: Deployments that need notifications behind a load balancer
+mcp = FastMCP("MyServer", stateless_http=False) # Default
+
+# The load balancer must route every request carrying the same Mcp-Session-Id header to the same instance
+# Some load balancers (e.g., NGINX) can hash on a header value; check that yours supports header-based affinity
+```
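+
+A minimal sketch of the affinity rule that Pattern 2 depends on: every request with the same `Mcp-Session-Id` must reach the same backend. The `pick_backend` helper and the backend addresses are hypothetical and only illustrate the idea; in a real deployment this logic lives in the load balancer's configuration, not in application code.
+
+```python
+import hashlib
+
+
+def pick_backend(session_id: str, backends: list[str]) -> str:
+    """Map a session ID to one backend deterministically (hypothetical helper)."""
+    # Use a stable hash; Python's built-in hash() is randomized per process.
+    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
+    index = int.from_bytes(digest[:8], "big") % len(backends)
+    return backends[index]
+
+
+# Placeholder addresses, for illustration only.
+backends = ["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"]
+
+# The same session ID always resolves to the same backend.
+first = pick_backend("session-abc", backends)
+second = pick_backend("session-abc", backends)
+```
+
+Plain modulo hashing remaps most sessions whenever the backend list changes, so load balancers typically offer consistent hashing for this purpose.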
+
+###### Pattern 3: Hybrid Approach
+
+```python
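+from mcp.server.fastmcp import FastMCP
+from starlette.applications import Starlette
+from starlette.routing import Mount
+
+# Deploy both modes side by side
+stateless_mcp = FastMCP("StatelessAPI", stateless_http=True)
+stateful_mcp = FastMCP("StatefulAPI", stateless_http=False)
+
+# Note: each mounted app's session manager must also be running,
+# e.g. started from a Starlette lifespan handler
+app = Starlette(routes=[
+    Mount("/api/stateless", app=stateless_mcp.streamable_http_app()),
+    Mount("/api/stateful", app=stateful_mcp.streamable_http_app()),
+])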
+```
+
+##### Technical Details
+
+**Session Lifecycle in Stateless Mode:**
+
+1. Client sends HTTP POST request to `/mcp` endpoint
+2. Server creates ephemeral `StreamableHTTPServerTransport` (no session ID)
+3. Server initializes fresh `Server` instance with `stateless=True` flag
+4. Request is processed using the ephemeral transport
+5. Response is sent back to client
+6. Transport and server instance are immediately terminated
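+
+The steps above can be pictured with a toy model. This is not the SDK's implementation: `EphemeralTransport` and `handle_stateless_request` are hypothetical stand-ins that only mirror the create, process, terminate sequence.
+
+```python
+from dataclasses import dataclass, field
+from typing import Callable
+
+
+@dataclass
+class EphemeralTransport:
+    """Hypothetical stand-in for a per-request transport with no session ID."""
+
+    session_id: None = None
+    terminated: bool = False
+    log: list[str] = field(default_factory=list)
+
+    def terminate(self) -> None:
+        self.terminated = True
+        self.log.append("terminated")
+
+
+def handle_stateless_request(
+    payload: dict, handler: Callable[[dict], dict]
+) -> tuple[dict, EphemeralTransport]:
+    transport = EphemeralTransport()  # step 2: fresh transport, no session ID
+    transport.log.append("created")
+    try:
+        result = handler(payload)  # steps 3-4: process the request
+        transport.log.append("processed")
+        return result, transport  # step 5: send the response
+    finally:
+        transport.terminate()  # step 6: always tear down
+
+
+def add(payload: dict) -> dict:
+    return {"sum": payload["a"] + payload["b"]}
+
+
+# Two requests never share a transport, and each one is torn down afterwards.
+result_1, transport_1 = handle_stateless_request({"a": 1, "b": 2}, add)
+result_2, transport_2 = handle_stateless_request({"a": 3, "b": 4}, add)
+```
+
+Because nothing outlives the call, any instance behind a load balancer can serve any request.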
+
+**Performance Characteristics:**
+
+- **Initialization overhead**: Each request pays the cost of server initialization
+- **Memory efficiency**: No long-lived sessions consuming memory
+- **Scalability**: Excellent horizontal scaling with no state synchronization
+- **Latency**: Slightly higher per-request latency due to initialization
+
+**Stateless Mode Checklist:**
+
+When designing for stateless mode, ensure:
+
+- ✅ Tools are self-contained and don't rely on previous calls
+- ✅ All required context is passed in each request
+- ✅ Tools finish within the request timeout instead of reporting results later
+- ✅ No server notifications or subscriptions are needed
+- ✅ Client handles any necessary state management
+- ✅ Operations are idempotent where possible
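+
+To illustrate the first two checklist items, the sketch below contrasts a tool body that depends on an earlier call with one that receives all of its context as arguments. Both functions are hypothetical plain Python written for this example; either body could be registered with `@mcp.tool()`.
+
+```python
+from typing import Optional
+
+# Anti-pattern in stateless mode: relies on state set by an earlier call.
+_current_currency: Optional[str] = None
+
+
+def set_currency(code: str) -> None:
+    global _current_currency
+    _current_currency = code
+
+
+def format_price_stateful(amount: float) -> str:
+    # Fails when a fresh or different instance handles this request.
+    if _current_currency is None:
+        raise RuntimeError("currency was never set on this instance")
+    return f"{amount:.2f} {_current_currency}"
+
+
+# Stateless-friendly: every call carries the context it needs.
+def format_price(amount: float, currency: str) -> str:
+    return f"{amount:.2f} {currency}"
+```
+
+The second form is also idempotent: repeating the same call returns the same result regardless of which instance serves it.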
+
#### CORS Configuration for Browser-Based Clients
If you'd like your server to be accessible by browser-based MCP clients, you'll need to configure CORS headers. The `Mcp-Session-Id` header must be exposed for browser clients to access it: