Production Deployment Guide
Complete guide to deploying Seed MCP Server to production with Okta OIDC authentication, Docker Swarm, and a secured metrics endpoint.
Overview
This guide covers:
- Okta OIDC Setup
- Docker Swarm Deployment
- Securing the Metrics Endpoint
- Monitoring and Observability
- Troubleshooting
Production Architecture
✅ Production-Ready Features (as of 2026-01-06):
- Graceful shutdown with SIGTERM/SIGINT handling
- Liveness and readiness probes for Kubernetes
- Configuration validation at startup
- Redis connection resilience with circuit breaker
- Token revocation support (RFC 7009)
Prerequisites
- Docker Swarm cluster initialized
- Traefik reverse proxy deployed
- Domain with DNS configured
- Okta account with admin access
- Redis instance (included in stack)
Okta OIDC Setup
Step 1: Create Application in Okta
Log in to the Okta Admin Console
- Navigate to Applications → Applications
- Click Create App Integration
Select Integration Type
- Sign-in method: OIDC - OpenID Connect
- Application type: Web Application
- Click Next
Configure Application Settings
General Settings:
- App integration name: Seed MCP Server
- Logo: (optional)
Grant types:
- ✅ Authorization Code
- ✅ Refresh Token
- ✅ Implicit (hybrid) - if needed
Sign-in redirect URIs:
http://localhost:*/callback
https://seed.yourdomain.com/oauth/authorize/callback
Sign-out redirect URIs: (optional)
https://seed.yourdomain.com
Controlled access:
- Select who can access this application
- Recommended: Limit access to selected groups
Save Application
- Click Save
- You'll be redirected to the application page
Step 2: Note Application Credentials
On the application page, note these values:
# Client ID (under "Client Credentials")
OIDC_AUDIENCE=0oa1abc2def3ghi4jkl5
# Okta Domain (from URL or Dashboard)
OKTA_DOMAIN=dev-12345678.okta.com
# or custom domain: auth.yourdomain.com
# Construct these URLs:
OIDC_ISSUER=https://${OKTA_DOMAIN}/oauth2/default
OAUTH_TOKEN_URL=https://${OKTA_DOMAIN}/oauth2/default/v1/token
OAUTH_AUTHORIZATION_URL=https://${OKTA_DOMAIN}/oauth2/default/v1/authorize
Using Default Authorization Server
The /oauth2/default path refers to Okta's default authorization server. For custom authorization servers, replace default with your server ID.
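As a small sketch of how these values fit together, the shell function below derives the three URLs from an Okta domain and an optional authorization server ID. The function name okta_oidc_env and its arguments are illustrative only and are not part of Seed; the URL layout is the one shown above.

```bash
# Hypothetical helper (not part of Seed): print the OIDC settings for an Okta domain.
# Usage: okta_oidc_env <okta-domain> [authorization-server-id]
okta_oidc_env() {
  _okta_domain="$1"
  _okta_server="${2:-default}"   # "default" is Okta's default authorization server
  _okta_issuer="https://${_okta_domain}/oauth2/${_okta_server}"
  printf 'OIDC_ISSUER=%s\n' "$_okta_issuer"
  printf 'OAUTH_TOKEN_URL=%s/v1/token\n' "$_okta_issuer"
  printf 'OAUTH_AUTHORIZATION_URL=%s/v1/authorize\n' "$_okta_issuer"
}
```

Calling okta_oidc_env dev-12345678.okta.com prints the same three lines as the example above; passing a second argument swaps default for a custom server ID.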
Step 3: Configure Token Settings
Navigate to Security → API → Authorization Servers
Click on default (or your custom server)
Go to Settings tab
Configure:
- Issuer: Should be https://${OKTA_DOMAIN}/oauth2/default
- Audience: api://default (or custom)
Go to Access Policies tab
- Ensure there's a policy allowing your application
- Default policy: Default Policy Rule allows all clients
Go to Scopes tab
- Verify these scopes exist:
  - openid (required)
  - profile (recommended)
  - email (recommended)
  - offline_access (required for refresh tokens)
Step 4: Configure PKCE Settings
- In your application settings, go to General tab
- Scroll to General Settings → Edit
- Proof Key for Code Exchange (PKCE):
- Select: Require PKCE as additional verification
- This is required for public clients (Claude Desktop/Code)
Step 5: Assign Users/Groups
- Go to Assignments tab in your application
- Click Assign → Assign to People or Assign to Groups
- Select users/groups who should have access
- Click Assign → Done
Step 6: Test OIDC Configuration
You can test your Okta configuration using the .well-known endpoint:
curl https://${OKTA_DOMAIN}/oauth2/default/.well-known/openid-configuration
Expected response includes:
{
  "issuer": "https://dev-12345678.okta.com/oauth2/default",
  "authorization_endpoint": "https://dev-12345678.okta.com/oauth2/default/v1/authorize",
  "token_endpoint": "https://dev-12345678.okta.com/oauth2/default/v1/token",
  "jwks_uri": "https://dev-12345678.okta.com/oauth2/default/v1/keys",
  ...
}
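To turn the visual check into a pass/fail one, a minimal sketch is to pipe the discovery document into a small function that looks for the expected issuer. The name discovery_issuer_matches is hypothetical, and the plain-text match assumes the provider does not escape slashes in its JSON; a JSON-aware tool would be more robust.

```bash
# Hypothetical helper (not part of Seed): succeed if the discovery document
# on stdin advertises exactly the expected issuer.
discovery_issuer_matches() {
  _expected_issuer="$1"
  # Strip whitespace so pretty-printed and compact JSON compare the same way,
  # then look for the literal "issuer":"<expected>" pair.
  tr -d ' \t\r\n' | grep -qF "\"issuer\":\"${_expected_issuer}\""
}

# Example (needs network access, so it is left commented out):
# curl -fsS "${OIDC_ISSUER}/.well-known/openid-configuration" \
#   | discovery_issuer_matches "${OIDC_ISSUER}" && echo "issuer OK"
```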
Docker Swarm Deployment
Step 1: Prepare Docker Stack File
The repository includes docker-stack.production.yml. Review and customize:
services:
  seed:
    image: containers.home/mcp-servers/seed:latest
    networks:
      - traefik-public
      - seed-internal
    deploy:
      labels:
        # Update with your domain
        - traefik.http.routers.seed.rule=Host(`seed.yourdomain.com`)
        - traefik.http.routers.seed.entrypoints=websecure
        - traefik.http.routers.seed.tls.certresolver=letsencrypt
    environment:
      - NODE_ENV=production
      - AUTH_REQUIRED=true
      # Update with your Okta values
      - OIDC_ISSUER=https://dev-12345678.okta.com/oauth2/default
      - OIDC_AUDIENCE=0oa1abc2def3ghi4jkl5
      - OAUTH_TOKEN_URL=https://dev-12345678.okta.com/oauth2/default/v1/token
      - OAUTH_AUTHORIZATION_URL=https://dev-12345678.okta.com/oauth2/default/v1/authorize
      - BASE_URL=https://seed.yourdomain.com
Step 2: Create Docker Secrets (Recommended)
For sensitive values, use Docker secrets instead of environment variables:
# Create secrets
echo "0oa1abc2def3ghi4jkl5" | docker secret create seed_oidc_audience -
echo "https://dev-12345678.okta.com/oauth2/default" | docker secret create seed_oidc_issuer -
# Update docker-stack.yml to use secrets
services:
  seed:
    secrets:
      - seed_oidc_audience
      - seed_oidc_issuer
    environment:
      - OIDC_AUDIENCE_FILE=/run/secrets/seed_oidc_audience
      - OIDC_ISSUER_FILE=/run/secrets/seed_oidc_issuer
secrets:
  seed_oidc_audience:
    external: true
  seed_oidc_issuer:
    external: true
WARNING
The current implementation doesn't support the _FILE suffix for secrets, so the OIDC_AUDIENCE_FILE and OIDC_ISSUER_FILE variables shown above are not read. Use plain environment variables for now, or extend the config loader.
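One possible workaround that avoids touching the config loader is an entrypoint wrapper that copies each secret file into the plain variable before the server starts. The sketch below is an assumption, not something shipped with Seed: load_secret_file is a hypothetical helper, and the final exec line depends on how the image starts the server.

```bash
#!/bin/sh
# Hypothetical entrypoint sketch (not part of Seed): if <VAR>_FILE points at a
# readable file, export <VAR> with that file's contents.
load_secret_file() {
  _lsf_var="$1"
  # Indirectly read the value of <VAR>_FILE (empty if unset).
  eval "_lsf_path=\${${_lsf_var}_FILE:-}"
  if [ -n "$_lsf_path" ] && [ -r "$_lsf_path" ]; then
    # Command substitution drops the trailing newline that echo adds
    # when the secret is created.
    _lsf_value="$(cat "$_lsf_path")"
    export "${_lsf_var}=${_lsf_value}"
  fi
}

load_secret_file OIDC_AUDIENCE
load_secret_file OIDC_ISSUER
# exec "$@"   # hand off to the real server command (image-specific assumption)
```

With this in place the stack file can keep the OIDC_AUDIENCE_FILE and OIDC_ISSUER_FILE entries, and the server still sees OIDC_AUDIENCE and OIDC_ISSUER as ordinary environment variables.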
Step 3: Build and Push Docker Image
# Build image
docker build -t seed-mcp-server:latest .
# Tag for your registry
docker tag seed-mcp-server:latest containers.home/mcp-servers/seed:latest
# Push to registry
docker push containers.home/mcp-servers/seed:latest
Step 4: Deploy to Docker Swarm
# Deploy the stack
docker stack deploy -c docker-stack.production.yml seed
# Verify deployment
docker stack ps seed
# Check logs
docker service logs seed_seed -f
Step 5: Verify Deployment
# Check service status
docker service ps seed_seed
# Test health endpoint (no auth required)
curl https://seed.yourdomain.com/health
# Test MCP endpoint (requires auth)
curl https://seed.yourdomain.com/mcp \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
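Right after a deploy the service may need a few seconds before it answers, so a single curl can fail even when the rollout is fine. A minimal sketch of a retry loop follows; wait_until_ready is a hypothetical helper, and the attempt count and delay are arbitrary assumptions.

```bash
# Hypothetical helper (not part of Seed): retry a command until it succeeds.
# Usage: wait_until_ready <attempts> <delay-seconds> <command...>
wait_until_ready() {
  _wur_attempts="$1"
  _wur_delay="$2"
  shift 2
  _wur_i=1
  while [ "$_wur_i" -le "$_wur_attempts" ]; do
    # Run the probe command quietly; stop at the first success.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    _wur_i=$((_wur_i + 1))
    sleep "$_wur_delay"
  done
  return 1
}

# Example (needs network access, so it is left commented out):
# wait_until_ready 30 2 curl -fsS https://seed.yourdomain.com/health
```

The -f flag makes curl exit non-zero on HTTP errors, so an error status from the health endpoint counts as a failed attempt.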
Securing the Metrics Endpoint
The /metrics endpoint exposes Prometheus metrics and should be secured in production.
Option 1: Network-Level Restriction (Recommended)
Restrict access to /metrics at the Traefik level using IP whitelisting:
services:
  seed:
    deploy:
      labels:
        # Main application router
        - traefik.http.routers.seed.rule=Host(`seed.yourdomain.com`)
        - traefik.http.routers.seed.entrypoints=websecure
        - traefik.http.routers.seed.tls.certresolver=letsencrypt
        - traefik.http.services.seed.loadbalancer.server.port=3000
        # Separate router for metrics with IP whitelist
        - traefik.http.routers.seed-metrics.rule=Host(`seed.yourdomain.com`) && PathPrefix(`/metrics`)
        - traefik.http.routers.seed-metrics.entrypoints=websecure
        - traefik.http.routers.seed-metrics.tls.certresolver=letsencrypt
        - traefik.http.routers.seed-metrics.priority=100
        # Only allow Prometheus server and admin IPs
        - traefik.http.routers.seed-metrics.middlewares=metrics-ipwhitelist
        - traefik.http.middlewares.metrics-ipwhitelist.ipwhitelist.sourcerange=10.0.0.0/8,192.168.1.100/32
Option 2: Internal Network Only
Exclude /metrics from the public router and let Prometheus scrape it over a separate internal network:
services:
  seed:
    networks:
      - traefik-public  # Public traffic
      - seed-internal   # Internal traffic (Redis, metrics)
    deploy:
      labels:
        # Only expose main app publicly
        - traefik.http.routers.seed.rule=Host(`seed.yourdomain.com`) && !PathPrefix(`/metrics`)
  prometheus:
    image: prom/prometheus:latest
    networks:
      - seed-internal
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    configs:
      - source: prometheus-config
        target: /etc/prometheus/prometheus.yml
configs:
  prometheus-config:
    file: ./prometheus.yml
networks:
  seed-internal:
    driver: overlay
    internal: true  # No external access
Create prometheus.yml:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'seed'
    static_configs:
      - targets: ['seed:3000']
    metrics_path: '/metrics'
Option 3: Disable Metrics in Production
If you don't need metrics, disable them entirely:
services:
  seed:
    environment:
      - METRICS_ENABLED=false
Option 4: Authentication via Traefik BasicAuth
Add HTTP Basic Auth to the metrics endpoint:
# Generate password hash
htpasswd -nb admin your-password
# Output: admin:$apr1$ruca84Hq$mbjdMZBAG.KWn7vfN/SNK/
# Store the htpasswd entry as a Docker config for Traefik's BasicAuth middleware
docker config create metrics-auth-users -
# Paste the htpasswd output, then Ctrl+D
Update stack file:
services:
  seed:
    deploy:
      labels:
        # Metrics router with BasicAuth
        - traefik.http.routers.seed-metrics.rule=Host(`seed.yourdomain.com`) && PathPrefix(`/metrics`)
        - traefik.http.routers.seed-metrics.middlewares=metrics-auth
        - traefik.http.middlewares.metrics-auth.basicauth.usersfile=/run/secrets/metrics-auth-users
    # Note: Traefik reads usersfile from its own filesystem, so this config
    # needs to be mounted in the Traefik service at the same path.
    configs:
      - source: metrics-auth-users
        target: /run/secrets/metrics-auth-users
configs:
  metrics-auth-users:
    external: true
Monitoring and Observability
Prometheus Setup
Add Seed as Prometheus Target
Edit your Prometheus configuration:
scrape_configs:
  - job_name: 'seed-mcp-server'
    static_configs:
      - targets: ['seed:3000']  # Internal service name
    metrics_path: '/metrics'
    scrape_interval: 30s
Verify Scraping
Check Prometheus UI → Targets to ensure Seed is being scraped successfully.
Grafana Dashboard
Create a Grafana dashboard to visualize Seed metrics:
Key Metrics to Monitor:
HTTP Metrics
- http_request_duration_seconds - Request latency
- http_request_total - Request count by method/route/status
MCP Session Metrics
- mcp_sessions_active - Current active sessions
- mcp_sessions_total - Total sessions created/terminated
- mcp_tool_invocations_total - Tool usage by tool/status
- mcp_tool_duration_seconds - Tool execution time
Authentication Metrics
- auth_attempts_total - Auth success/failure rate
- auth_token_validation_duration_seconds - Token validation latency
OAuth Flow Metrics (✅ Added 2026-01-07)
- oauth_authorization_requests_total - OAuth authorization requests by result
- oauth_token_exchanges_total - Token exchanges by grant type and result
- oauth_token_exchange_duration_seconds - IdP response time
- dcr_registrations_total - Dynamic client registrations
Token Refresh Metrics (✅ Added 2026-01-07)
- token_refresh_attempts_total - Token refresh attempts by type (proactive/reactive) and result
- token_refresh_duration_seconds - Token refresh operation latency
- pending_tokens_claimed_total - Pending tokens claimed by sessions
JWKS Metrics
- jwks_refresh_total - JWKS refresh operations
- jwks_cache_hits_total / jwks_cache_misses_total - Cache efficiency
Redis Metrics
- redis_operations_total - Redis operation count
- redis_operation_duration_seconds - Redis latency
- circuit_breaker_state - Circuit breaker state (0=closed, 2=open)
- circuit_breaker_failures_total - Circuit breaker failures
Rate Limiting
- rate_limit_hits_total - Requests blocked by rate limiting
System Metrics (from prom-client defaults)
- process_cpu_seconds_total - CPU usage
- process_resident_memory_bytes - Memory usage
- nodejs_eventloop_lag_seconds - Event loop lag
Alerting Rules
Example Prometheus alerting rules:
# prometheus-rules.yml
groups:
  - name: seed_mcp_server
    interval: 30s
    rules:
      # High error rate
      - alert: HighHTTPErrorRate
        expr: |
          (
            sum(rate(http_request_total{status_code=~"5.."}[5m]))
            /
            sum(rate(http_request_total[5m]))
          ) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High HTTP error rate on Seed MCP Server"
          description: "{{ $value | humanizePercentage }} of requests are failing"

      # High auth failure rate
      - alert: HighAuthFailureRate
        expr: |
          (
            sum(rate(auth_attempts_total{result="failure"}[5m]))
            /
            sum(rate(auth_attempts_total[5m]))
          ) > 0.10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High authentication failure rate"
          description: "{{ $value | humanizePercentage }} of auth attempts are failing"

      # Service down
      - alert: SeedMCPServerDown
        expr: up{job="seed-mcp-server"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Seed MCP Server is down"
          description: "Seed MCP Server has been down for more than 1 minute"

      # High memory usage
      - alert: HighMemoryUsage
        expr: |
          (
            process_resident_memory_bytes{job="seed-mcp-server"}
            /
            256000000 # 256MB limit from docker-stack.yml
          ) > 0.90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Seed MCP Server using too much memory"
          description: "Memory usage is at {{ $value | humanizePercentage }}"

      # JWKS refresh failures
      - alert: JWKSRefreshFailures
        expr: rate(jwks_refresh_total{result="failure"}[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "JWKS refresh failures detected"
          description: "Unable to refresh JWKS keys from OIDC provider"

      # OAuth token exchange failures (Added 2026-01-07)
      - alert: HighOAuthTokenExchangeFailureRate
        expr: |
          (
            sum(rate(oauth_token_exchanges_total{result="failure"}[5m]))
            /
            sum(rate(oauth_token_exchanges_total[5m]))
          ) > 0.01
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High OAuth token exchange failure rate"
          description: "{{ $value | humanizePercentage }} of token exchanges are failing"

      # Token refresh failures (Added 2026-01-07)
      - alert: HighTokenRefreshFailureRate
        expr: |
          (
            sum(rate(token_refresh_attempts_total{result="failure"}[5m]))
            /
            sum(rate(token_refresh_attempts_total{result!="skipped"}[5m]))
          ) > 0.10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High token refresh failure rate"
          description: "{{ $value | humanizePercentage }} of token refreshes are failing"

      # Circuit breaker open (Added 2026-01-06)
      - alert: RedisCircuitBreakerOpen
        expr: circuit_breaker_state{name="redis"} == 2
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Redis circuit breaker is open"
          description: "Redis connection failures detected, circuit breaker protecting service"
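To get a feel for the thresholds, the comparison that the ratio-based alerts make can be reproduced by hand. The sketch below uses a hypothetical ratio_exceeds helper with made-up counts. It only illustrates the arithmetic and does not query Prometheus.

```bash
# Hypothetical helper (illustration only): succeed when failures/total is
# strictly greater than the threshold, as in "(failures / total) > 0.05".
# Usage: ratio_exceeds <failures> <total> <threshold>
ratio_exceeds() {
  awk -v f="$1" -v t="$2" -v limit="$3" \
    'BEGIN { if (t > 0 && f / t > limit) exit 0; exit 1 }'
}

# 6 failing requests out of 100 is 6%, above the 5% HTTP error threshold:
#   ratio_exceeds 6 100 0.05 && echo "would fire (after the 5m 'for' window)"
```

The t > 0 guard reflects that an alert with no traffic in the window has nothing to divide by and should not fire.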
Logging
The production stack uses JSON log format with rotation:
services:
  seed:
    environment:
      - LOG_FORMAT=json  # Structured logging for log aggregation
      - LOG_LEVEL=info   # Production log level
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
Send logs to centralized logging:
# Example: Using Loki
services:
  seed:
    logging:
      driver: loki
      options:
        loki-url: "http://loki:3100/loki/api/v1/push"
        labels: "service=seed,environment=production"
Troubleshooting
Issue: Server fails to start
Symptoms:
docker service logs seed_seed
# Configuration validation failed: ...
# Server startup aborted.
Solution: This is the configuration validation feature (implemented 2026-01-06) detecting invalid configuration.
Check the error message. It specifies which configuration value is invalid:
✗ Configuration validation failed:
- OIDC_ISSUER must be configured when AUTH_REQUIRED=true
- PORT must be between 1 and 65535, got: 99999
Fix the configuration in your stack file or environment variables
Redeploy the service:
docker stack deploy -c docker-stack.production.yml seed
Common validation errors:
- Missing OIDC_ISSUER when AUTH_REQUIRED=true
- Invalid URL formats (must be valid HTTP/HTTPS URLs)
- Invalid port range (must be 1-65535)
- TTL values too short (MCP_SESSION_TTL_SECONDS must be ≥60)
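These checks can also be approximated before deploying, which saves a failed rollout. The following is a rough sketch covering only the rules listed above. preflight_check is a hypothetical function, the fallback port of 3000 is an assumption taken from the port used elsewhere in this guide, and the server's own validator remains the authority.

```bash
# Hypothetical pre-deploy check (not part of Seed). Reads the same variable
# names from the current shell, prints each problem, returns 1 if any was found.
preflight_check() {
  _pf_status=0
  _pf_port="${PORT:-3000}"                 # assumed default port
  _pf_ttl="${MCP_SESSION_TTL_SECONDS:-}"

  if [ "${AUTH_REQUIRED:-}" = "true" ] && [ -z "${OIDC_ISSUER:-}" ]; then
    echo "OIDC_ISSUER must be configured when AUTH_REQUIRED=true"
    _pf_status=1
  fi

  case "${OIDC_ISSUER:-}" in
    ""|http://*|https://*) ;;              # empty is handled by the check above
    *) echo "OIDC_ISSUER must be a valid HTTP/HTTPS URL"; _pf_status=1 ;;
  esac

  case "$_pf_port" in
    *[!0-9]*) echo "PORT must be numeric, got: $_pf_port"; _pf_status=1 ;;
    *)
      if [ "$_pf_port" -lt 1 ] || [ "$_pf_port" -gt 65535 ]; then
        echo "PORT must be between 1 and 65535, got: $_pf_port"
        _pf_status=1
      fi ;;
  esac

  case "$_pf_ttl" in
    "") ;;                                 # unset: leave the server default alone
    *[!0-9]*) echo "MCP_SESSION_TTL_SECONDS must be numeric, got: $_pf_ttl"; _pf_status=1 ;;
    *)
      if [ "$_pf_ttl" -lt 60 ]; then
        echo "MCP_SESSION_TTL_SECONDS must be at least 60, got: $_pf_ttl"
        _pf_status=1
      fi ;;
  esac

  return "$_pf_status"
}
```

Export the values planned for the stack file in a shell, run preflight_check, and deploy only if it returns 0.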
Issue: Cannot connect to /mcp endpoint
Symptoms:
curl https://seed.yourdomain.com/mcp
# Returns 401 Unauthorized
Solutions:
Check health endpoint first:
# Verify server is running
curl https://seed.yourdomain.com/health
# Check dependencies are healthy
curl https://seed.yourdomain.com/health/ready
Check Okta configuration:
# Verify OIDC discovery works
curl https://dev-12345678.okta.com/oauth2/default/.well-known/openid-configuration
Check environment variables:
docker service inspect seed_seed --format='{{json .Spec.TaskTemplate.ContainerSpec.Env}}'
Check service logs:
docker service logs seed_seed -f | grep -i error
Verify JWKS is accessible:
# Check if Seed can reach Okta
docker exec $(docker ps -qf label=com.docker.swarm.service.name=seed_seed) \
  wget -O- https://dev-12345678.okta.com/oauth2/default/v1/keys
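If all of the checks above pass and a token is still rejected, it can help to compare the token's iss and aud claims with OIDC_ISSUER and OIDC_AUDIENCE. The sketch below is a debugging aid under stated assumptions: jwt_payload is a hypothetical helper, it relies on base64 -d being available, and it only decodes the payload without verifying the signature.

```bash
# Hypothetical debugging helper (not part of Seed): print the decoded JSON
# payload of a JWT. The signature is NOT verified.
jwt_payload() {
  # Take the middle (payload) segment and map base64url to standard base64.
  _jwt_seg="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # Restore the padding that base64url omits.
  case $(( ${#_jwt_seg} % 4 )) in
    2) _jwt_seg="${_jwt_seg}==" ;;
    3) _jwt_seg="${_jwt_seg}=" ;;
  esac
  printf '%s' "$_jwt_seg" | base64 -d
}

# Example: jwt_payload "YOUR_JWT_TOKEN"
```

Avoid pasting production tokens into shared terminals or logs when using this.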
Issue: /metrics endpoint returning 404
Symptoms:
curl https://seed.yourdomain.com/metrics
# Returns 404 Not Found
Solutions:
Check if metrics are enabled:
docker service inspect seed_seed | grep METRICS_ENABLED
# Should not be set to "false"
Check Traefik routing:
# Verify Traefik can see the service
curl -u admin:password http://traefik:8080/api/http/routers
Test directly from container:
docker exec $(docker ps -qf label=com.docker.swarm.service.name=seed_seed) \
  wget -qO- http://localhost:3000/metrics
Issue: High memory usage
Symptoms:
- Container OOMKilled
- mcp_sessions_active metric growing unbounded
Solutions:
Check session TTL:
# Verify MCP_SESSION_TTL_SECONDS is set
docker service inspect seed_seed | grep MCP_SESSION_TTL
Check Redis eviction:
docker exec $(docker ps -qf label=com.docker.swarm.service.name=seed_redis) \
  redis-cli CONFIG GET maxmemory-policy
# Should be: allkeys-lru
Monitor session metrics:
# Check if sessions are expiring
rate(mcp_sessions_total{status="terminated"}[5m])
Increase memory limit:
services:
  seed:
    deploy:
      resources:
        limits:
          memory: 512M  # Increased from 256M
Issue: Rate limiting false positives
Symptoms:
- Legitimate requests getting 429 responses
- rate_limit_hits_total metric increasing
Solutions:
Increase rate limits:
environment:
  - MCP_RATE_LIMIT_MAX=200  # Increased from 100
  - MCP_RATE_LIMIT_WINDOW_MS=60000
Check if rate limiting is per-IP:
# All requests from same IP (e.g., Traefik)?
docker service logs seed_seed | grep "rate limit"
Disable rate limiting temporarily:
environment:
  - RATE_LIMIT_ENABLED=false
Issue: CORS errors in browser
Symptoms:
Access to fetch at 'https://seed.yourdomain.com/mcp' from origin 'https://app.example.com'
has been blocked by CORS policy
Solutions:
Add origin to CORS whitelist:
environment:
  - CORS_EXTRA_ORIGINS=https://app.example.com,https://other.example.com
Check current CORS config:
docker service logs seed_seed | grep -i cors
Production Readiness Features
✅ Implemented (2026-01-06):
Graceful Shutdown
- SIGTERM and SIGINT signal handling
- Stops accepting new connections
- Waits for active requests (5-second grace period)
- Closes all MCP sessions in parallel
- Closes Redis connections properly
- Stops JWKS refresh timer
- Health check returns 503 during shutdown
Kubernetes integration: No special configuration needed - graceful shutdown works automatically with Kubernetes pod termination.
Health Checks
- Liveness probe (/health) - Process health, returns 503 during shutdown
- Readiness probe (/health/ready) - Dependency health checks:
  - Redis connectivity with circuit breaker state
  - JWKS cache with expiration tracking
  - Session capacity with utilization metrics
Kubernetes deployment:
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
Configuration Validation
- Validates all config values at startup
- URL format validation (HTTP/HTTPS, redis://)
- Port ranges (1-65535)
- Numeric limits and TTL values
- Production-specific security requirements
- Exits with clear error messages if validation fails
Redis Resilience
- Circuit breaker pattern for connection failures
- Graceful degradation when Redis unavailable
- Automatic reconnection with exponential backoff
- Health check integration
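For readers unfamiliar with the term, exponential backoff means each reconnection attempt waits twice as long as the previous one, up to a ceiling. The sketch below only illustrates that schedule. backoff_delay_ms is a hypothetical function, and the 100 ms base and 30 s cap are assumptions rather than the values Seed actually uses.

```bash
# Illustration only (not Seed's implementation): delay before retry number
# <attempt>, doubling from <base-ms> and never exceeding <cap-ms>.
# Usage: backoff_delay_ms <attempt> [base-ms] [cap-ms]
backoff_delay_ms() {
  _bo_attempt="$1"
  _bo_delay="${2:-100}"     # assumed base delay in milliseconds
  _bo_cap="${3:-30000}"     # assumed ceiling in milliseconds
  _bo_n=0
  # Double once per previous attempt, stopping early once the cap is reached.
  while [ "$_bo_n" -lt "$_bo_attempt" ] && [ "$_bo_delay" -lt "$_bo_cap" ]; do
    _bo_delay=$((_bo_delay * 2))
    _bo_n=$((_bo_n + 1))
  done
  if [ "$_bo_delay" -gt "$_bo_cap" ]; then
    _bo_delay="$_bo_cap"
  fi
  printf '%s\n' "$_bo_delay"
}
```

With these assumed defaults the first retries wait 100, 200, 400 and 800 ms, and later attempts settle at the 30000 ms cap.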
Token Revocation
- RFC 7009 compliant /oauth/revoke endpoint
- Access token revocation cache (5-minute TTL)
- Refresh token revocation proxied to IdP
Security Checklist
Before deploying to production:
- [ ] HTTPS enabled with valid TLS certificate (Let's Encrypt)
- [ ] AUTH_REQUIRED=true set
- [ ] Okta application configured with PKCE required
- [ ] Redirect URIs properly configured in Okta
- [ ] /metrics endpoint secured (IP whitelist or disabled)
- [ ] Docker secrets used for sensitive values (optional but recommended)
- [ ] Redis configured with password (if exposed)
- [ ] Rate limiting enabled with appropriate limits
- [ ] Logging configured with rotation
- [ ] Monitoring and alerting set up
- [ ] Resource limits set on containers
- [ ] ✅ Health checks configured (liveness and readiness probes)
- [ ] ✅ Configuration validation enabled (automatic at startup)
- [ ] Backup strategy for Redis data (if needed)
Related Documentation
- Configuration Reference - All environment variables
- Architecture: Authentication - How authentication works
- Architecture: OAuth - OAuth 2.1 implementation
- Deploy with OIDC - General OIDC deployment guide
- Production Operations - Prometheus metrics and production monitoring