Production Operations Guide
Comprehensive guide for running Seed MCP Server in production, covering security, Redis, logging, monitoring, and troubleshooting.
Table of Contents
- Security Hardening
- Redis Configuration
- Logging & Debugging
- Monitoring
- Performance Tuning
- Backup & Recovery
- Troubleshooting
Security Hardening
Authentication Security
1. Always Use HTTPS in Production
# Nginx configuration
server {
listen 443 ssl http2;
server_name mcp.example.com;
ssl_certificate /path/to/cert.pem;
ssl_certificate_key /path/to/key.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
location / {
proxy_pass http://localhost:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
2. Short Token Lifetimes
Configure your OIDC provider for short-lived tokens:
- Access tokens: 1 hour or less
- Refresh tokens: 24 hours with rotation
- Session TTL: Match token lifetime
# In Seed configuration
REDIS_SESSION_TTL=3600 # 1 hour in seconds
3. Token Validation
Seed automatically validates:
- JWT signature using JWKS
- Token expiration (exp claim)
- Token issuer (iss claim)
- Token audience (aud claim)
- Not-before time (nbf claim)
Ensure these environment variables are set:
OIDC_ISSUER=https://your-idp.com/
OIDC_AUDIENCE=your-client-id
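For reference, the same checks can be reproduced in isolation with the jose library. This is a sketch, not Seed's internal code; the JWKS URL in particular is an assumption, since providers normally publish it in their discovery document.
// Illustrative standalone validation mirroring the checks listed above.
import {createRemoteJWKSet, jwtVerify} from "jose";

// Assumption: the provider serves its JWKS at this conventional path.
const jwks = createRemoteJWKSet(new URL(`${process.env.OIDC_ISSUER}.well-known/jwks.json`));

export async function verifyAccessToken(token: string) {
  // jwtVerify checks the signature, exp, and nbf; issuer and audience
  // are enforced through the options below.
  const {payload} = await jwtVerify(token, jwks, {
    issuer: process.env.OIDC_ISSUER,
    audience: process.env.OIDC_AUDIENCE,
  });
  return payload;
}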
Network Security
1. Firewall Rules
Allow only necessary traffic:
# Allow HTTPS from anywhere
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
# Allow Redis only from localhost or internal network
iptables -A INPUT -p tcp --dport 6379 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 6379 -j DROP
# Drop all other traffic
iptables -P INPUT DROP
2. IP Allowlisting
Restrict access to known IP ranges:
// src/middleware/ip-allowlist.ts
import {Request, Response, NextFunction} from "express";
const allowedIPs = process.env.ALLOWED_IPS?.split(",") || [];
const allowedCIDRs = process.env.ALLOWED_CIDRS?.split(",") || [];
export function ipAllowlistMiddleware(req: Request, res: Response, next: NextFunction) {
const clientIP = req.ip || req.socket.remoteAddress;
if (isAllowed(clientIP, allowedIPs, allowedCIDRs)) {
return next();
}
res.status(403).json({
jsonrpc: "2.0",
error: {
code: -32000,
message: "Forbidden",
data: {reason: "ip_not_allowed"},
},
id: null,
});
}
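The isAllowed helper is not shown above. Here is one possible IPv4-only implementation with no external dependencies; treat it as a sketch, since production setups often rely on a vetted library such as ipaddr.js instead.
// Hypothetical helper for the middleware above: exact IP match or IPv4 CIDR match.
function ipv4ToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => (acc << 8) | parseInt(octet, 10), 0) >>> 0;
}

function inCidr(ip: string, cidr: string): boolean {
  const [range, bitsStr] = cidr.split("/");
  const bits = parseInt(bitsStr, 10);
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return (ipv4ToInt(ip) & mask) === (ipv4ToInt(range) & mask);
}

export function isAllowed(ip: string | undefined, ips: string[], cidrs: string[]): boolean {
  if (!ip) return false;
  // Express may report IPv4-mapped addresses like ::ffff:10.0.0.5
  const normalized = ip.replace(/^::ffff:/, "");
  return ips.includes(normalized) || cidrs.some((cidr) => inCidr(normalized, cidr));
}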
3. Rate Limiting
Protect against abuse with rate limiting:
// src/middleware/rate-limit.ts
import rateLimit from "express-rate-limit";
export const rateLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // 100 requests per window
message: {
jsonrpc: "2.0",
error: {
code: -32000,
message: "Too many requests",
data: {reason: "rate_limit_exceeded"},
},
id: null,
},
standardHeaders: true,
legacyHeaders: false,
});
// Apply globally or per-route
app.use("/mcp", rateLimiter);Data Security
1. Redis Security
Protect Redis with authentication:
# Redis configuration (redis.conf)
requirepass YOUR_STRONG_PASSWORD
rename-command CONFIG ""
rename-command FLUSHALL ""
rename-command FLUSHDB ""
# Seed configuration
REDIS_URL=redis://:YOUR_STRONG_PASSWORD@localhost:6379
2. Sensitive Data Handling
Never log sensitive data:
// Bad
console.log(`Token: ${token}`);
// Good
console.log(`Token received: ${token.substring(0, 10)}...`);
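Beyond ad-hoc truncation, a logger-level safeguard can mask known secret fields before they reach any transport. The sketch below uses a custom winston format; the field list and truncation length are assumptions, not Seed's configuration.
import winston from "winston";

// Hypothetical redaction format: truncates fields that commonly carry secrets.
const SENSITIVE_FIELDS = ["token", "accessToken", "refreshToken", "authorization", "password"];

const redactSecrets = winston.format((info) => {
  for (const field of SENSITIVE_FIELDS) {
    const value = info[field];
    if (typeof value === "string") {
      info[field] = `${value.substring(0, 10)}...`;
    }
  }
  return info;
});

// Usage: place it ahead of the JSON formatter, e.g.
// winston.format.combine(redactSecrets(), winston.format.timestamp(), winston.format.json())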
Sanitize errors before sending to clients:
try {
// Operation
} catch (error) {
// Log full error server-side
logger.error("Operation failed", {error: error.stack});
// Send generic error to client
res.status(500).json({
jsonrpc: "2.0",
error: {
code: -32603,
message: "Internal server error",
},
id: null,
});
}
3. Environment Variable Security
Never commit .env files:
# .gitignore
.env
.env.*
!.env.example
Use secrets management in production:
- AWS Secrets Manager
- Azure Key Vault
- HashiCorp Vault
- Kubernetes Secrets
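As an example of the first option, configuration can be pulled at startup rather than read from a .env file. This sketch uses the AWS SDK v3; the secret name and JSON layout are assumptions.
// Illustrative: load Seed configuration from AWS Secrets Manager at startup.
import {SecretsManagerClient, GetSecretValueCommand} from "@aws-sdk/client-secrets-manager";

export async function loadSecrets(): Promise<void> {
  const client = new SecretsManagerClient({});
  const response = await client.send(
    new GetSecretValueCommand({SecretId: "seed-mcp-server/production"}) // hypothetical secret name
  );
  const secrets = JSON.parse(response.SecretString ?? "{}");
  // Copy secret values into the environment the rest of the app already reads.
  for (const [key, value] of Object.entries(secrets)) {
    process.env[key] = String(value);
  }
}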
Security Headers
Add security headers to all responses:
import helmet from "helmet";
app.use(
helmet({
contentSecurityPolicy: {
directives: {
defaultSrc: ["'self'"],
scriptSrc: ["'self'"],
styleSrc: ["'self'", "'unsafe-inline'"],
imgSrc: ["'self'", "data:", "https:"],
},
},
hsts: {
maxAge: 31536000,
includeSubDomains: true,
preload: true,
},
})
);
Audit Logging
Log all authentication events:
// Log successful authentications
logger.info("User authenticated", {
userId: user.sub,
email: user.email,
timestamp: new Date().toISOString(),
ip: req.ip,
});
// Log failed authentications
logger.warn("Authentication failed", {
reason: "invalid_token",
ip: req.ip,
timestamp: new Date().toISOString(),
});
Redis Configuration
Production Redis Setup
1. Basic Configuration
# redis.conf
# Network
bind 127.0.0.1
port 6379
protected-mode yes
# Security
requirepass YOUR_STRONG_PASSWORD
# Persistence
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
# Memory
maxmemory 256mb
maxmemory-policy allkeys-lru
# Logging
loglevel notice
logfile /var/log/redis/redis-server.log
2. Connection Pooling
Use connection pooling for better performance:
// src/services/redis.ts
import {createClient} from "redis";
const client = createClient({
url: process.env.REDIS_URL,
socket: {
reconnectStrategy: (retries) => {
if (retries > 10) {
return new Error("Max retries reached");
}
return Math.min(retries * 100, 3000);
},
},
});
client.on("error", (err) => {
logger.error("Redis connection error", {error: err});
});
client.on("reconnecting", () => {
logger.warn("Redis reconnecting");
});
client.on("ready", () => {
logger.info("Redis connection established");
});
3. Session Management
Configure session storage:
// Session key pattern
const sessionKey = `seed:session:${sessionId}`;
// Store session with TTL
await redis.set(sessionKey, JSON.stringify(sessionData), {
EX: 3600, // 1 hour
});
// Get session
const data = await redis.get(sessionKey);
const session = data ? JSON.parse(data) : null;
// Delete session
await redis.del(sessionKey);
// Clean up expired sessions (automatic with TTL)
4. Redis Sentinel for High Availability
// Note: Sentinel options in this form follow the ioredis API;
// node-redis's createClient does not accept them.
import Redis from "ioredis";
const client = new Redis({
sentinels: [
{host: "sentinel1.example.com", port: 26379},
{host: "sentinel2.example.com", port: 26379},
{host: "sentinel3.example.com", port: 26379},
],
name: "mymaster",
password: process.env.REDIS_PASSWORD,
});
5. Redis Cluster for Scale
import {createCluster} from "redis";
const cluster = createCluster({
rootNodes: [
{url: "redis://node1.example.com:6379"},
{url: "redis://node2.example.com:6379"},
{url: "redis://node3.example.com:6379"},
],
defaults: {
password: process.env.REDIS_PASSWORD,
},
});
Redis Monitoring
Monitor these Redis metrics:
# Memory usage
redis-cli INFO memory
# Connected clients
redis-cli INFO clients
# Commands processed
redis-cli INFO stats
# Keyspace information
redis-cli INFO keyspace
# Real-time monitoring
redis-cli MONITOR
Set up alerts for:
- Memory usage > 80%
- Connected clients > 1000
- Command execution time > 100ms
- Rejected connections > 0
Logging & Debugging
Structured Logging
Use structured logging for better observability:
// src/utils/logger.ts
import winston from "winston";
export const logger = winston.createLogger({
level: process.env.LOG_LEVEL || "info",
format: winston.format.combine(
winston.format.timestamp(),
winston.format.errors({stack: true}),
winston.format.json()
),
defaultMeta: {service: "seed-mcp-server"},
transports: [
new winston.transports.File({filename: "error.log", level: "error"}),
new winston.transports.File({filename: "combined.log"}),
],
});
// Add console transport in development
if (process.env.NODE_ENV !== "production") {
logger.add(
new winston.transports.Console({
format: winston.format.combine(winston.format.colorize(), winston.format.simple()),
})
);
}
Log Levels
Use appropriate log levels:
// ERROR - System errors requiring immediate attention
logger.error("Redis connection failed", {error: err.message, stack: err.stack});
// WARN - Warning conditions that should be investigated
logger.warn("JWKS fetch took longer than expected", {duration: 5000});
// INFO - Normal operational events
logger.info("MCP session created", {sessionId, userId: user.sub});
// DEBUG - Detailed information for debugging
logger.debug("JWT claims extracted", {claims: payload});Request Logging
Log all incoming requests:
import morgan from "morgan";
// Combine Morgan with Winston
const stream = {
write: (message: string) => logger.http(message.trim()),
};
app.use(
morgan(":method :url :status :res[content-length] - :response-time ms", {
stream,
})
);
Correlation IDs
Track requests across services:
import {v4 as uuidv4} from "uuid";
app.use((req, res, next) => {
req.correlationId = (req.headers["x-correlation-id"] as string | undefined) ?? uuidv4();
res.setHeader("X-Correlation-ID", req.correlationId);
next();
});
// Use in logs
logger.info("Request processed", {
correlationId: req.correlationId,
method: req.method,
path: req.path,
});
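In TypeScript, req.correlationId needs a declaration merge so the assignment above type-checks. A typical augmentation (the file location is illustrative):
// src/types/express.d.ts (illustrative location)
declare global {
  namespace Express {
    interface Request {
      correlationId?: string;
    }
  }
}

export {};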
Debug Mode
Enable detailed debugging:
# Debug all
DEBUG=* npm start
# Debug specific modules
DEBUG=seed:* npm start
DEBUG=seed:auth,seed:mcp npm start
import debug from "debug";
const log = debug("seed:auth");
export function authMiddleware(req, res, next) {
log("Processing authentication for %s %s", req.method, req.path);
// ...
}
Error Tracking
Integrate error tracking service:
import * as Sentry from "@sentry/node";
Sentry.init({
dsn: process.env.SENTRY_DSN,
environment: process.env.NODE_ENV,
tracesSampleRate: 0.1,
});
// Capture all errors
app.use(Sentry.Handlers.errorHandler());
// Manual error capture
try {
// Operation
} catch (error) {
Sentry.captureException(error);
throw error;
}
Monitoring
Health Check Endpoint
Monitor server health:
# Basic health check
curl https://mcp.example.com/health
# Expected response
{
"status": "ok",
"timestamp": "2025-01-05T12:00:00.000Z",
"uptime": 3600.5,
"redis": {
"connected": true,
"ping": "PONG"
}
}
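A response of this shape can be produced by a small route that pings Redis. The sketch below assumes an Express app and a connected node-redis client; it is illustrative rather than Seed's actual handler.
// Illustrative health check route: reports uptime and Redis connectivity.
import {Router} from "express";
import type {RedisClientType} from "redis";

export function healthRouter(redis: RedisClientType): Router {
  const router = Router();
  router.get("/health", async (_req, res) => {
    let ping = "unreachable";
    try {
      ping = await redis.ping(); // "PONG" when the connection is healthy
    } catch {
      // fall through; the status below reflects the failure
    }
    const healthy = ping === "PONG";
    res.status(healthy ? 200 : 503).json({
      status: healthy ? "ok" : "degraded",
      timestamp: new Date().toISOString(),
      uptime: process.uptime(),
      redis: {connected: healthy, ping},
    });
  });
  return router;
}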
Key Metrics to Track
Application Metrics:
- Request rate (requests/second)
- Response time (p50, p95, p99)
- Error rate (percentage)
- Active MCP sessions
- Tool invocation rate
System Metrics:
- CPU usage
- Memory usage
- Disk I/O
- Network I/O
Redis Metrics:
- Memory usage
- Connected clients
- Commands processed/second
- Key count
- Hit rate
Authentication Metrics:
- OAuth success rate
- JWT validation time
- JWKS fetch latency
- Token refresh rate
Prometheus Integration
Seed includes built-in Prometheus metrics support.
Enabling Metrics
Metrics are enabled by default. To disable:
# Disable metrics collection
METRICS_ENABLED=false
When enabled, metrics are exposed at the /metrics endpoint in Prometheus format.
Available Metrics
HTTP Metrics:
- http_requests_total - Total HTTP requests by method, route, and status
- http_request_duration_seconds - Request duration histogram
MCP Metrics:
- mcp_sessions_active - Current active MCP sessions
- mcp_sessions_total - Total MCP sessions created
- mcp_tool_invocations_total - Tool invocations by tool name and result
- mcp_tool_duration_seconds - Tool execution duration
Authentication Metrics:
- auth_attempts_total - Authentication attempts by result (success/failure)
- auth_token_validation_duration_seconds - Token validation duration
- jwks_refresh_total - JWKS refresh operations by result
- jwks_cache_hits_total - JWKS cache hits
- jwks_cache_misses_total - JWKS cache misses
Redis Metrics:
- redis_operations_total - Redis operations by operation type and result
- redis_operation_duration_seconds - Redis operation duration
Rate Limiting Metrics:
- rate_limit_hits_total - Rate limit hits by endpoint
System Metrics:
- process_cpu_seconds_total - CPU usage
- process_resident_memory_bytes - Memory usage
- nodejs_eventloop_lag_seconds - Event loop lag
- nodejs_heap_size_total_bytes - Heap size
- nodejs_heap_size_used_bytes - Heap used
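Seed registers these metrics internally. For reference, metrics of this shape are typically defined with prom-client along the following lines; the label sets here are assumptions based on the descriptions above, not Seed's actual definitions.
// Illustrative prom-client setup mirroring a few of the metrics listed above.
import {Registry, Counter, Histogram, Gauge, collectDefaultMetrics} from "prom-client";

export const registry = new Registry();
collectDefaultMetrics({register: registry}); // process_* and nodejs_* metrics

export const httpRequestsTotal = new Counter({
  name: "http_requests_total",
  help: "Total HTTP requests",
  labelNames: ["method", "route", "status"],
  registers: [registry],
});

export const httpRequestDuration = new Histogram({
  name: "http_request_duration_seconds",
  help: "Request duration in seconds",
  labelNames: ["method", "route", "status"],
  registers: [registry],
});

export const mcpSessionsActive = new Gauge({
  name: "mcp_sessions_active",
  help: "Current active MCP sessions",
  registers: [registry],
});

// The /metrics endpoint serves registry.metrics() with registry.contentType.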
Securing the Metrics Endpoint
Production Security
The /metrics endpoint is publicly accessible when metrics are enabled. Always restrict access in production.
Option 1: Disable metrics entirely
METRICS_ENABLED=false
Option 2: IP Whitelisting via Traefik
# docker-stack.yml
labels:
- "traefik.http.middlewares.metrics-ipwhitelist.ipwhitelist.sourcerange=10.0.0.0/8,172.16.0.0/12"
- "traefik.http.routers.seed-metrics.middlewares=metrics-ipwhitelist"Option 3: Internal network only
# docker-stack.yml
- "traefik.http.services.seed.loadbalancer.server.port=3000"
# Don't expose /metrics route to external Traefik
Option 4: HTTP BasicAuth via reverse proxy
location /metrics {
auth_basic "Metrics";
auth_basic_user_file /etc/nginx/.htpasswd;
proxy_pass http://seed:3000;
}
See Production Deployment Guide for detailed examples.
Grafana Dashboard
A pre-built Grafana dashboard is available at grafana/seed-mcp-server-dashboard.json. Import it to visualize all metrics:
# Import via Grafana UI
Dashboard → Import → Upload JSON file
# Or via API
curl -X POST -H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d @grafana/seed-mcp-server-dashboard.json \
http://grafana:3000/api/dashboards/db
Alerting Rules
Set up alerts for critical conditions:
# Prometheus alert rules
groups:
- name: seed-mcp-server
rules:
- alert: HighErrorRate
expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
for: 5m
annotations:
summary: "High error rate detected"
- alert: SlowResponses
expr: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) > 2
for: 10m
annotations:
summary: "95th percentile response time > 2s"
- alert: RedisMemoryHigh
expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.8
for: 5m
annotations:
summary: "Redis memory usage > 80%"Performance Tuning
Node.js Optimization
# Increase memory limit
NODE_OPTIONS="--max-old-space-size=4096" npm start
# Enable cluster mode
pm2 start dist/index.js -i 4
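If PM2 is not part of the deployment, Node's built-in cluster module gives the same multi-process behaviour. This is a minimal sketch; the WORKER_COUNT variable and the ./index.js entry point are assumptions about how the server starts.
// cluster.ts - illustrative multi-process startup without PM2.
import cluster from "node:cluster";
import os from "node:os";

const WORKERS = Number(process.env.WORKER_COUNT ?? os.cpus().length);

if (cluster.isPrimary) {
  for (let i = 0; i < WORKERS; i++) {
    cluster.fork();
  }
  cluster.on("exit", (worker, code) => {
    console.log(`Worker ${worker.process.pid} exited with code ${code}; restarting`);
    cluster.fork();
  });
} else {
  // Each worker loads the normal server entry point (assumed to start listening on import).
  import("./index.js").catch((err) => {
    console.error(err);
    process.exit(1);
  });
}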
JWKS Caching
Reduce identity provider calls:
// Increase cache TTL
jwks: {
cacheTtlMs: 3600000, // 1 hour (default)
refreshBeforeExpiryMs: 300000, // 5 minutes
}
// Or set via environment
JWKS_CACHE_TTL=7200000 # 2 hours
Connection Pooling
Optimize Redis connection handling:
// Redis connection pool
const pool = new Pool({
min: 2,
max: 10,
idleTimeoutMillis: 30000,
});
Response Compression
Enable gzip compression:
import compression from "compression";
app.use(compression());
Load Balancing
Deploy multiple instances:
upstream seed_backend {
least_conn;
server 127.0.0.1:3000;
server 127.0.0.1:3001;
server 127.0.0.1:3002;
server 127.0.0.1:3003;
}
server {
location / {
proxy_pass http://seed_backend;
}
}
Backup & Recovery
Redis Backup
# Manual backup
redis-cli BGSAVE
# Automated backup script
#!/bin/bash
BACKUP_DIR=/backups/redis
DATE=$(date +%Y%m%d_%H%M%S)
LAST_SAVE=$(redis-cli LASTSAVE)
redis-cli BGSAVE
# Wait for the background save to finish before copying the dump file
while [ "$(redis-cli LASTSAVE)" = "$LAST_SAVE" ]; do
sleep 1
done
cp /var/lib/redis/dump.rdb $BACKUP_DIR/dump_$DATE.rdb
# Keep only last 7 days
find $BACKUP_DIR -name "dump_*.rdb" -mtime +7 -delete
Configuration Backup
# Backup environment configuration
cp .env .env.backup.$(date +%Y%m%d)
# Backup Redis configuration
cp /etc/redis/redis.conf /backups/redis.conf.$(date +%Y%m%d)
Disaster Recovery
Recovery Steps:
- Stop Seed server
- Restore Redis from backup:
redis-cli SHUTDOWN
cp /backups/redis/dump_20250105.rdb /var/lib/redis/dump.rdb
redis-server /etc/redis/redis.conf
- Verify Redis data
- Restart Seed server
- Test connections
Troubleshooting
See Troubleshooting Guide for common issues.
Production-Specific Issues
High Memory Usage
# Check memory usage
ps aux | grep node
redis-cli INFO memory
# Enable heap snapshots
node --heapsnapshot-signal=SIGUSR2 dist/index.js
# Trigger snapshot
kill -USR2 <PID>
# Analyze with Chrome DevTools
Memory Leaks
// Enable heap profiling
import v8 from "v8";
function takeHeapSnapshot() {
const filename = `heap-${Date.now()}.heapsnapshot`;
v8.writeHeapSnapshot(filename);
console.log(`Heap snapshot written to ${filename}`);
}
// Trigger on SIGUSR2
process.on("SIGUSR2", takeHeapSnapshot);Connection Pool Exhaustion
# Check Redis connections
redis-cli CLIENT LIST
# Monitor connection count
watch -n 1 'redis-cli CLIENT LIST | wc -l'
Solutions:
- Increase pool size
- Reduce connection timeout
- Find and fix connection leaks
Slow Queries
# Enable slow log in Redis
redis-cli CONFIG SET slowlog-log-slower-than 10000 # 10ms
redis-cli CONFIG SET slowlog-max-len 128
# View slow queries
redis-cli SLOWLOG GET 10
Security Checklist
- [ ] HTTPS enabled with valid certificates
- [ ] OAuth endpoints configured correctly
- [ ] Token lifetimes set appropriately (≤1 hour)
- [ ] Redis authentication enabled
- [ ] Firewall rules configured
- [ ] Rate limiting enabled
- [ ] IP allowlisting configured (if needed)
- [ ] Security headers enabled
- [ ] Audit logging enabled
- [ ] Sensitive data sanitized from logs
- [ ] Environment variables secured
- [ ] Regular security updates applied
Performance Checklist
- [ ] JWKS caching optimized
- [ ] Redis connection pooling configured
- [ ] Response compression enabled
- [ ] Load balancing implemented
- [ ] Monitoring and alerting set up
- [ ] Log retention configured
- [ ] Backup automation configured
- [ ] Resource limits set appropriately
Monitoring Checklist
- [ ] Health check endpoint monitored
- [ ] Application metrics collected
- [ ] System metrics tracked
- [ ] Redis metrics monitored
- [ ] Authentication metrics tracked
- [ ] Alerting rules configured
- [ ] Dashboards created
- [ ] On-call rotation established
Related Documentation
- Deployment Guide - Deployment strategies
- Configuration - Environment variables
- Architecture - System design
- Troubleshooting - Common issues and solutions
- API Reference - API documentation