Audit Logging System

Priority: 🟢 MEDIUM
Estimated Time: 20-28 hours (sum of the phase estimates under Implementation Phases)
Compliance Impact: HIGH (SOC2, HIPAA, GDPR)


Overview

Implement comprehensive audit logging for MCP tool invocations and system events to provide compliance trails, security forensics, and usage analytics.


Current Gap

Currently, Seed has:

  • ✅ Prometheus metrics for operational monitoring
  • ✅ Structured application logging
  • ❌ No audit trail for tool executions
  • ❌ No compliance-ready event records
  • ❌ No user activity history

Proposed Implementation

Core Components

1. Audit Service (src/services/audit.ts)

  • Redis-backed storage with configurable retention (default: 90 days)
  • User-isolated event storage
  • Automatic expiration via Redis TTL
  • Structured event format with metadata

2. Event Types

  • tool_invocation - MCP tool calls
  • session_created - New MCP session initialization
  • session_terminated - Session cleanup
  • auth_success - Successful authentication
  • auth_failure - Failed authentication attempts
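
The event types listed above could be pinned down as a closed union so that a typo in an event name fails at compile time. This is a minimal sketch; the names `AUDIT_EVENT_TYPES`, `AuditEventType`, and `isAuditEventType` are illustrative and not part of the existing codebase.

```typescript
// Event type names are copied from the list above; helper names are hypothetical.
const AUDIT_EVENT_TYPES = [
  "tool_invocation",
  "session_created",
  "session_terminated",
  "auth_success",
  "auth_failure",
] as const;

type AuditEventType = (typeof AUDIT_EVENT_TYPES)[number];

// Narrow an arbitrary string (for example a query parameter) to a known event type.
function isAuditEventType(value: string): value is AuditEventType {
  return (AUDIT_EVENT_TYPES as readonly string[]).includes(value);
}
```

With this in place, the `eventType` field of the event schema could be typed as `AuditEventType` instead of `string`, if a closed set is acceptable.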

3. Audit API (src/routes/audit.ts)

  • GET /audit/events - Query user's audit history
  • GET /audit/events/:eventId - Fetch specific event details
  • Authentication-protected endpoints
  • User isolation enforced
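
User isolation on `GET /audit/events/:eventId` comes down to comparing the stored event's `userId` with the authenticated caller. The sketch below shows one way to express that check as a pure function. Returning 404 (rather than 403) for another user's event is an assumption made here so that event IDs cannot be probed; `authorizeEventAccess` and the result shape are hypothetical.

```typescript
interface StoredEvent {
  id: string;
  userId: string;
}

type EventLookup =
  | { status: 200; event: StoredEvent }
  | { status: 404 };

// Return the event only when it belongs to the caller.
// A missing event and a foreign event are deliberately indistinguishable.
function authorizeEventAccess(
  event: StoredEvent | null,
  callerId: string,
): EventLookup {
  if (event === null || event.userId !== callerId) {
    return { status: 404 };
  }
  return { status: 200, event };
}
```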

4. Instrumentation

  • Route-level instrumentation in MCP handler
  • Captures tool name, arguments, results, duration
  • Records IP address, user agent, session context
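
As an illustration of route-level instrumentation, a tool handler can be wrapped so that its outcome and duration are captured without changing what the caller sees. This is a sketch under assumptions: `withAudit`, `ToolAuditRecord`, and the `emit` callback are hypothetical names, and the real handler signature in the MCP route may differ.

```typescript
interface ToolAuditRecord {
  toolName: string;
  result: "success" | "error";
  errorMessage?: string;
  duration: number; // milliseconds
}

// Run a tool handler, time it, and pass the outcome to `emit`.
// Errors are rethrown so the caller's behaviour is unchanged.
async function withAudit<T>(
  toolName: string,
  handler: () => Promise<T>,
  emit: (record: ToolAuditRecord) => void,
  now: () => number = Date.now, // injectable clock for tests
): Promise<T> {
  const start = now();
  try {
    const value = await handler();
    emit({ toolName, result: "success", duration: now() - start });
    return value;
  } catch (err) {
    emit({
      toolName,
      result: "error",
      errorMessage: err instanceof Error ? err.message : String(err),
      duration: now() - start,
    });
    throw err;
  }
}
```

Request context such as IP address, user agent, and session ID would be merged into the record by the caller before it is stored.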

Event Schema

typescript
interface AuditEvent {
  id: string;                    // UUID
  timestamp: string;             // ISO 8601
  eventType: string;             // Event category
  userId: string;                // User subject from JWT
  sessionId?: string;            // MCP session ID
  toolName?: string;             // Tool that was invoked
  toolArgs?: Record<string, unknown>;  // Tool arguments (sanitized)
  result?: 'success' | 'error'; // Invocation result
  errorMessage?: string;         // Error details if failed
  duration?: number;             // Execution time (ms)
  ipAddress?: string;            // Client IP
  userAgent?: string;            // Client user agent
  metadata?: Record<string, unknown>;  // Additional context
}
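
Because events are stored as JSON, anything read back from Redis is untyped until it is checked. The following sketch validates the required fields of the schema above before the value is trusted. `AuditEventCore` and `parseAuditEvent` are illustrative names, and the function returns only the required fields for brevity.

```typescript
// Required fields of the AuditEvent interface; optional fields are omitted here.
interface AuditEventCore {
  id: string;
  timestamp: string;
  eventType: string;
  userId: string;
}

// Parse stored JSON and validate the required fields.
// Returns null for malformed input instead of throwing.
function parseAuditEvent(json: string): AuditEventCore | null {
  let value: unknown;
  try {
    value = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;

  const { id, timestamp, eventType, userId } = value as Record<string, unknown>;
  if (typeof id !== "string" || typeof eventType !== "string" || typeof userId !== "string") {
    return null;
  }
  // The timestamp must be a string that Date.parse can interpret.
  if (typeof timestamp !== "string" || Number.isNaN(Date.parse(timestamp))) {
    return null;
  }
  return { id, timestamp, eventType, userId };
}
```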

Storage Architecture

Redis Keys

audit:{eventId}           → Full event JSON (TTL = retention period, default 90 days)
audit:user:{userId}       → Sorted set of event IDs (scored by timestamp)

Benefits:

  • Fast lookups by event ID
  • Efficient user-scoped queries
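  • Automatic cleanup of event records via TTL (members of the per-user sorted set do not expire individually, so the index needs periodic trimming)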
  • Minimal storage overhead
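
The key layout above can be captured in a few pure helpers, which keeps key construction in one place. This is a sketch: the function names are hypothetical, and using epoch milliseconds as the sorted-set score is an assumption consistent with "by timestamp". The cutoff helper yields the score below which index entries refer to expired events, which is the kind of bound a `ZREMRANGEBYSCORE` trim would use.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Key holding the full event JSON.
function eventKey(eventId: string): string {
  return `audit:${eventId}`;
}

// Key of the per-user sorted set of event IDs.
function userIndexKey(userId: string): string {
  return `audit:user:${userId}`;
}

// Sorted-set score: event time in epoch milliseconds (assumed convention).
function indexScore(isoTimestamp: string): number {
  return Date.parse(isoTimestamp);
}

// Index entries scored below this value refer to events past retention.
function indexCutoff(nowMs: number, retentionDays: number): number {
  return nowMs - retentionDays * DAY_MS;
}
```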

API Examples

Query Recent Activity

bash
curl -H "Authorization: Bearer $TOKEN" \
  "https://seed.example.com/audit/events?limit=50&offset=0"

Response

json
{
  "events": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "timestamp": "2026-01-05T18:30:00Z",
      "eventType": "tool_invocation",
      "userId": "user|auth0|12345",
      "sessionId": "mcp-session-abc123",
      "toolName": "system-status",
      "toolArgs": {},
      "result": "success",
      "duration": 234,
      "ipAddress": "203.0.113.42",
      "userAgent": "Claude Desktop/1.0.0"
    }
  ],
  "count": 1,
  "limit": 50,
  "offset": 0
}
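
The `limit` and `offset` parameters in the request above need to be parsed and bounded before they reach Redis. The sketch below shows one possible approach; the default of 50 mirrors the example request, while the cap of 200 and the names `parsePage` and `indexRange` are assumptions.

```typescript
interface Page {
  limit: number;
  offset: number;
}

const DEFAULT_LIMIT = 50; // mirrors the example request
const MAX_LIMIT = 200;    // assumed upper bound

// Parse query-string values, falling back to defaults on missing or invalid input.
function parsePage(query: Record<string, string | undefined>): Page {
  const limit = Number.parseInt(query["limit"] ?? "", 10);
  const offset = Number.parseInt(query["offset"] ?? "", 10);
  return {
    limit: Number.isNaN(limit) ? DEFAULT_LIMIT : Math.min(Math.max(limit, 1), MAX_LIMIT),
    offset: Number.isNaN(offset) ? 0 : Math.max(offset, 0),
  };
}

// Inclusive start/stop indexes for a newest-first sorted-set range query.
function indexRange(page: Page): [number, number] {
  return [page.offset, page.offset + page.limit - 1];
}
```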

Configuration

Add to .env.example:

bash
# Audit Logging
AUDIT_ENABLED=true
AUDIT_RETENTION_DAYS=90
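
These two variables might be read and validated once at startup. The following is a minimal sketch: `loadAuditConfig` is a hypothetical helper, and treating an unset `AUDIT_ENABLED` as `true` is an assumption rather than documented behaviour.

```typescript
interface AuditConfig {
  enabled: boolean;
  retentionDays: number;
  ttlSeconds: number; // value to pass as the Redis TTL
}

const SECONDS_PER_DAY = 86400;

// Read audit settings from an environment-like object and validate them.
function loadAuditConfig(env: Record<string, string | undefined>): AuditConfig {
  // Assumption: auditing defaults to enabled when the variable is unset.
  const enabled = (env["AUDIT_ENABLED"] ?? "true").toLowerCase() === "true";

  const raw = env["AUDIT_RETENTION_DAYS"] ?? "90";
  const retentionDays = Number(raw);
  if (!Number.isInteger(retentionDays) || retentionDays <= 0) {
    throw new Error(`AUDIT_RETENTION_DAYS must be a positive integer, got "${raw}"`);
  }

  return { enabled, retentionDays, ttlSeconds: retentionDays * SECONDS_PER_DAY };
}
```

In a Node.js service this would typically be called as `loadAuditConfig(process.env)`. With the default of 90 days, `ttlSeconds` is 7,776,000.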

Open Questions & Design Decisions

Before implementing this enhancement, address the following questions:

1. Access Control & Audience

Question: Who needs to view audit trails?

Options:

  • End Users Only - Users can only see their own activity

    • ✅ Privacy-friendly
    • ✅ Simple authorization model
    • ❌ No cross-user compliance reporting
  • Admin Dashboard - Privileged users can view all activity

    • ✅ Compliance investigations
    • ✅ Security incident response
    • ❌ Requires admin role system
    • ❌ Privacy implications (GDPR considerations)
  • Hybrid Approach - Users see their own, admins see all

    • ✅ Best of both worlds
    • ❌ More complex implementation

Recommendation: Start with end-user-only access. Add admin endpoints as needed based on compliance requirements.


2. User Interface

Question: How should audit trails be consumed?

Options:

  • API-Only - No built-in UI, integrate with external tools

    • ✅ Minimal scope
    • ✅ Flexible integration
    • ❌ Requires external tooling
  • Simple Web Dashboard - Basic HTML/JS interface

    • ✅ Self-contained viewing
    • ✅ Low complexity
    • ❌ Limited functionality
  • Advanced Dashboard - Rich filtering, search, export

    • ✅ Full-featured audit viewer
    • ❌ Significant development effort
    • ❌ Maintenance burden

Recommendation: Start API-only. Document integration patterns with Grafana, Kibana, or custom dashboards.


3. Data Sensitivity & Privacy

Question: How much detail should be logged?

Considerations:

  • Tool Arguments - May contain sensitive data (PII, credentials)
  • Tool Results - May contain sensitive responses
  • User Context - Email, groups, metadata

Options:

  • Full Logging - Log everything for complete audit trail

    • ✅ Maximum forensic value
    • ❌ Privacy concerns
    • ❌ Potential PII/sensitive data issues
  • Sanitized Logging - Redact/mask sensitive fields

    • ✅ Privacy-friendly
    • ✅ Compliance-ready
    • ❌ May miss important context
  • Configurable Logging - Admin-controlled sanitization rules

    • ✅ Flexible per-deployment
    • ❌ Configuration complexity

Recommendation: Default to sanitized logging with configurable exclusion patterns. Document sensitive field handling.
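
To make the sanitized-logging recommendation concrete, here is a sketch of a recursive redactor that masks values whose key matches a configurable pattern. The default patterns and the `[REDACTED]` placeholder are illustrative assumptions, not a vetted list. The function matches on key names only, so sensitive data embedded in free-text values is not caught, and circular structures are not handled.

```typescript
const REDACTED = "[REDACTED]";

// Hypothetical default patterns; a deployment would supply its own.
const DEFAULT_SENSITIVE_KEYS: RegExp[] = [
  /pass(word)?/i,
  /secret/i,
  /token/i,
  /api[-_]?key/i,
  /authorization/i,
];

// Return a deep copy of `value` with sensitive keys masked.
function sanitize(value: unknown, patterns: RegExp[] = DEFAULT_SENSITIVE_KEYS): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => sanitize(item, patterns));
  }
  if (typeof value === "object" && value !== null) {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = patterns.some((p) => p.test(key))
        ? REDACTED
        : sanitize(inner, patterns);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

Applied to `toolArgs` before an event is stored, this keeps the argument structure visible for forensics while dropping the values most likely to be secrets.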


4. Real-Time vs Historical Access

Question: Is real-time monitoring required?

Options:

  • Historical Query Only - REST API for past events

    • ✅ Simple implementation
    • ✅ Meets compliance needs
    • ❌ No real-time alerting
  • Real-Time Streaming - WebSocket/SSE for live events

    • ✅ Real-time security monitoring
    • ✅ Live dashboards
    • ❌ More complex infrastructure
    • ❌ Scalability considerations

Recommendation: Start with historical queries only. Add streaming if real-time monitoring becomes a requirement.


5. Export & Reporting

Question: How should audit data be exported for compliance?

Options:

  • No Export - Query via API only

    • ✅ Minimal implementation
    • ❌ Manual compliance reporting
  • CSV/JSON Export - Download audit logs for period

    • ✅ Standard format
    • ✅ Excel-compatible
    • ❌ Limited to smaller datasets
  • SIEM Integration - Push events to Splunk, ELK, etc.

    • ✅ Enterprise-ready
    • ✅ Advanced analytics
    • ❌ Requires additional infrastructure
  • Scheduled Reports - Automated periodic summaries

    • ✅ Compliance-friendly
    • ❌ Email/storage complexity

Recommendation: Start with a CSV/JSON export endpoint. Document SIEM integration patterns using structured logs.
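
A CSV export mostly hinges on correct quoting of cells that contain commas, quotes, or line breaks. The sketch below follows the usual RFC 4180 conventions; `csvCell` and `toCsv` are hypothetical helpers, nested objects are serialized as JSON, and spreadsheet formula injection is not addressed.

```typescript
type CsvRow = Record<string, unknown>;

// Render one cell, quoting it when it contains a comma, quote, or line break.
function csvCell(value: unknown): string {
  if (value === undefined || value === null) return "";
  const text = typeof value === "object" ? JSON.stringify(value) : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Render rows as CSV with a header line, using the given column order.
function toCsv(rows: CsvRow[], columns: string[]): string {
  const header = columns.join(",");
  const body = rows.map((row) => columns.map((c) => csvCell(row[c])).join(","));
  return [header, ...body].join("\r\n");
}
```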


6. Retention & Compliance

Question: How long should audit data be retained?

Considerations:

  • SOC2 - Typically 1 year minimum
  • HIPAA - 6 years for covered entities
  • GDPR - Minimal necessary retention
  • Storage Cost - Redis memory usage

Options:

  • Fixed Retention - Hardcoded 90 days

    • ✅ Simple implementation
    • ❌ May not meet all compliance needs
  • Configurable Retention - Environment variable control

    • ✅ Flexible per-deployment
    • ✅ Compliance-adaptable
    • ❌ User must understand requirements
  • Tiered Storage - Hot (Redis) + Cold (S3/disk)

    • ✅ Cost-effective long-term storage
    • ✅ Meets extended compliance needs
    • ❌ Complex implementation

Recommendation: Start with configurable retention (env var). Document compliance mapping for common frameworks.


7. Performance & Scalability

Question: How to minimize performance impact?

Considerations:

  • Audit logging adds latency to every tool invocation
  • Redis operations on critical path
  • High-throughput deployments

Options:

  • Synchronous Logging - Block until audit recorded

    • ✅ Guaranteed audit trail
    • ❌ Adds latency (~5-10ms)
  • Asynchronous Logging - Fire-and-forget

    • ✅ No user-facing latency
    • ❌ Risk of lost events on crash
  • Batched Logging - Buffer and flush periodically

    • ✅ High throughput
    • ❌ Delayed audit availability
    • ❌ Risk of lost events

Recommendation: Start with synchronous logging with Redis pipelining to minimize latency. Add async option if performance becomes an issue.
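
A pipelined synchronous write sends both Redis commands in a single round trip and lets the caller await the result before responding. The sketch below is written against a minimal interface shaped like the chained pipeline API offered by clients such as ioredis; `AuditPipeline`, `AuditRedis`, and `recordEvent` are assumptions for illustration, not the project's actual client wrapper.

```typescript
// Minimal pipeline surface assumed for this sketch.
interface AuditPipeline {
  set(key: string, value: string, mode: "EX", seconds: number): AuditPipeline;
  zadd(key: string, score: number, member: string): AuditPipeline;
  exec(): Promise<unknown>;
}

interface AuditRedis {
  pipeline(): AuditPipeline;
}

interface RecordableEvent {
  id: string;
  userId: string;
  timestamp: string; // ISO 8601
}

// Queue both writes and send them in one round trip.
// Awaiting the returned promise makes the write synchronous from the caller's view.
function recordEvent(
  redis: AuditRedis,
  event: RecordableEvent,
  ttlSeconds: number,
): Promise<unknown> {
  return redis
    .pipeline()
    .set(`audit:${event.id}`, JSON.stringify(event), "EX", ttlSeconds)
    .zadd(`audit:user:${event.userId}`, Date.parse(event.timestamp), event.id)
    .exec();
}
```

A pipeline batches commands but is not a transaction, so a caller that needs a guaranteed trail should inspect the result of `exec()` for per-command failures rather than assume both writes succeeded.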


Implementation Phases

Note: Redis is included in the local development environment when using ./scripts/local (part of the Docker stack).

Phase 1: Core Audit Service (8-10 hours)

  • [ ] Create src/services/audit.ts
  • [ ] Implement Redis storage with TTL
  • [ ] Add structured logging integration
  • [ ] Write comprehensive unit tests
  • [ ] Document event schema

Phase 2: API Endpoints (4-6 hours)

  • [ ] Create src/routes/audit.ts
  • [ ] Implement user-scoped event queries
  • [ ] Add pagination support
  • [ ] Write API integration tests
  • [ ] Document API endpoints

Phase 3: MCP Instrumentation (4-6 hours)

  • [ ] Instrument /mcp route handler
  • [ ] Capture tool invocations
  • [ ] Handle auth success/failure events
  • [ ] Add session lifecycle events
  • [ ] Performance testing

Phase 4: Configuration & Documentation (4-6 hours)

  • [ ] Add AUDIT_ENABLED and AUDIT_RETENTION_DAYS config
  • [ ] Update .env.example
  • [ ] Write architecture documentation
  • [ ] Add compliance mapping guide
  • [ ] Create integration examples (Splunk, ELK, etc.)

Acceptance Criteria

  • [ ] All tool invocations audited
  • [ ] Audit events queryable via API
  • [ ] User isolation enforced (users only see their own events)
  • [ ] Configurable retention period
  • [ ] Performance overhead < 10ms per request
  • [ ] Comprehensive test coverage
  • [ ] Documentation includes compliance guidance

Future Enhancements

  • Admin Dashboard - Web UI for viewing audit logs across users
  • Advanced Search - Full-text search across audit events
  • Real-Time Alerts - WebSocket streaming for security monitoring
  • SIEM Connectors - Pre-built integrations with Splunk, Datadog, etc.
  • Compliance Reports - Automated SOC2/HIPAA report generation

Released under the MIT License.