
sampling-demo

Purpose

Demonstrate MCP sampling protocol

Description

The sampling-demo tool demonstrates how MCP servers can request LLM operations from clients. This is the "reverse" of normal tool calling: instead of the client asking the server to do something, the server asks the client to perform an LLM task.

This powerful feature enables:

  • Content summarization by the server
  • Text analysis (sentiment, entities, etc.)
  • Content generation for dynamic responses
  • Format transformation (JSON to prose, etc.)

The server sends a prompt to the client, and the client uses its LLM (like Claude) to process the request and return the result.
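This request/response flow can be sketched in TypeScript. The request shape follows the MCP sampling schema; the helper name `buildSamplingRequest` is illustrative, not an SDK export:

```typescript
// Illustrative helper: builds a sampling/createMessage request payload.
// The payload shape follows the MCP sampling schema; buildSamplingRequest
// itself is a hypothetical name, not part of the MCP SDK.
interface SamplingRequest {
  method: "sampling/createMessage";
  params: {
    messages: {
      role: "user" | "assistant";
      content: { type: "text"; text: string };
    }[];
    modelPreferences?: { hints?: { name: string }[] };
    maxTokens: number;
  };
}

function buildSamplingRequest(prompt: string, model?: string): SamplingRequest {
  return {
    method: "sampling/createMessage",
    params: {
      messages: [{ role: "user", content: { type: "text", text: prompt } }],
      ...(model ? { modelPreferences: { hints: [{ name: model }] } } : {}),
      maxTokens: 500, // the spec requires an upper bound on generated tokens
    },
  };
}

const req = buildSamplingRequest("Summarize this text: ...", "claude-3-opus");
console.log(req.params.messages[0].content.text); // the prompt the client will see
```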

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| task | enum | No | Task type: summarize, analyze, generate, transform (default: summarize) |
| input | string | No | Text to process (default: example text) |
| modelPreference | string | No | Preferred model to use (e.g., "claude-3-opus") |

Example Usage

Ask Claude:

"Show me how the sampling protocol works with a summarization task"
"Demonstrate text analysis using MCP sampling"
"Use sampling-demo to generate content"
"How does reverse LLM calling work in MCP?"

Response

# MCP Sampling Protocol Demo

## Task: SUMMARIZE
**Instruction:** Summarize the input text into key points
**Input:** [example text shown]

## Example Sampling Request

The server would send this request to the client:
{
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize this text: ..."
        }
      }
    ],
    "modelPreferences": {
      "hints": [
        { "name": "claude-3-opus" }
      ]
    },
    "maxTokens": 500
  }
}

The client performs the LLM operation and returns the result.
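The client's reply carries the model it actually used, the assistant message, and a stop reason. The field names below follow the MCP sampling schema; the values are made up for illustration:

```typescript
// Illustrative sampling/createMessage result, as a client would return it.
// Field names follow the MCP sampling schema; the values are invented.
const sampleResult = {
  model: "claude-3-opus", // the model the client actually used
  role: "assistant" as const,
  content: { type: "text" as const, text: "Key points: ..." },
  stopReason: "endTurn", // why generation stopped
};

console.log(sampleResult.content.text);
```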

Task Types

summarize

Condenses text into key points or a shorter form.

Use cases:

  • Summarizing long documents
  • Creating bullet-point summaries
  • Extracting main ideas

Example:

"Use sampling to summarize this article"

analyze

Performs analysis on text content.

Use cases:

  • Sentiment analysis
  • Entity extraction
  • Topic classification
  • Intent detection

Example:

"Analyze the sentiment of this customer feedback"

generate

Creates new content based on input or instructions.

Use cases:

  • Writing descriptions
  • Generating examples
  • Creating variations
  • Completing partial text

Example:

"Generate a product description based on these features"

transform

Converts text from one format to another.

Use cases:

  • JSON to prose
  • Structured data to narrative
  • Code to documentation
  • Format conversion

Example:

"Transform this JSON data into a readable paragraph"
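The four task types reduce to choosing a prompt template before issuing the sampling request. The template wording below is an assumption about what the demo server might send, not its actual prompts:

```typescript
type Task = "summarize" | "analyze" | "generate" | "transform";

// Illustrative prompt templates for the four task types; the exact
// wording the demo server uses is an assumption.
const templates: Record<Task, (input: string) => string> = {
  summarize: (t) => `Summarize this text into key points: ${t}`,
  analyze: (t) => `Analyze the sentiment, entities, and topics of: ${t}`,
  generate: (t) => `Generate new content based on: ${t}`,
  transform: (t) => `Transform this into readable prose: ${t}`,
};

// Defaults mirror the parameter table: task defaults to "summarize",
// input defaults to example text.
function promptFor(task: Task = "summarize", input = "example text"): string {
  return templates[task](input);
}

console.log(promptFor("transform", '{"name":"Widget"}'));
```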

Authentication

Required: No

The sampling-demo tool is a demonstration tool and does not require authentication.

How MCP Sampling Works

The sampling protocol enables server-initiated LLM requests:

  1. Server needs LLM capability (summarization, analysis, etc.)
  2. Server sends sampling request to client with prompt and preferences
  3. Client invokes LLM (Claude or other model)
  4. Client returns LLM response to server
  5. Server uses result in its logic or returns it to user

This allows servers without direct LLM access to leverage the client's LLM capabilities.
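The five steps above can be sketched end to end with a mocked client. `clientHandleSampling` stands in for a real MCP client, and its LLM call is faked; in practice the request travels over the MCP connection:

```typescript
type Message = {
  role: "user" | "assistant";
  content: { type: "text"; text: string };
};

// Steps 3-4: the client receives the sampling request, invokes its LLM
// (mocked here), and returns the result.
async function clientHandleSampling(messages: Message[]): Promise<string> {
  const prompt = messages[0].content.text;
  return `Summary of: ${prompt.slice(0, 30)}...`; // fake LLM output
}

// Steps 1-2 and 5: the server builds the request, sends it to the client,
// and uses the returned text in its own response.
async function serverSummarize(document: string): Promise<string> {
  const messages: Message[] = [
    { role: "user", content: { type: "text", text: `Summarize this text: ${document}` } },
  ];
  // Normally sent over the MCP connection as sampling/createMessage.
  return clientHandleSampling(messages);
}

serverSummarize("A long document...").then(console.log);
```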

Sampling Request Structure

typescript
{
  method: "sampling/createMessage",
  params: {
    messages: [
      {
        role: "user" | "assistant",
        content: {
          type: "text",
          text: "Prompt text..."
        }
      }
    ],
    modelPreferences?: {
      hints: [
        {
          name: "claude-3-opus" // Preferred model
        }
      ],
      costPriority?: number,     // 0-1, prefer cheaper models
      speedPriority?: number,    // 0-1, prefer faster models
      intelligencePriority?: number  // 0-1, prefer more capable models
    },
    systemPrompt?: string,
    includeContext?: "none" | "thisServer" | "allServers",
    temperature?: number,
    maxTokens: number,           // required: upper bound on generated tokens
    stopSequences?: string[],
    metadata?: Record<string, any>
  }
}

Model Preferences

Servers can specify preferences for which model to use:

By Name

typescript
{
  hints: [{ name: "claude-3-opus" }]
}

By Priority

typescript
{
  costPriority: 0.8,        // Prefer cheaper models
  speedPriority: 0.5,       // Medium speed preference
  intelligencePriority: 0.3 // Less emphasis on capability
}

The client chooses the best available model based on these preferences.
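One way a client might choose is a weighted score over its available models. The candidate table and the weighted-sum rule below are assumptions; MCP leaves the actual selection strategy up to the client:

```typescript
// Illustrative model selection: score candidates by the server's priorities.
// The candidate attributes (0-1 scales) and the scoring rule are assumptions,
// not part of the MCP spec.
interface Candidate {
  name: string;
  cheapness: number;
  speed: number;
  intelligence: number;
}

interface Priorities {
  costPriority?: number;
  speedPriority?: number;
  intelligencePriority?: number;
}

const candidates: Candidate[] = [
  { name: "small-fast-model", cheapness: 0.9, speed: 0.9, intelligence: 0.4 },
  { name: "large-smart-model", cheapness: 0.2, speed: 0.3, intelligence: 0.9 },
];

function pickModel(p: Priorities): string {
  const score = (c: Candidate) =>
    (p.costPriority ?? 0) * c.cheapness +
    (p.speedPriority ?? 0) * c.speed +
    (p.intelligencePriority ?? 0) * c.intelligence;
  return candidates.reduce((best, c) => (score(c) > score(best) ? c : best)).name;
}

console.log(pickModel({ costPriority: 0.8, speedPriority: 0.5, intelligencePriority: 0.3 }));
```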

Common Use Cases

Dynamic Content Generation

Server generates personalized responses using sampling
→ Creates context-aware messages for each user

Content Analysis

Server analyzes user input for intent or sentiment
→ Routes requests based on analysis results

Batch Processing

Server processes multiple items through LLM
→ Summarizes, categorizes, or transforms data

Smart Caching

Server uses sampling for expensive operations
→ Caches results to reduce future LLM calls

Real-World Examples

Automated Summarization

MCP server receives long document
→ Requests summarization via sampling
→ Returns concise summary to user
→ User gets instant summary without manual request

Intelligent Routing

User sends message to MCP server
→ Server uses sampling to analyze intent
→ Routes to appropriate backend service
→ Provides smart request handling

Content Enhancement

Server has partial data (JSON)
→ Uses sampling to generate descriptions
→ Returns enriched content to client
→ Better user experience with minimal data

Context Inclusion

The includeContext parameter controls what context the client includes:

| Value | Description |
| --- | --- |
| none | No additional context (default) |
| thisServer | Include context from this MCP server only |
| allServers | Include context from all connected MCP servers |

This allows the LLM to make informed decisions based on the current conversation context.
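Setting the parameter is straightforward; the sketch below restricts it to the three values the table defines (`withContext` is an illustrative helper, not an SDK function):

```typescript
// Illustrative: constrain includeContext to the three values MCP defines.
type IncludeContext = "none" | "thisServer" | "allServers";

function withContext(prompt: string, includeContext: IncludeContext = "none") {
  return {
    method: "sampling/createMessage" as const,
    params: {
      messages: [
        { role: "user" as const, content: { type: "text" as const, text: prompt } },
      ],
      includeContext,
      maxTokens: 200,
    },
  };
}

console.log(withContext("Route this request", "thisServer").params.includeContext);
```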


Released under the MIT License.