Osprey is a lightweight Bash library for interacting with the DMR (Docker Model Runner) API. It provides simple functions for chat completions, streaming responses, and conversation memory management with LLM models through OpenAI-compatible APIs.
- Chat Completions: Send messages to LLM models and receive responses
- Streaming Support: Real-time streaming responses for interactive applications
- Conversation Memory: Built-in functions to manage chat history and context
- Simple Integration: Easy-to-use Bash functions that work with any OpenAI-compatible API
Osprey relies on a few common command-line tools:

- jq: A lightweight and flexible command-line JSON processor.
- curl: A command-line tool for transferring data with URLs.
- bash: A Unix shell and command language.
- gum: A tool for creating interactive command-line applications.
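Before sourcing the library, you can fail fast if one of these tools is missing. This is just a convenience sketch, not part of Osprey itself (bash is assumed to be the shell already running the script):

# Check that the required command-line tools are on the PATH.
for cmd in jq curl gum; do
  command -v "$cmd" >/dev/null 2>&1 || { echo "missing dependency: $cmd" >&2; exit 1; }
done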
Download the library and make it executable:

curl -fsSL https://github.com/k33g/osprey/releases/download/v0.0.6/osprey.sh -o ./osprey.sh
chmod +x ./osprey.sh

Source the library in your script:
. "./osprey.sh"DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="ai/qwen2.5:latest"
read -r -d '' DATA <<- EOM
{
"model":"'${MODEL}'",
"messages": [
{"role":"user", "content": "Hello, how are you?"}
],
"stream": false
}
EOM
response=$(osprey_chat ${DMR_BASE_URL} "${DATA}")
echo "${response}"DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="ai/qwen2.5:latest"
read -r -d '' DATA <<- EOM
{
"model":"'${MODEL}'",
"messages": [
{"role":"user", "content": "Hello, how are you?"}
],
"stream": true
}
EOM
function callback() {
echo -ne "$1"
}
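# The callback receives each streamed content chunk as its first argument, so it
# can print tokens as they arrive (or accumulate them, as the agent example later does).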
osprey_chat_stream ${DMR_BASE_URL} "${DATA}" callback

Function calling (tool calls) example:

DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="hf.co/salesforce/xlam-2-3b-fc-r-gguf:q4_k_s"
# Define your tools in JSON format
read -r -d '' TOOLS <<- EOM
[
{
"type": "function",
"function": {
"name": "calculate_sum",
"description": "Calculate the sum of two numbers",
"parameters": {
"type": "object",
"properties": {
"a": {"type": "number", "description": "The first number"},
"b": {"type": "number", "description": "The second number"}
},
"required": ["a", "b"]
}
}
}
]
EOM
read -r -d '' DATA <<- EOM
{
"model":"'${MODEL}'",
"messages": [
{"role":"user", "content": "Calculate the sum of 5 and 10"}
],
"tools": '${TOOLS}',
"tool_choice": "auto"
}
EOM
# Make the function call request
response=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
# Extract and process tool calls
TOOL_CALLS=$(get_tool_calls "${response}")
for tool_call in $TOOL_CALLS; do
FUNCTION_NAME=$(get_function_name "$tool_call")
FUNCTION_ARGS=$(get_function_args "$tool_call")
CALL_ID=$(get_call_id "$tool_call")
# Execute your function logic here
case "$FUNCTION_NAME" in
"calculate_sum")
A=$(echo "$FUNCTION_ARGS" | jq -r '.a')
B=$(echo "$FUNCTION_ARGS" | jq -r '.b')
SUM=$((A + B))
echo "Result: $SUM"
;;
esac
done

Note on Parallel Tool Calls: The parallel_tool_calls parameter enables models to make multiple function calls simultaneously. However, only a few local models support this feature effectively:
- hf.co/salesforce/llama-xlam-2-8b-fc-r-gguf:q4_k_m
- hf.co/salesforce/xlam-2-3b-fc-r-gguf:q4_k_m
- hf.co/salesforce/xlam-2-3b-fc-r-gguf:q4_k_s
- hf.co/salesforce/xlam-2-3b-fc-r-gguf:q3_k_l
Example with parallel tool calls:
read -r -d '' DATA <<- EOM
{
"model": "${MODEL}",
"options": {
"temperature": 0.0
},
"messages": [
{
"role": "user",
"content": "Say hello to Bob and to Sam, make the sum of 5 and 37"
}
],
"tools": ${TOOLS},
"parallel_tool_calls": true,
"tool_choice": "auto"
}
EOM

See the examples/ directory for more detailed usage examples, including conversation memory management.
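To let the model turn a tool result into a final natural-language answer, the result has to be sent back in a follow-up request. Here is a minimal sketch building on the calculate_sum example above; it assumes the standard OpenAI-compatible tool-message format (some servers also expect the preceding assistant message carrying the tool_calls), and uses only osprey_chat from the library plus jq:

# Hypothetical follow-up: hand the computed SUM back to the model as a "tool"
# message so it can phrase a final answer. CALL_ID and SUM come from the loop
# above; exact message requirements can vary between servers.
TOOL_MESSAGE=$(jq -cn --arg id "${CALL_ID}" --arg content "${SUM}" \
  '{role: "tool", tool_call_id: $id, content: $content}')
read -r -d '' FOLLOW_UP <<- EOM
{
  "model": "${MODEL}",
  "messages": [
    {"role": "user", "content": "Calculate the sum of 5 and 10"},
    ${TOOL_MESSAGE}
  ],
  "stream": false
}
EOM
response=$(osprey_chat ${DMR_BASE_URL} "${FOLLOW_UP}")
echo "${response}"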
Osprey supports Model Context Protocol (MCP) servers with STDIO transport for extended function calling capabilities. You can use custom MCP servers that communicate via standard input/output to provide additional tools and functionalities.
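Under the hood, a STDIO MCP server speaks newline-delimited JSON-RPC over stdin/stdout. Purely as an illustration (Osprey's get_mcp_tools handles this exchange for you), you can probe a server by hand once the demo image from the next step is built:

# Illustrative manual probe: initialize the MCP session, acknowledge it, then
# request the tool list. The protocolVersion and clientInfo values are example
# values for this sketch.
{
  echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"probe","version":"0.0.0"}}}'
  echo '{"jsonrpc":"2.0","method":"notifications/initialized"}'
  echo '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
} | docker run --rm -i osprey-mcp-server:demo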
First, build your MCP server Docker image:
cd examples/07-use-mcp/mcp-server
docker build -t osprey-mcp-server:demo .

Then use the MCP server from your script:

#!/bin/bash
. "./osprey.sh"
DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="hf.co/salesforce/xlam-2-3b-fc-r-gguf:q4_k_s"
# Define the MCP server command
SERVER_CMD="docker run --rm -i osprey-mcp-server:demo"
# Get available tools from MCP server
MCP_TOOLS=$(get_mcp_tools "$SERVER_CMD")
TOOLS=$(transform_to_openai_format "$MCP_TOOLS")
read -r -d '' DATA <<- EOM
{
"model": "${MODEL}",
"options": {
"temperature": 0.0
},
"messages": [
{
"role": "user",
"content": "Say hello to Bob and calculate the sum of 5 and 37"
}
],
"tools": ${TOOLS},
"parallel_tool_calls": true,
"tool_choice": "auto"
}
EOM
# Make function call request
RESULT=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
TOOL_CALLS=$(get_tool_calls "${RESULT}")
# Process tool calls
for tool_call in $TOOL_CALLS; do
FUNCTION_NAME=$(get_function_name "$tool_call")
FUNCTION_ARGS=$(get_function_args "$tool_call")
# Execute function via MCP
MCP_RESPONSE=$(call_mcp_tool "$SERVER_CMD" "$FUNCTION_NAME" "$FUNCTION_ARGS")
RESULT_CONTENT=$(get_tool_content "$MCP_RESPONSE")
echo "Function result: $RESULT_CONTENT"
done

Osprey supports MCP servers with streamable HTTP transport for real-time tool execution and response streaming. This allows for more interactive experiences with MCP tools that can provide streaming responses.
First, build your streamable HTTP MCP server Docker image:
cd examples/10-use-streamable-mcp/mcp-server
docker build -t osprey-streamable-mcp-server:demo .

Start the server:
docker run --rm -p 8080:8080 osprey-streamable-mcp-server:demo

Then call it from your script:

#!/bin/bash
. "./osprey.sh"
DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="hf.co/salesforce/xlam-2-3b-fc-r-gguf:q4_k_s"
# Define the streamable HTTP MCP server endpoint
MCP_SERVER="http://localhost:9090"
# Get available tools from streamable MCP server
MCP_TOOLS=$(get_mcp_http_tools "$MCP_SERVER")
TOOLS=$(transform_to_openai_format "$MCP_TOOLS")
read -r -d '' DATA <<- EOM
{
"model": "${MODEL}",
"options": {
"temperature": 0.0
},
"messages": [
{
"role": "user",
"content": "Say hello to Bob and to Sam, make the sum of 5 and 37"
}
],
"tools": ${TOOLS},
"parallel_tool_calls": true,
"tool_choice": "auto"
}
EOM
# Make function call request
RESULT=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
TOOL_CALLS=$(get_tool_calls "${RESULT}")
# Process tool calls with streaming support
for tool_call in $TOOL_CALLS; do
FUNCTION_NAME=$(get_function_name "$tool_call")
FUNCTION_ARGS=$(get_function_args "$tool_call")
# Execute function via MCP
MCP_RESPONSE=$(call_mcp_http_tool "$MCP_SERVER" "$FUNCTION_NAME" "$FUNCTION_ARGS")
RESULT_CONTENT=$(get_tool_content_http "$MCP_RESPONSE")
echo "Function result: $RESULT_CONTENT"
done

The streamable HTTP transport has a few advantages over STDIO:

- HTTP Standards: Leverages standard HTTP streaming protocols
- Scalability: Easier to deploy and scale than STDIO servers
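Whichever transport you use, it can be useful to double-check which tools the model will actually receive. A small sketch, assuming TOOLS holds the OpenAI-format array produced by transform_to_openai_format (or its filtered variant):

# Print the names of the tools that will be attached to the request.
echo "${TOOLS}" | jq -r '.[].function.name'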
The Docker MCP Gateway provides access to a collection of pre-built MCP tools through Docker's MCP integration. This allows you to leverage existing MCP tools without setting up individual servers.
#!/bin/bash
. "./osprey.sh"
DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="hf.co/salesforce/xlam-2-3b-fc-r-gguf:q4_k_s"
# Use Docker MCP Gateway
SERVER_CMD="docker mcp gateway run"
# Get available tools and filter specific ones
MCP_TOOLS=$(get_mcp_tools "$SERVER_CMD")
TOOLS=$(transform_to_openai_format_with_filter "${MCP_TOOLS}" "search" "fetch")
read -r -d '' DATA <<- EOM
{
"model": "${MODEL}",
"options": {
"temperature": 0.0
},
"messages": [
{
"role": "user",
"content": "fetch https://raw.githubusercontent.com/k33g/osprey/refs/heads/main/README.md"
}
],
"tools": ${TOOLS},
"tool_choice": "auto"
}
EOM
# Execute the request
RESULT=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
TOOL_CALLS=$(get_tool_calls "${RESULT}")
# Process tool calls
for tool_call in $TOOL_CALLS; do
FUNCTION_NAME=$(get_function_name "$tool_call")
FUNCTION_ARGS=$(get_function_args "$tool_call")
# Execute function via MCP Gateway
MCP_RESPONSE=$(call_mcp_tool "$SERVER_CMD" "$FUNCTION_NAME" "$FUNCTION_ARGS")
RESULT_CONTENT=$(get_tool_content "$MCP_RESPONSE")
echo "Function result: $RESULT_CONTENT"
done

You can filter available tools using the transform_to_openai_format_with_filter function to only include tools that match specific criteria:
# Filter tools containing "search" or "fetch"
TOOLS=$(transform_to_openai_format_with_filter "${MCP_TOOLS}" "search" "fetch")

You can create containerized AI agents using Docker Compose for easy deployment and management. The examples/05-compose-agent/ directory demonstrates how to build a complete agentic system. To try it:
cd examples/05-compose-agent/
docker compose up --build -d
docker attach $(docker compose ps -q seven-of-nine-agent)

The agentic compose setup includes:
- Containerized Environment: Complete isolation with all dependencies
- Interactive Interface: Uses gum for enhanced command-line interactions
- Conversation Memory: Persistent chat history throughout sessions
- Streaming Responses: Real-time token generation
- Character Personas: Configurable system instructions for roleplay
Configure your agent through compose.yml:
services:
  your-agent:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        - OSPREY_VERSION=v0.0.1
    tty: true
    stdin_open: true
    environment:
      SYSTEM_INSTRUCTION: |
        You are a helpful AI assistant.
        Your role is to...
    models:
      chat_model:
        endpoint_var: MODEL_RUNNER_BASE_URL
        model_var: MODEL_RUNNER_CHAT_MODEL

models:
  chat_model:
    model: ai/qwen2.5:latest

The agent script sources Osprey and combines its conversation memory helpers with streaming:

#!/bin/bash
. "./osprey.sh"
# Initialize conversation history array
CONVERSATION_HISTORY=()
function callback() {
echo -ne "$1"
ASSISTANT_RESPONSE+="$1"
}
while true; do
USER_CONTENT=$(gum write --placeholder "How can I help you?")
if [[ "$USER_CONTENT" == "/bye" ]]; then
break
fi
# Add user message to conversation history
add_user_message CONVERSATION_HISTORY "${USER_CONTENT}"
# Build messages array with conversation history
MESSAGES=$(build_messages_array CONVERSATION_HISTORY)
# Create API request with conversation history
read -r -d '' DATA <<- EOM
{
"model":"${MODEL}",
"options": {
"temperature": 0.5,
"repeat_last_n": 2
},
"messages": [${MESSAGES}],
"stream": true
}
EOM
ASSISTANT_RESPONSE=""
osprey_chat_stream ${DMR_BASE_URL} "${DATA}" callback
# Add assistant response to conversation history
add_assistant_message CONVERSATION_HISTORY "${ASSISTANT_RESPONSE}"
echo -e "\n"
done

This creates a fully interactive, containerized AI agent with conversation memory and streaming responses.
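When you are finished, detaching and tearing down the stack uses standard Compose commands (nothing Osprey-specific):

# Detach from the interactive session with Ctrl-p Ctrl-q, then remove the stack:
docker compose down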