# 🐳 Docker Deployment

Easy deployment with Docker Compose - just mount your models directory and go!

## Quick Start

1. Create your models directory:

   ```bash
   mkdir models
   ```

2. Download some models:

   ```bash
   # Example: Download a small model
   curl -L "https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf/resolve/main/Phi-3-mini-4k-instruct-q4.gguf" -o models/phi-3-mini.gguf
   ```

3. Start Shimmy:

   ```bash
   docker-compose up -d
   ```

4. Test the API:

   ```bash
   curl http://localhost:11434/v1/models
   ```
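
To confirm inference works end to end, you can also send a chat completion. The model name below is illustrative (it assumes Shimmy registers the downloaded file under a name like `phi-3-mini`); use whatever `GET /v1/models` reports.

```bash
# Hypothetical model name -- substitute an id returned by /v1/models
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi-3-mini",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```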

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `SHIMMY_PORT` | `11434` | Server port |
| `SHIMMY_HOST` | `0.0.0.0` | Listen address |
| `SHIMMY_BASE_GGUF` | `/app/models` | Models directory |

### Volumes

- `./models:/app/models` - Mount your local models directory
- `shimmy-cache:/root/.cache` - Persistent cache for downloads
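
If you prefer plain `docker run` over Compose, the same settings map directly to `-e`, `-v`, and `-p` flags. A minimal sketch, assuming the published image works standalone:

```bash
# Equivalent of the Compose service as a single docker run command
docker run -d --name shimmy \
  -p 11434:11434 \
  -e SHIMMY_PORT=11434 \
  -e SHIMMY_HOST=0.0.0.0 \
  -e SHIMMY_BASE_GGUF=/app/models \
  -v "$(pwd)/models:/app/models" \
  -v shimmy-cache:/root/.cache \
  ghcr.io/michael-a-kuykendall/shimmy:latest
```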

## GPU Support

For NVIDIA GPU support, ensure you have:

- NVIDIA drivers installed on the host
- The NVIDIA Container Toolkit, so Docker can pass GPUs through to containers

GPU access is automatically configured in the provided `docker-compose.yml`.
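
If you run the container manually rather than through Compose, GPUs can be requested with Docker's `--gpus` flag. A minimal sketch, assuming the NVIDIA Container Toolkit is installed:

```bash
# Expose all host GPUs to the container
docker run -d --name shimmy --gpus all \
  -p 11434:11434 \
  -v "$(pwd)/models:/app/models" \
  ghcr.io/michael-a-kuykendall/shimmy:latest
```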

## Usage Examples

### Basic Usage

```bash
# Start server
docker-compose up -d

# Check logs
docker-compose logs -f shimmy

# Stop server
docker-compose down
```

### Custom Configuration

```yaml
# docker-compose.override.yml
services:
  shimmy:
    ports:
      - "8080:11434"  # Use port 8080 instead
    environment:
      - SHIMMY_PORT=11434
      - SHIMMY_LOG_LEVEL=debug
```
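
Compose merges `docker-compose.override.yml` automatically, so restarting the stack picks up the changes; with the override above the API is reachable on host port 8080:

```bash
# Apply the override and verify the new host port
docker-compose up -d
curl http://localhost:8080/v1/models
```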

### Multiple Models

```text
# Your models directory structure
models/
├── phi-3-mini.gguf
├── llama-2-7b.gguf
└── mistral-7b.gguf
```

Shimmy will automatically discover and serve all .gguf models in the mounted directory.
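
With the layout above you can check which models were picked up. The `jq` filter assumes the standard OpenAI-style list response, so treat it as a sketch:

```bash
# List the model ids Shimmy discovered in /app/models
curl -s http://localhost:11434/v1/models | jq '.data[].id'
```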

## API Endpoints

Once running, Shimmy provides OpenAI-compatible endpoints:

- `GET /v1/models` - List available models
- `POST /v1/chat/completions` - Chat completions
- `POST /v1/completions` - Text completions
- `GET /health` - Health check
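
A plain text completion follows the same pattern as the chat example; the model name is illustrative, so substitute an id from `/v1/models`:

```bash
# Text completion against the OpenAI-compatible endpoint
curl http://localhost:11434/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi-3-mini",
    "prompt": "Docker is",
    "max_tokens": 32
  }'
```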

## Troubleshooting

### Container won't start

```bash
# Check logs
docker-compose logs shimmy

# Check if port is available
netstat -tulpn | grep 11434
```

### Models not loading

```bash
# Verify models directory is mounted
docker-compose exec shimmy ls -la /app/models

# Check file permissions
ls -la models/
```
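
If the files are readable only by another user, loosening read permissions on the host is usually enough (adjust to your own security requirements):

```bash
# Make the model files world-readable so the container can load them
chmod -R a+r models/
```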

### GPU not detected

```bash
# Check NVIDIA runtime
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

# Verify Docker Compose GPU config
docker-compose config
```

## Building from Source

To build your own image:

```bash
# Build the image
docker build -t shimmy:local .

# Use local image in docker-compose.yml
# Replace: image: ghcr.io/michael-a-kuykendall/shimmy:latest
# With:    image: shimmy:local
```
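
To try the locally built image without editing the Compose file, you can also run it directly; the flags below mirror the Compose defaults:

```bash
# Run the locally built image with the same model mount
docker run -d --name shimmy-local \
  -p 11434:11434 \
  -v "$(pwd)/models:/app/models" \
  shimmy:local
```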