-
-
Notifications
You must be signed in to change notification settings - Fork 329
Home
Michael A. Kuykendall edited this page Dec 16, 2025
·
4 revisions
Welcome to the Shimmy documentation wiki - your comprehensive guide to the 5MB Ollama alternative.
- Installation Guide - Install shimmy on all platforms
- Quick Start - Get running in 5 minutes
- Configuration - Configure shimmy for your needs
- GPU Support - CUDA, Metal, and MOE hybrid acceleration
- MOE Support - Mixture of Experts CPU/GPU offloading (documentation coming soon)
- Model Filtering - Smart LLM-only model discovery (documentation coming soon)
- OpenAI API - Full compatibility with existing tools (documentation coming soon)
- Shimmy Vision - Local AI vision API overview
- Quick Start - Get Vision running in 5 minutes
- API Reference - Complete endpoint documentation
- Analysis Modes - OCR, Layout, Web, Brief, Full
- Troubleshooting - Common issues and solutions
- Building from Source - Compile shimmy yourself
- Contributing - How to contribute to shimmy
- Architecture - Technical architecture overview
- Release Process - How releases are created
- Common Issues - Solutions to common problems
- Performance Tuning - Optimize shimmy performance
- Model Loading Issues - Fix model discovery problems
Shimmy is a lightweight Ollama alternative that provides:
- ✅ OpenAI API compatibility - drop-in replacement for existing tools
- ✅ MOE CPU offloading - run 70B+ models on consumer hardware
- ✅ Smart model filtering - automatically excludes non-LLM models
- ✅ Multi-GPU support - CUDA, Metal, Vulkan, OpenCL acceleration
- ✅ Release gate quality - 6-gate validation ensures reliability
- ✅ Cross-platform - Windows, macOS, Linux binaries
- ✅ Lightweight - sub-10MB binary vs 500MB+ alternatives
- ✅ Vision API - AI-powered OCR, layout analysis, web scraping
# Linux/macOS
curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/latest/download/shimmy -o shimmy
chmod +x shimmy
# Windows
curl -L https://github.com/Michael-A-Kuykendall/shimmy/releases/latest/download/shimmy.exe -o shimmy.exe
# With Rust/Cargo (recommended - includes MOE)
cargo install shimmy --features moe
# CUDA + MOE for NVIDIA GPUs
cargo install shimmy --features llama-cuda,moe📝 This wiki is automatically maintained and updated with each release.