The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.
Updated Apr 5, 2026 - JavaScript
🦞 OpenClaw visual management panel with a built-in AI assistant (tool calling + vision + multimodal + i18n(11)) and one-click install
A simple "Be My Eyes" web app with a llama.cpp/llava backend
📱 ClawApp: mobile chat client for the OpenClaw AI Agent | Streaming chat · image sending and receiving · tool calling · PWA + APK
SGS is a user-friendly, collaborative, and versatile browser for visualizing single-cell and spatial multiomics data.
Deep Research through Multi-Agents, using GraphRAG
[ICLR 2026] DecAlign: Aligning Cross-Modal Semantics for Multimodal Foundation Models
React component library for crafting user-friendly and engaging conversational experiences
[CVPR 2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Connect WeChat to any AI Agent with one command
Build and explore multimodal web interactives with pieces of paper!
Employee Productivity GenAI Assistant Example is a code sample and architecture pattern designed to make writing tasks more efficient using AWS serverless technologies and Amazon Bedrock's generative AI models.
Gemini Vision & Image Generation MCP for Claude Desktop and Claude Code
Sample skill that demonstrates the new Alexa Presentation Language (APL). The multimodal skill works the same way as the Alexa Fact Skill template: when invoked, it selects a fact at random and tells it to the user. It is compatible with devices that have a display.
A multimodal RAG dashboard and interactive 3D Knowledge Graph. Process documents locally with Ollama or via Cloud APIs, powered by LightRAG, RAG-Anything and Neo4j.
Universal MCP server for logistics. Connect any TMS/WMS to AI agents. Shipments, carriers, tenders, tracking — all via Model Context Protocol.
Google Earth Engine tool to generate multi-modal and multi-temporal datasets, including spatially and temporally aligned Sentinel-1 SAR data, Sentinel-2 multispectral data, and weather and DEM-based data. Supplementary material for Paluba et al. 2024: "Identification of Optimal Sentinel-1 SAR Polarimetric Parameters for Forest Monitoring in Czechia".
Shows how to add semantic search to your applications. This sample uses a multimodal model to find images that are semantically similar to some text. New blog coming out soon.
OpenAI-compatible bridge that exposes Codex CLI/SDK as /v1/chat/completions with multimodal + JSON schema