multimodal

Here are 72 public repositories matching this topic...

Mintplex-Labs / anything-llm

The all-in-one AI productivity accelerator. On-device and privacy-first, with no annoying setup or configuration.

mcp web-scraping no-code ai-agents kimi multimodal rag moonshot vector-database llm localai local-llm ollama lmstudio deepseek llama3 custom-ai-agents mcp-servers qwen3

Updated Apr 5, 2026
JavaScript

qingchencloud / clawpanel

🦞 Visual management panel for OpenClaw with a built-in AI assistant (tool calling + image recognition + multimodal + i18n(11)) and one-click installation

Updated Apr 5, 2026
JavaScript

lxe / llavavision

A simple "Be My Eyes" web app with a llama.cpp/llava backend

machine-learning ai computer-vision artificial-intelligence webapp llama multimodal llm llamacpp local-llm

Updated Nov 28, 2023
JavaScript

qingchencloud / clawapp

📱 ClawApp: mobile chat client for the OpenClaw AI Agent | streaming chat · image sending and receiving · tool calling · PWA + APK

android i18n markdown streaming pwa websocket self-hosted chinese capacitor chat-client dark-mode h5 ai-agents multimodal voice-input ai-assistant tool-calling openclaw mobile-chat

Updated Mar 24, 2026
JavaScript

fanglu0411 / sgs

SGS is a user-friendly, collaborative, and versatile browser for visualizing single-cell and spatial multiomics data.

visualization single-cell genome-browser sgs multimodal scrna scatac zarr anndata mudata spatial-omics sceqtl scmethylc schic

Updated Mar 4, 2025
JavaScript

aymenfurter / smartrag

Deep Research through Multi-Agents, using GraphRAG

azure openai multi-agent-systems autogen multimodal voice-mode llm graphrag gpt-4o deep-research

Updated Aug 21, 2025
JavaScript

taco-group / DecAlign

[ICLR 2026] DecAlign: Aligning Cross-Modal Semantics for Multimodal Foundation Models

video alignment decoupling video-understanding emotion-recognition interpretability multimodal-learning multimodal-sentiment-analysis multimodal multimodal-deep-learning

Updated Feb 5, 2026
JavaScript

rustic-ai / rustic-ui-components

React component library for crafting user-friendly and engaging conversational experiences

chat ai reactjs mui reactjs-components conversational-ai multimodal

Updated Feb 11, 2026
JavaScript

sutdcv / SUTD-TrafficQA

[CVPR 2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events

paper annotations dataset vqa cvpr video-qa vqa-dataset traffic-events multimodal multimodal-deep-learning cvpr2021 video-reasoning

Updated Feb 9, 2026
JavaScript

aj-archipelago / cortex

Simplify and accelerate AI-powered application development with structured interfaces to models and powerful prompt execution environments.

graphql router ai rest-api entities gemini openai llama claude multimodal vertex-ai llm chatgpt

Updated Feb 11, 2026
JavaScript

kellyvv / WeiClaw

Connect WeChat to any AI Agent with one command

nodejs open-source chatbot multi-agent gemini openai weixin wechat acp wechat-bot codex voice-assistant claude multimodal ai-agent llm chatgpt ai-gateway agent-protocol

Updated Mar 24, 2026
JavaScript

phetsims / paper-land

Build and explore multimodal web interactives with pieces of paper!

javascript open-source community paper ar augmented codesign multimodal

Updated Mar 11, 2026
JavaScript

aws-samples / improve-employee-productivity-using-genai

Employee Productivity GenAI Assistant Example is a code sample and architecture pattern designed to improve the efficiency of writing tasks using AWS serverless technologies and Amazon Bedrock's generative AI models.

aws aws-lambda aws-s3 aws-apigateway aws-serverless aws-dynamodb aws-sam multimodal servereless aws-cloud9 generative-ai anthropic-claude genai aws-bedrock bedrock-claude-llm

Updated Feb 14, 2026
JavaScript

YCSE / nanobanana-mcp

Gemini Vision & Image Generation MCP for Claude Desktop and Claude Code

ai mcp gemini image-generation claude multimodal google-ai vision-ai claude-desktop model-context-protocol

Updated Mar 3, 2026
JavaScript

rimmi21-zz / Alexa-APL-Fact-Skill

Sample skill that demonstrates the new Alexa Presentation Language (APL). The multimodal skill works the same way as the Alexa Fact Skill template: when invoked, it selects a fact at random and tells it to the user. It is compatible with devices that have a display.

Updated Jun 26, 2019
JavaScript

Hastur-HP / The-Brain

A multimodal RAG dashboard and interactive 3D Knowledge Graph. Process documents locally with Ollama or via Cloud APIs, powered by LightRAG, RAG-Anything and Neo4j.

knowledge-graph multimodal rag llm rag-pipeline ai-dashboard lightrag rag-anything

Updated Mar 23, 2026
JavaScript

cargofy / ATLAS

Universal MCP server for logistics. Connect any TMS/WMS to AI agents. Shipments, carriers, tenders, tracking — all via Model Context Protocol.

open-source mcp transportation supply-chain self-hosted logistics freight ai-agents trucking multimodal mcp-server mcp-tools

Updated Mar 6, 2026
JavaScript

palubad / MMTS-GEE

Google Earth Engine tool to generate multi-modal and multi-temporal datasets, including spatially and temporally aligned Sentinel-1 SAR data, Sentinel-2 multispectral data, and weather and DEM-based data. Supplementary material for Paluba et al. 2024: "Identification of Optimal Sentinel-1 SAR Polarimetric Parameters for Forest Monitoring in Czechia".

machine-learning time-series dataset digital-elevation-model remote-sensing google-earth-engine earth-observation gee time-series-analysis synthetic-aperture-radar multitemporal-remote-sensing sentinel-2 earth-engine multimodal sentinel-1 era5-land multitemporal-data copernicus-dem

Updated Nov 9, 2025
JavaScript

aws-samples / semantic-image-search-for-articles

Shows how to add semantic search to your applications. This sample uses a multimodal model to find images that are semantically similar to a given text. New blog coming out soon.

search aws semantic vector multimodal vector-search generative-ai

Updated Jan 30, 2026
JavaScript

begonia599 / CodexBridge

OpenAI-compatible bridge that exposes the Codex CLI/SDK as /v1/chat/completions with multimodal and JSON schema support

docker json-schema bridge codex non-commercial multimodal openai-compatible codex-sdk

Updated Nov 13, 2025
JavaScript
