Pull requests: vllm-project/vllm

Pull requests list

[Core] Per-group BlockPool for hybrid Mamba/attention models
Labels: v1
#39031 opened Apr 5, 2026 by arbi-dev
4 of 5 tasks
nano_nemotron_vl: fix tensor device mismatch exception when video profiling
Labels: ready
#39029 opened Apr 5, 2026 by netanel-haber
Gemma4 multi-turn, tool calling, and reasoning fixes
Labels: documentation, frontend, tool-calling
#39027 opened Apr 5, 2026 by bbrowning (Draft)
5 tasks
Add structure to requirements/ directory
Labels: ci/build, cpu, documentation, intel-gpu, nvidia, ready, ready-run-all-tests, rocm
#39024 opened Apr 5, 2026 by hmellor
[MoE][Fix] Fix DeepEP HT hardcoded per_act_token_quant=False
#39023 opened Apr 5, 2026 by thc1006
2 tasks
[MoE] BF16 Triton MoE Perf regression - restore low latency path
Labels: ready
#39016 opened Apr 5, 2026 by milesial
[vLLM IR] rework gemma_rms_norm
Labels: ready-run-all-tests
#39014 opened Apr 5, 2026 by ZJY0516
5 tasks
Refactor move experts
Labels: ci/build, documentation, nvidia, performance, ready, rocm
#39013 opened Apr 5, 2026 by Jackmin801
1 task
Update MusicFlamingo and add AudioFlamingoNext
Labels: documentation, multi-modality, new-model
#39011 opened Apr 5, 2026 by lashahub
4 of 5 tasks
[MoE] Move remaining PrepareAndFinalize to prepare finalize folder
Labels: ready, ready-run-all-tests
#39009 opened Apr 5, 2026 by Jackmin801
1 task
[MoE] Move GPT OSS Triton kernel experts into fused_moe/experts/
Labels: documentation, gpt-oss, ready, ready-run-all-tests
#39007 opened Apr 5, 2026 by Jackmin801
3 tasks done
[MoE] Move DEEP_GEMM into experts/ subdirectory
Labels: documentation, needs-rebase, performance, ready, ready-run-all-tests
#39005 opened Apr 5, 2026 by Jackmin801
5 tasks done
[Frontend] Add /v1/files upload endpoint for multimodal inputs (#38531)
Labels: documentation, frontend, multi-modality
#39003 opened Apr 4, 2026 by Alberto-Codes
4 of 5 tasks
[Bugfix] Fix FlashInfer crash with kv_cache_dtype_skip_layers
Labels: bug, nvidia, ready, v1
#39002 opened Apr 4, 2026 by yzong-rh
3 of 5 tasks
[BugFix][Parser] Fixing Qwen3.5 tool call parsing
Labels: bug, qwen, ready, tool-calling
#38996 opened Apr 4, 2026 by Gregory-Pereira
[Bug] Fix routing bias dtype for trtllm per-block fp8 moe
Labels: bug, nvidia
#38989 opened Apr 4, 2026 by wzhao18
5 tasks