vLLM 🐎 Performance Benchmark (Public) — Builds
#11879 · Bugfix for PixtralHF models without spatial_merge_size (#16513) · Michael Goin · main · 87b836ba7 · 43m · Created yesterday at 11:32 PM
#11878 · [Bugfix] clean up duplicated code (#16485) · Isotr0py · main · 56c76c2e0 · 55m · Created yesterday at 11:19 PM
#11877 · Update openai_compatible_server.md (#16507) · Christian Sears · main · c09632a66 · 1h · Created yesterday at 10:55 PM
#11876 · [Kernel] Add tuned FusedMoE kernel config for Llama4 Scout, TP=8 on H100 (#16488) · Yong Hoon Shin · main · a3bf8d4a2 · 1h · Created yesterday at 10:26 PM
#11875 · [Frontend] Added chat templates for LLaMa4 pythonic tool calling (#16463) · Ye (Charlotte) Qi · main · 16eda8c43 · 1h · Created yesterday at 10:26 PM
#11874 · Improve configs - `LoadConfig` (#16422) · Harry Mellor · main · cd77382ac · 3h · Created yesterday at 8:27 PM
#11873 · [Bugfix] handle alignment of encoder_seq_lens in mllama.py (#14784) · Travis Johnson · main · 71b9cde01 · 4h · Created yesterday at 7:59 PM
#11872 · [Doc] Document InternVL3 support (#16495) · Michael Goin · main · 5285589f3 · 4h · Created yesterday at 7:41 PM
#11871 · [Kernel] Support W8A8 channel-wise weights and per-token activations in triton fused_moe_kernel (#16366) · Michael Goin · main · f41647ee6 · 6h · Created yesterday at 5:54 PM
#11870 · [TPU][V1] Make `--disable_chunked_mm_input` mandatory for serving MM models (#16483) · Nicolò Lucchesi · main · 4d022cbc7 · 7h · Created yesterday at 5:06 PM
#11869 · Fix erroneous "model doesn't support compile" warning (#16486) · Tyler Michael Smith · main · 70de35a88 · 7h · Created yesterday at 4:24 PM
#11868 · [Hardware][Intel-Gaudi] Multi-step scheduling implementation for HPU (#12779) · Tomasz Zielinski · main · 34b2cf3b3 · 9h · Created yesterday at 2:38 PM
#11867 · more amd tweaks · Lucas Wilkinson · neuralmagic:lwilkinson/no-pad-fa3 · 38d11aedd · 9h · Created yesterday at 2:37 PM
#11866 · amd fixes · Lucas Wilkinson · neuralmagic:lwilkinson/no-pad-fa3 · 32f0abe8e · 7m · Created yesterday at 2:29 PM
#11865 · [Bugfix] Fix bugs of running Quark quantized models (#16236) · Michael Goin · main · 9e90c9f73 · 9h · Created yesterday at 2:18 PM
#11864 · [Kernel] support merge_attn_states CUDA kernel, 3x speedup (#16173) · Michael Goin · main · e9528f6dc · 11h · Created yesterday at 12:50 PM
#11863 · Don't install triton on `ppc64le` platform (#16470) · Harry Mellor · main · 51baa9c33 · 14h · Created yesterday at 10:11 AM
#11862 · [Misc] update api_client example (#16459) · Reid · main · 35e076b3a · 14h · Created yesterday at 10:05 AM
#11861 · [Misc] Raise error for V1 not supporting Long LoRA. (#16415) · Jee Jee Li · main · a26f59ccb · 15h · Created yesterday at 8:51 AM
#11860 · Enforce valid max_num_batched_tokens when disable_chunked_mm_input=True (#16447) · Michael Goin · main · aa3b3d76e · 16h · Created yesterday at 8:09 AM