🐎
Performance Benchmark
bump up cuda version to 12.4, also update sglang version
Passed in 6h 35m and blocked
bootstrap
🚀 Ready for comparing vllm against alternatives? This will take 4 hours.
A100 vllm latest main
A100 sglang benchmark
A100 lmdeploy benchmark
A100 trt llama-8B
A100 trt llama-70B
Collect the results
🚀 check the results!
Wait for container to be ready
A100
Description
This file contains the download link for the benchmarking results.
Please download the visualization scripts in the post.
Results reproduction
- Find the Docker image we use in the benchmarking pipeline.
- Deploy the container, and inside it:
  - Download nightly-benchmarks.zip.
  - In the same folder, run the following code:
export HF_TOKEN=<your HF token>
apt update
apt install -y git
unzip nightly-benchmarks.zip
VLLM_SOURCE_CODE_LOC=./ bash .buildkite/nightly-benchmarks/scripts/run-nightly-benchmarks.sh
The results will be inside ./benchmarks/results.
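The reproduction steps depend on two inputs being in place before the benchmark script runs (an exported HF_TOKEN and nightly-benchmarks.zip in the working folder), and they leave output in ./benchmarks/results afterwards. A minimal sketch of checks around those steps follows. The function names `preflight_check` and `results_present` are hypothetical helpers introduced here for illustration and are not part of the benchmark scripts. The sketch makes no assumption about which files the results folder contains.

```shell
# Hypothetical helpers (not part of nightly-benchmarks.zip).

# Returns 0 when the inputs the reproduction steps rely on are present.
# $1: folder that should contain nightly-benchmarks.zip (default: current folder).
preflight_check() {
    local dir="${1:-.}"
    local status=0

    # The steps export HF_TOKEN before running the benchmark script.
    if [ -z "${HF_TOKEN:-}" ]; then
        echo "HF_TOKEN is not set" >&2
        status=1
    fi

    # The archive must sit in the folder the commands are run from.
    if [ ! -f "$dir/nightly-benchmarks.zip" ]; then
        echo "nightly-benchmarks.zip not found in $dir" >&2
        status=1
    fi

    return "$status"
}

# Returns 0 when the results folder exists and is not empty.
# $1: results folder (default: ./benchmarks/results, as stated in the steps).
results_present() {
    local results_dir="${1:-./benchmarks/results}"
    [ -d "$results_dir" ] && [ -n "$(ls -A "$results_dir" 2>/dev/null)" ]
}
```

One way to use these would be to call `preflight_check` before `unzip nightly-benchmarks.zip` and `results_present` after run-nightly-benchmarks.sh finishes, stopping early if either returns non-zero.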
bootstrap
curl -sSL https://raw.githubusercontent.com/vllm-project/buildkite-ci/main/scripts/kickoff-benchmark.sh | bash
Waited 34s
Ran in 12s

Total Job Run Time: 4h 54m