[Firecracker] PR
Firecracker PR main pipeline (public)
Refactor: Move `GuestMemoryMmap` into `struct Vm`
Canceled automatically after 7m 40s
Nightly benchmark
This benchmark has two goals:
- Performance clarity: show which engine (vllm, tensorrt-llm, lmdeploy, or tgi) performs best on which workload.
- Reproducibility: anyone can run the same set of benchmarking commands inside the same Docker image by following the instructions in reproduce.md.
Versions
We benchmark vllm, tensorrt-llm, lmdeploy and tgi using the following docker images:
- vllm/vllm-openai:v0.5.0.post1
- nvcr.io/nvidia/tritonserver:24.04-trtllm-python-py3
- openmmlab/lmdeploy:v0.5.0
- ghcr.io/huggingface/text-generation-inference:2.1
See the nightly-pipeline.yaml artifact for more details.
Workload description
We benchmark vllm, tensorrt-llm, lmdeploy and tgi using the following workload:
- Input length: 1000 prompts randomly sampled from the ShareGPT dataset (with a fixed random seed).
- Output length: the output lengths corresponding to these 1000 prompts.
- Batch size: dynamically determined by vllm and the arrival pattern of the requests.
- Average QPS (queries per second): 4 for the 8B model and 2 for the larger models. For each QPS, the arrival time of each query is drawn from a Poisson process (with a fixed random seed).
- Models: llama-3 8B, llama-3 70B, mixtral 8x7B.
- Evaluation metrics: Throughput, TTFT (time to the first token, with mean and std), ITL (inter-token latency, with mean and std).
See the nightly-tests.json artifact for more details.
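The arrival pattern described in the workload can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the function name `poisson_arrival_times` and the seed value are assumptions. It relies on the fact that the gaps between events of a Poisson process with rate `qps` are exponentially distributed with mean `1 / qps`.

```python
import random


def poisson_arrival_times(num_requests: int, qps: float, seed: int = 0) -> list[float]:
    """Return cumulative arrival times (seconds) for a Poisson process of rate `qps`.

    Hypothetical helper for illustration; the seed default is an assumption.
    """
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    t = 0.0
    times = []
    for _ in range(num_requests):
        # Inter-arrival gaps are exponentially distributed with mean 1 / qps.
        t += rng.expovariate(qps)
        times.append(t)
    return times


# 1000 requests at an average of 4 queries per second (the 8B-model setting).
arrivals = poisson_arrival_times(num_requests=1000, qps=4.0, seed=0)
print(f"last arrival at {arrivals[-1]:.1f} s, "
      f"observed rate {len(arrivals) / arrivals[-1]:.2f} req/s")
```

Because the seed is fixed, calling the function again with the same arguments yields the same schedule, which is what makes runs comparable across engines.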
Known crashes
- TGI v2.1 crashes when running the mixtral model; see TGI PR #2122.
Results
Test name | GPU | Successful req. | Tput (req/s) | Mean TTFT (ms) | Std TTFT (ms) | Mean ITL (ms) | Std ITL (ms) | Engine |
---|---|---|---|---|---|---|---|---|
tgi_llama8B_tp1_qps_4 | A100-SXM4-80GB | 500 | 3.7438 | 106.226 | 100.277 | 16.6865 | 8.14355 | tgi |
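As a rough illustration of how the columns in the table above could be derived from raw timestamps, the sketch below computes throughput, TTFT, and ITL statistics. The function `summarize`, the input layout, and the use of the population standard deviation are all assumptions made for this example; the sample data is made up and is not benchmark output.

```python
from statistics import mean, pstdev


def summarize(requests, duration_s):
    """Summarize latency metrics.

    requests: list of (send_time, [token_time, ...]) tuples, times in seconds.
    duration_s: wall-clock length of the benchmark run in seconds.
    """
    ttfts_ms = []
    itls_ms = []
    for send_time, token_times in requests:
        if not token_times:
            continue  # assumption: a request with no tokens is not successful
        # TTFT: delay between sending the request and receiving the first token.
        ttfts_ms.append((token_times[0] - send_time) * 1000.0)
        # ITL: gaps between consecutive tokens of the same request.
        itls_ms.extend((b - a) * 1000.0 for a, b in zip(token_times, token_times[1:]))
    return {
        "successful": len(ttfts_ms),
        "tput_req_s": len(ttfts_ms) / duration_s,
        "mean_ttft_ms": mean(ttfts_ms),
        "std_ttft_ms": pstdev(ttfts_ms),
        "mean_itl_ms": mean(itls_ms),
        "std_itl_ms": pstdev(itls_ms),
    }


# Two made-up requests, purely illustrative.
sample = [
    (0.0, [0.10, 0.12, 0.14]),
    (0.5, [0.70, 0.73, 0.76]),
]
stats = summarize(sample, duration_s=1.0)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```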
Plots
In the following plots, the error bar shows the standard error of the mean.
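The standard error of the mean is the standard deviation divided by the square root of the sample count. The sketch below applies that formula to the TTFT figures of the tgi_llama8B_tp1_qps_4 row, assuming the sample count equals the number of successful requests. The helper name `standard_error` is hypothetical. The same calculation for ITL would need the number of inter-token gaps, which the table does not report.

```python
import math


def standard_error(std: float, n: int) -> float:
    """Standard error of the mean for n samples with standard deviation `std`."""
    return std / math.sqrt(n)


# Mean and std TTFT from the tgi_llama8B_tp1_qps_4 row; n = 500 successful
# requests is an assumption about what the error bar is computed over.
sem_ttft_ms = standard_error(std=100.277, n=500)
print(f"TTFT: 106.226 +/- {sem_ttft_ms:.2f} ms")
```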
Build jobs, each running `./tools/devtool -y build --rev main --release && ./tools/devtool -y build --release && du -sh build/* && tar czf build_$(uname -m)_ee9eED0F.tar.gz build && buildkite-agent artifact upload build_$(uname -m)_ee9eED0F.tar.gz`:
Instance | OS | Kernel | Status | Waited | Ran in |
---|---|---|---|---|---|
m6i.metal | al2023 | linux_6.1 | Canceled | 4m 31s | 3m 23s |
m7g.metal | al2023 | linux_6.1 | Canceled | 2m 44s | 5m 10s |
Kani jobs, each running `./tools/devtool -y test --no-build -- ../tests/integration_tests/test_kani.py -n auto`:
Job | Status | Waited | Ran in |
---|---|---|---|
Kani | Canceled | 3m 2s | 4m 55s |
Kani | | 10s | 1m 18s |
Build test jobs, each running `./tools/devtool -y test --no-build -- integration_tests/build/`:
Instance | OS | Kernel | Status | Waited | Ran in |
---|---|---|---|---|---|
c5n.metal | al2 | linux_5.10 | Canceled | 4s | 7m 46s |
c5n.metal | al2023 | linux_6.1 | Canceled | 8s | 7m 37s |
m5n.metal | al2 | linux_5.10 | Canceled | 4s | 7m 39s |
m5n.metal | al2023 | linux_6.1 | Canceled | 2m 55s | 4m 46s |
m6i.metal | al2 | linux_5.10 | Canceled | 1s | 7m 54s |
m6i.metal | al2023 | linux_6.1 | Canceled | 4m 35s | 3m 21s |
m6a.metal | al2 | linux_5.10 | Canceled | 8s | 7m 41s |
m6a.metal | al2023 | linux_6.1 | Canceled | 1m 26s | 6m 29s |
m7a.metal-48xl | al2 | linux_5.10 | Canceled | 3m 16s | 4m 31s |
m7a.metal-48xl | al2023 | linux_6.1 | Canceled | 2m 17s | 5m 38s |
m6g.metal | al2 | linux_5.10 | Canceled | 9s | 7m 32s |
m6g.metal | al2023 | linux_6.1 | Canceled | 2m 48s | 4m 58s |
m7g.metal | al2 | linux_5.10 | Canceled | 4s | 7m 42s |
m7g.metal | al2023 | linux_6.1 | Canceled | 2m 47s | 5m 0s |
Functional and security test jobs, each running `buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}`:
Instance | OS | Kernel | Status |
---|---|---|---|
c5n.metal | al2 | linux_5.10 | Canceled |
c5n.metal | al2023 | linux_6.1 | Canceled |
m5n.metal | al2 | linux_5.10 | Canceled |
m5n.metal | al2023 | linux_6.1 | Canceled |
m6i.metal | al2 | linux_5.10 | Canceled |
m6i.metal | al2023 | linux_6.1 | Canceled |
m6a.metal | al2 | linux_5.10 | Canceled |
m6a.metal | al2023 | linux_6.1 | Canceled |
m7a.metal-48xl | al2 | linux_5.10 | Canceled |
m7a.metal-48xl | al2023 | linux_6.1 | Canceled |
m6g.metal | al2 | linux_5.10 | Canceled |
m6g.metal | al2023 | linux_6.1 | Canceled |
m7g.metal | al2 | linux_5.10 | Canceled |
m7g.metal | al2023 | linux_6.1 | Canceled |
Performance test jobs, each running `buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/`:
Instance | OS | Kernel | Status |
---|---|---|---|
c5n.metal | al2 | linux_5.10 | Canceled |
c5n.metal | al2023 | linux_6.1 | Canceled |
m5n.metal | al2 | linux_5.10 | Canceled |
m5n.metal | al2023 | linux_6.1 | Canceled |
m6i.metal | al2 | linux_5.10 | Canceled |
m6i.metal | al2023 | linux_6.1 | Canceled |
m6a.metal | al2 | linux_5.10 | Canceled |
m6a.metal | al2023 | linux_6.1 | Canceled |
m7a.metal-48xl | al2 | linux_5.10 | Canceled |
m7a.metal-48xl | al2023 | linux_6.1 | Canceled |
m6g.metal | al2 | linux_5.10 | Canceled |
m6g.metal | al2023 | linux_6.1 | Canceled |
m7g.metal | al2 | linux_5.10 | Canceled |
m7g.metal | al2023 | linux_6.1 | Canceled |
Total Job Run Time: 1h 45m
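To cross-check a total like the one above against the individual "Ran in" values, the duration strings can be converted to seconds and summed. The sketch below is one possible way to do that; `parse_duration` is a hypothetical helper, and the sample list covers only the four build and Kani run times, not every job on the page.

```python
import re

# Seconds per unit letter used in the page's duration strings.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a string such as '1h 45m' or '7m 40s' to a number of seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", text)
    if not parts:
        raise ValueError(f"no duration found in {text!r}")
    return sum(int(value) * _UNIT_SECONDS[unit] for value, unit in parts)


# "Ran in" values of the two build jobs and the two Kani jobs.
run_times = ["3m 23s", "5m 10s", "4m 55s", "1m 18s"]
total = sum(parse_duration(t) for t in run_times)
print(f"{total // 60}m {total % 60}s")  # prints 14m 46s
```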