Firecracker PR main pipeline

Refactor: Move `GuestMemoryMmap` into `struct Vm`

Canceled automatically after 7m 40s

Nightly benchmark

This benchmark has two main goals:

  • Performance clarity: show which engine (vllm, tensorrt-llm, lmdeploy, or tgi) leads in performance on which workload.
  • Reproducibility: anyone can run the same set of benchmarking commands inside the same Docker image by following the instructions in reproduce.md.

Versions

We benchmark vllm, tensorrt-llm, lmdeploy, and tgi using the following Docker images:

  • vllm/vllm-openai:v0.5.0.post1
  • nvcr.io/nvidia/tritonserver:24.04-trtllm-python-py3
  • openmmlab/lmdeploy:v0.5.0
  • ghcr.io/huggingface/text-generation-inference:2.1

See the nightly-pipeline.yaml artifact for more details.

Workload description

We benchmark vllm, tensorrt-llm, lmdeploy, and tgi using the following workload:

  • Input length: 1000 prompts randomly sampled from the ShareGPT dataset (with a fixed random seed).
  • Output length: the output lengths corresponding to these 1000 prompts.
  • Batch size: determined dynamically by vllm and the arrival pattern of the requests.
  • Average QPS (queries per second): 4 for the 8B model and 2 for the larger models. For each QPS, the arrival time of each query is drawn from a Poisson process (with a fixed random seed).
  • Models: llama-3 8B, llama-3 70B, mixtral 8x7B.
  • Evaluation metrics: throughput, TTFT (time to first token, with mean and std), and ITL (inter-token latency, with mean and std).
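The arrival pattern in the workload list can be illustrated with a short sketch. This is not the benchmark's own code: the function name `poisson_arrival_times`, the seed value, and the use of Python's standard `random` module are all assumptions made for illustration. It relies on the fact that the gaps between events of a Poisson process with rate `qps` are exponentially distributed with the same rate.

```python
import random


def poisson_arrival_times(num_requests, qps, seed=0):
    """Return cumulative arrival times (in seconds) for a Poisson process.

    Hypothetical helper: the name, signature, and seed are illustrative.
    """
    rng = random.Random(seed)  # fixed seed, so every run yields the same schedule
    arrivals = []
    now = 0.0
    for _ in range(num_requests):
        # Gaps between Poisson arrivals are exponential with rate `qps`.
        now += rng.expovariate(qps)
        arrivals.append(now)
    return arrivals


# 1000 requests at an average of 4 QPS, as described for the 8B model.
arrivals = poisson_arrival_times(num_requests=1000, qps=4.0, seed=0)
observed_qps = len(arrivals) / arrivals[-1]
print(f"last arrival at {arrivals[-1]:.1f}s, observed QPS {observed_qps:.2f}")
```

Because the seed is fixed, two runs send requests at identical times, which is what makes results comparable across engines. The observed rate only approximates the target, since individual gaps are random.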

See the nightly-tests.json artifact for more details.

Known crashes

  • TGI v2.1 crashes when running the mixtral model; see TGI PR #2122.

Results

| Test name | GPU | Successful req. | Tput (req/s) | Mean TTFT (ms) | Std TTFT (ms) | Mean ITL (ms) | Std ITL (ms) | Engine |
|---|---|---|---|---|---|---|---|---|
| tgi_llama8B_tp1_qps_4 | A100-SXM4-80GB | 500 | 3.7438 | 106.226 | 100.277 | 16.6865 | 8.14355 | tgi |

Plots

In the following plots, the error bar shows the standard error of the mean.
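As a rough illustration of how such an error bar relates to the reported numbers, the standard error of the mean is the standard deviation divided by the square root of the sample count. The sketch below applies this to the TTFT columns of the tgi_llama8B_tp1_qps_4 row. It assumes one TTFT sample per successful request (500), which the source does not state explicitly, and `standard_error` is a hypothetical helper, not part of the benchmark scripts.

```python
import math


def standard_error(std, n):
    """Standard error of the mean: std / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return std / math.sqrt(n)


# Values taken from the tgi_llama8B_tp1_qps_4 results row.
mean_ttft_ms = 106.226
std_ttft_ms = 100.277
n_requests = 500  # assumption: one TTFT sample per successful request

ttft_sem = standard_error(std_ttft_ms, n_requests)
print(f"TTFT: {mean_ttft_ms:.3f} +/- {ttft_sem:.3f} ms")
```

The same formula would apply to ITL, but the sample count there is presumably the number of inter-token gaps rather than the number of requests, so it cannot be derived from the table alone.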

Benchmarking results
:pipeline:
.buildkite/pipeline_pr.py | buildkite-agent pipeline upload
Waited 4s · Ran in 4s
🏗 m6i.metal al2023 linux_6.1
./tools/devtool -y build --rev main --release && ./tools/devtool -y build --release && du -sh build/* && tar czf build_$(uname -m)_ee9eED0F.tar.gz build && buildkite-agent artifact upload build_$(uname -m)_ee9eED0F.tar.gz
Canceled
Waited 4m 31s · Ran in 3m 23s
🏗 m7g.metal al2023 linux_6.1
./tools/devtool -y build --rev main --release && ./tools/devtool -y build --release && du -sh build/* && tar czf build_$(uname -m)_ee9eED0F.tar.gz build && buildkite-agent artifact upload build_$(uname -m)_ee9eED0F.tar.gz
Canceled
Waited 2m 44s · Ran in 5m 10s
🪶 Style
./tools/devtool -y checkstyle
Waited 1s · Ran in 1m 43s
🔍 Kani
./tools/devtool -y test --no-build -- ../tests/integration_tests/test_kani.py -n auto
Canceled
Waited 3m 2s · Ran in 4m 55s
🔍 Kani
./tools/devtool -y test --no-build -- ../tests/integration_tests/test_kani.py -n auto
Waited 10s · Ran in 1m 18s
📦 c5n.metal al2 linux_5.10
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 4s · Ran in 7m 46s
📦 c5n.metal al2023 linux_6.1
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 8s · Ran in 7m 37s
📦 m5n.metal al2 linux_5.10
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 4s · Ran in 7m 39s
📦 m5n.metal al2023 linux_6.1
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 2m 55s · Ran in 4m 46s
📦 m6i.metal al2 linux_5.10
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 1s · Ran in 7m 54s
📦 m6i.metal al2023 linux_6.1
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 4m 35s · Ran in 3m 21s
📦 m6a.metal al2 linux_5.10
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 8s · Ran in 7m 41s
📦 m6a.metal al2023 linux_6.1
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 1m 26s · Ran in 6m 29s
📦 m7a.metal-48xl al2 linux_5.10
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 3m 16s · Ran in 4m 31s
📦 m7a.metal-48xl al2023 linux_6.1
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 2m 17s · Ran in 5m 38s
📦 m6g.metal al2 linux_5.10
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 9s · Ran in 7m 32s
📦 m6g.metal al2023 linux_6.1
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 2m 48s · Ran in 4m 58s
📦 m7g.metal al2 linux_5.10
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 4s · Ran in 7m 42s
📦 m7g.metal al2023 linux_6.1
./tools/devtool -y test --no-build -- integration_tests/build/
Canceled
Waited 2m 47s · Ran in 5m 0s
⚙ c5n.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ c5n.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m5n.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m5n.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m6i.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m6i.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m6a.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m6a.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m7a.metal-48xl al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m7a.metal-48xl al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m6g.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m6g.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m7g.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⚙ m7g.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build -- -n 16 --dist worksteal integration_tests/{functional,security}
Canceled
⏱ c5n.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ c5n.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m5n.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m5n.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m6i.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m6i.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m6a.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m6a.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m7a.metal-48xl al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m7a.metal-48xl al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m6g.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m6g.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m7g.metal al2 linux_5.10
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
⏱ m7g.metal al2023 linux_6.1
buildkite-agent artifact download "build_$(uname -m)_ee9eED0F.tar.gz" . && tar xzf build_$(uname -m)_ee9eED0F.tar.gz && ./tools/devtool -y test --no-build --performance -c 1-10 -m 0 -- ../tests/integration_tests/performance/
Canceled
Total Job Run Time: 1h 45m