🧪 CI (Public)

CI Testing of the vLLM Repo

Passed in 43m 23s and blocked
bootstrap
:docker: build image
Build CUDA 12.1 image
:docker: build image CUDA 12.1
Build CUDA 11.8 image
:docker: build image CUDA 11.8
Documentation Build
Run Async Engine, Inputs, Utils, Worker Test
Async Engine, Inputs, Utils, Worker Test
Run Python-only Installation Test
Python-only Installation Test
Run Basic Correctness Test
Basic Correctness Test
Run Chunked Prefill Test
Chunked Prefill Test
Run Core Test
Core Test
Run Entrypoints Test
Entrypoints Test
Run Distributed Tests (4 GPUs)
Distributed Tests (4 GPUs)
Run Metrics, Tracing Test
Metrics, Tracing Test
Run Regression Test
Regression Test
Run Engine Test
Engine Test
Run V1 Test
V1 Test
Run Examples Test
Examples Test
Run Prefix Caching Test
Prefix Caching Test
Run Samplers Test
Samplers Test
Run LogitsProcessor Test
LogitsProcessor Test
Run Speculative decoding tests
Speculative decoding tests
Run LoRA Test %N
Run PyTorch Compilation Unit Tests
PyTorch Compilation Unit Tests
Run PyTorch Fullgraph Smoke Test
PyTorch Fullgraph Smoke Test
Run PyTorch Fullgraph Test
PyTorch Fullgraph Test
Run Kernels Test %N
Run Tensorizer Test
Tensorizer Test
Run Benchmarks
Benchmarks
Run Quantization Test
Quantization Test
Run LM Eval Small Models
LM Eval Small Models
Run OpenAI API correctness
OpenAI API correctness
Run Encoder Decoder tests
Encoder Decoder tests
Run OpenAI-Compatible Tool Use
OpenAI-Compatible Tool Use
Run Basic Models Test
Basic Models Test
Run Language Models Test (Standard)
Language Models Test (Standard)
Run Language Models Test (Extended)
Language Models Test (Extended)
Run Multi-Modal Models Test (Standard)
Multi-Modal Models Test (Standard)
Run Multi-Modal Models Test (Extended) 1
Multi-Modal Models Test (Extended) 1
Run Multi-Modal Models Test (Extended) 2
Multi-Modal Models Test (Extended) 2
Run Custom Models Test
Custom Models Test
Run Distributed Comm Ops Test
Distributed Comm Ops Test
Run 2 Node Tests (4 GPUs in total)
2 Node Tests (4 GPUs in total)
Run Distributed Tests (2 GPUs)
Distributed Tests (2 GPUs)
Run Plugin Tests (2 GPUs)
Plugin Tests (2 GPUs)
Run Multi-step Tests (4 GPUs)
Multi-step Tests (4 GPUs)
Run Pipeline Parallelism Test
Pipeline Parallelism Test
Run LoRA TP Test (Distributed)
LoRA TP Test (Distributed)
Run Weight Loading Multiple GPU Test
Weight Loading Multiple GPU Test
Run Weight Loading Multiple GPU Test - Large Models
Weight Loading Multiple GPU Test - Large Models
Run Distributed Tests (A100)
Distributed Tests (A100)
Run LM Eval Large Models
LM Eval Large Models
Neuron Test
Run Intel CPU test
Intel CPU Test
Intel HPU Test
Intel GPU Test
Run IBM Power(ppc64le) CPU Test
IBM Power(ppc64le) CPU Test
TPU V0 Test
TPU V1 Test


bootstrap

  if [[ -n "" ]]; then VLLM_CI_BRANCH= curl -sSL "https://raw.githubusercontent.com/vllm-project/buildkite-ci//scripts/bootstrap.sh" | bash && exit 0; fi
  curl -sSL "https://raw.githubusercontent.com/vllm-project/buildkite-ci/main/scripts/bootstrap.sh" | bash

Waited 37s · Ran in 19s
:docker: build image

  aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7
  #!/bin/bash
  if [[ -z $(docker manifest inspect public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e) ]]; then
    echo "Image not found, proceeding with build..."
  else
    echo "Image found"
    exit 0
  fi
  docker build --file docker/Dockerfile --build-arg max_jobs=16 --build-arg buildkite_commit=97264b7de926f3aba545021d4e3dae41758b592e --build-arg USE_SCCACHE=1 --tag public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e --target test --progress plain .
  docker push public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e

Waited 40s · Ran in 31m 47s
:docker: build image CUDA 12.1

  aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7
  #!/bin/bash
  if [[ -z $(docker manifest inspect public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e-cu121) ]]; then
    echo "Image not found, proceeding with build..."
  else
    echo "Image found"
    exit 0
  fi
  docker build --file docker/Dockerfile --build-arg max_jobs=16 --build-arg buildkite_commit=97264b7de926f3aba545021d4e3dae41758b592e --build-arg USE_SCCACHE=1 --build-arg CUDA_VERSION=12.1.0 --tag public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e-cu121 --target test --progress plain .
  docker push public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e-cu121
:docker: build image CUDA 11.8

  aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7
  #!/bin/bash
  if [[ -z $(docker manifest inspect public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e-cu118) ]]; then
    echo "Image not found, proceeding with build..."
  else
    echo "Image found"
    exit 0
  fi
  docker build --file docker/Dockerfile --build-arg max_jobs=16 --build-arg buildkite_commit=97264b7de926f3aba545021d4e3dae41758b592e --build-arg USE_SCCACHE=1 --build-arg CUDA_VERSION=11.8.0 --tag public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e-cu118 --target test --progress plain .
  docker push public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e-cu118

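The three image-build steps above share one pattern: check the registry for an image tagged with the commit (plus an optional CUDA suffix), and build and push only when it is missing. A minimal sketch of that pattern, using the real ECR repo from the log; `image_tag` and `build_if_missing` are hypothetical helper names, not part of the vLLM CI scripts:

```shell
#!/bin/bash
# Repo from the build steps above.
REPO="public.ecr.aws/q9t5s3a7/vllm-ci-test-repo"

# Compose the tag the pipeline uses: <commit> plus an optional CUDA suffix.
image_tag() {
  local commit="$1" cuda_suffix="$2"
  echo "${REPO}:${commit}${cuda_suffix:+-${cuda_suffix}}"
}

# Build and push only if the tag is absent from the registry.
build_if_missing() {
  local tag="$1"; shift
  # `docker manifest inspect` prints nothing and fails when the tag does not
  # exist, so an empty result means "build it".
  if [[ -z "$(docker manifest inspect "$tag" 2>/dev/null)" ]]; then
    docker build --tag "$tag" "$@" . && docker push "$tag"
  else
    echo "Image found, skipping build"
  fi
}

image_tag 97264b7de926f3aba545021d4e3dae41758b592e cu121
# prints: public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e-cu121
```

The `${cuda_suffix:+-${cuda_suffix}}` expansion keeps the base (CUDA 12.4) tag suffix-free while the cu121/cu118 variants get a `-cuXXX` suffix, matching the three tags in the log.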
Documentation Build

Waited 29s · Ran in 4m 40s
Async Engine, Inputs, Utils, Worker Test
Python-only Installation Test
Basic Correctness Test
Chunked Prefill Test
Core Test
Entrypoints Test
Distributed Tests (4 GPUs)
Metrics, Tracing Test
Regression Test
Engine Test
V1 Test
Examples Test
Prefix Caching Test
Samplers Test
LogitsProcessor Test
Speculative decoding tests
LoRA Test 1 (shard 1/4)
LoRA Test 2 (shard 2/4)
LoRA Test 3 (shard 3/4)
LoRA Test 4 (shard 4/4)
PyTorch Compilation Unit Tests
PyTorch Fullgraph Smoke Test
PyTorch Fullgraph Test
Kernels Test 1 (shard 1/4)
Kernels Test 2 (shard 2/4)
Kernels Test 3 (shard 3/4)
Kernels Test 4 (shard 4/4)
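The "%N" in the step names above is Buildkite's parallelism placeholder: the LoRA and Kernels suites each run as four shards of one job. A sketch of how such a shard could select its slice of the test files, assuming the standard Buildkite env vars `BUILDKITE_PARALLEL_JOB` (0-based index) and `BUILDKITE_PARALLEL_JOB_COUNT`; `pick_shard` is a hypothetical helper, not part of the vLLM CI scripts:

```shell
#!/bin/bash
# pick_shard INDEX COUNT FILE... — print the files belonging to shard INDEX
# out of COUNT shards, round-robin.
pick_shard() {
  local index="$1" count="$2"; shift 2
  local i=0 f
  for f in "$@"; do
    # Shard N takes every COUNT-th file starting at offset N.
    if [ $((i % count)) -eq "$index" ]; then
      echo "$f"
    fi
    i=$((i + 1))
  done
}

# In a real step this would be:
#   pick_shard "$BUILDKITE_PARALLEL_JOB" "$BUILDKITE_PARALLEL_JOB_COUNT" tests/*.py
pick_shard 1 4 a.py b.py c.py d.py e.py f.py   # prints b.py then f.py
```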
Tensorizer Test
Benchmarks
Quantization Test
LM Eval Small Models
OpenAI API correctness
Encoder Decoder tests
OpenAI-Compatible Tool Use
Basic Models Test
Language Models Test (Standard)
Language Models Test (Extended)
Multi-Modal Models Test (Standard)
Multi-Modal Models Test (Extended) 1
Multi-Modal Models Test (Extended) 2
Custom Models Test
Distributed Comm Ops Test
2 Node Tests (4 GPUs in total)

  ./.buildkite/scripts/run-multi-node-test.sh /vllm-workspace/tests 2 2 public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:97264b7de926f3aba545021d4e3dae41758b592e "VLLM_TEST_SAME_HOST=0 torchrun --nnodes 2 --nproc-per-node=2 --rdzv_backend=c10d --rdzv_endpoint=192.168.10.10 distributed/test_same_node.py | grep 'Same node test passed' && VLLM_MULTI_NODE=1 pytest -v -s distributed/test_multi_node_assignment.py && VLLM_MULTI_NODE=1 pytest -v -s distributed/test_pipeline_parallel.py" "VLLM_TEST_SAME_HOST=0 torchrun --nnodes 2 --nproc-per-node=2 --rdzv_backend=c10d --rdzv_endpoint=192.168.10.10 distributed/test_same_node.py | grep 'Same node test passed'"
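Note the gating trick in the 2-node command above: the `torchrun` output is piped through `grep 'Same node test passed'`, so the step's exit status depends on that marker line actually being printed. A minimal sketch of the same idea; `run_and_gate` is a hypothetical helper, not the CI's real script:

```shell
#!/bin/bash
# run_and_gate MARKER CMD... — run CMD and succeed only if its output
# contains MARKER (grep -q sets the pipeline's exit status).
run_and_gate() {
  local marker="$1"; shift
  "$@" | grep -q "$marker"
}

run_and_gate "Same node test passed" echo "Same node test passed" \
  && echo "gate open"   # prints: gate open
```

This turns a test that might exit 0 despite not reaching its final check into one that fails unless the success line appears.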
Distributed Tests (2 GPUs)
Plugin Tests (2 GPUs)
Multi-step Tests (4 GPUs)
Pipeline Parallelism Test
LoRA TP Test (Distributed)
Weight Loading Multiple GPU Test
Weight Loading Multiple GPU Test - Large Models
Distributed Tests (A100)
LM Eval Large Models
AMD: :docker: build image

  grep -i 'from base as test' docker/Dockerfile.rocm && docker build --build-arg max_jobs=16 --tag rocm/vllm-ci:97264b7de926f3aba545021d4e3dae41758b592e -f docker/Dockerfile.rocm --target test --progress plain . || docker build --build-arg max_jobs=16 --tag rocm/vllm-ci:97264b7de926f3aba545021d4e3dae41758b592e -f docker/Dockerfile.rocm --progress plain . && docker push rocm/vllm-ci:97264b7de926f3aba545021d4e3dae41758b592e

Waited 1s · Ran in 22m 32s
AMD: Core Test

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s core"

Waited 13s · Ran in 19m 58s

AMD: Metrics, Tracing Test

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s metrics && pytest -v -s tracing"

Waited 12s · Ran in 18m 25s

AMD: Engine Test

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s engine test_sequence.py test_config.py test_logger.py && pytest -v -s tokenization"

Waited 12s · Ran in 20m 24s

AMD: Prefix Caching Test

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s prefix_caching"

Waited 13s · Ran in 15m 27s

AMD: LogitsProcessor Test

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s test_logits_processor.py && pytest -v -s model_executor/test_guided_processors.py"

Waited 14s · Ran in 10m 15s

AMD: Benchmarks

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/.buildkite ; bash scripts/run-benchmarks.sh"

Waited 3s · Ran in 9m 43s

AMD: Custom Models Test

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; echo 'Testing custom models...'"

Waited 16s · Ran in 7m 33s

AMD: Distributed Comm Ops Test

  bash .buildkite/scripts/hardware_ci/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s distributed/test_comm_ops.py && pytest -v -s distributed/test_shm_broadcast.py"

Waited 10s · Ran in 4m 19s
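Every AMD step above follows one pattern: it hands a single command string to `run-amd-test.sh`, which probes the GPU and sets debug env vars before running the suite. A sketch of that wrapper's visible behavior; `run_amd` is a hypothetical stand-in, not the real CI script:

```shell
#!/bin/bash
# run_amd CMD_STRING — mimic the env setup each AMD step performs, then run
# the quoted command in a subshell (as the real wrapper does via docker).
run_amd() {
  local cmd="$1"
  # GPU probe; "|| true" means a missing rocm-smi does not fail the step.
  (command rocm-smi || true) >/dev/null 2>&1
  export VLLM_LOGGING_LEVEL=DEBUG
  export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1
  bash -c "$cmd"
}

run_amd "echo ready"   # prints: ready
```

Passing the whole suite as one string keeps each pipeline step a one-liner while the wrapper owns the shared setup.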
Neuron Test

  bash .buildkite/scripts/hardware_ci/run-neuron-test.sh

Waited 6s · Ran in 3m 47s

Intel CPU Test

  bash .buildkite/scripts/hardware_ci/run-cpu-test.sh

Intel HPU Test

  bash .buildkite/scripts/hardware_ci/run-hpu-test.sh

Waited 6s · Ran in 1m 19s

Intel GPU Test

  bash .buildkite/scripts/hardware_ci/run-xpu-test.sh

Waited 12s · Ran in 5m 15s
Harry Mellor unblocked Run IBM Power(ppc64le) CPU Test
IBM Power(ppc64le) CPU Test

  bash .buildkite/scripts/hardware_ci/run-cpu-test-ppc64le.sh

Waited 3s · Ran in 22m 52s
TPU V0 Test

  yes | docker system prune -a && if [[ -f ".buildkite/scripts/hardware_ci/run-tpu-test.sh" ]]; then bash .buildkite/scripts/hardware_ci/run-tpu-test.sh; fi

Waited 8s · Ran in 1s

TPU V1 Test

  if [[ -f ".buildkite/scripts/hardware_ci/run-tpu-v1-test.sh" ]]; then bash .buildkite/scripts/hardware_ci/run-tpu-v1-test.sh; fi && yes | docker system prune -a

Waited 10s · Ran in 35m 21s
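Both TPU steps above guard the hardware script with `[[ -f … ]]` so the step still succeeds on branches where the script does not exist yet (which is why TPU V0 can finish in 1s). A minimal sketch of that guard; `run_if_present` is a hypothetical helper, not part of the vLLM CI scripts:

```shell
#!/bin/bash
# run_if_present SCRIPT — run SCRIPT if it exists in the checkout,
# otherwise report a skip and exit successfully.
run_if_present() {
  local script="$1"
  if [[ -f "$script" ]]; then
    bash "$script"
  else
    echo "skipped: $script not found"
  fi
}

# Demo with a throwaway script file:
tmp="$(mktemp)"
echo 'echo ran' > "$tmp"
run_if_present "$tmp"            # prints: ran
run_if_present /no/such/file.sh  # prints: skipped: /no/such/file.sh not found
```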
Total Job Run Time: 3h 53m