🧪

CI

Public

CI Testing of the vLLM Repo

Queue Paused

Kevin Luu•Friday at 6:51 PM

add test

#11782

/

bigPYJ1151:moe/46621ac83e(#11831)

Failed in 1h 18m

Documentation Build

Async Engine, Inputs, Utils, Worker Test

Run Python-only Installation Test

Python-only Installation Test

Basic Correctness Test

Chunked Prefill Test

Run Core Test

Entrypoints Test

Run Distributed Tests (4 GPUs)

Distributed Tests (4 GPUs)

Metrics, Tracing Test

Regression Test

Run Examples Test

Prefix Caching Test

LogitsProcessor Test

Run Speculative decoding tests

Speculative decoding tests

Run LoRA Test %N

PyTorch Fullgraph Smoke Test

PyTorch Fullgraph Test

Run Kernels Test %N

Run Tensorizer Test

Tensorizer Test

Run Benchmarks

Run Quantization Test

Quantization Test

Run LM Eval Small Models

LM Eval Small Models

Encoder Decoder tests

OpenAI-Compatible Tool Use

Basic Models Test

Language Models Test (Standard)

Run Language Models Test (Extended)

Language Models Test (Extended)

Multi-Modal Models Test (Standard)

Run Multi-Modal Models Test (Extended) 1

Multi-Modal Models Test (Extended) 1

Run Multi-Modal Models Test (Extended) 2

Multi-Modal Models Test (Extended) 2

Run Custom Models Test

Custom Models Test

Run Distributed Comm Ops Test

Distributed Comm Ops Test

Run 2 Node Tests (4 GPUs in total)

2 Node Tests (4 GPUs in total)

Run Distributed Tests (2 GPUs)

Distributed Tests (2 GPUs)

Run Plugin Tests (2 GPUs)

Plugin Tests (2 GPUs)

Run Multi-step Tests (4 GPUs)

Multi-step Tests (4 GPUs)

Run Pipeline Parallelism Test

Pipeline Parallelism Test

Run LoRA TP Test (Distributed)

LoRA TP Test (Distributed)

Weight Loading Multiple GPU Test

Run Weight Loading Multiple GPU Test - Large Models

Weight Loading Multiple GPU Test - Large Models

Run Distributed Tests (A100)

Distributed Tests (A100)

Run LM Eval Large Models

LM Eval Large Models

IBM Power(ppc64le) CPU Test

Michael Goin

Created Thu 9th Jan at 11:08 PM

Triggered from Webhook

Entrypoints Test

Ran in 1h 16m

Samplers Test

Ran in 34m 11s

AMD: Entrypoints Test

bash .buildkite/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s entrypoints/llm --ignore=entrypoints/llm/test_lazy_outlines.py --ignore=entrypoints/llm/test_generate.py --ignore=entrypoints/llm/test_generate_multiple_loras.py --ignore=entrypoints/llm/test_guided_generate.py && pytest -v -s entrypoints/llm/test_lazy_outlines.py && pytest -v -s entrypoints/llm/test_generate.py && pytest -v -s entrypoints/llm/test_generate_multiple_loras.py && pytest -v -s entrypoints/llm/test_guided_generate.py && pytest -v -s entrypoints/openai --ignore=entrypoints/openai/test_oot_registration.py && pytest -v -s entrypoints/test_chat_utils.py && pytest -v -s entrypoints/offline_mode"

Ran in 3m 27s

AMD: Kernels Test %N

bash .buildkite/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s kernels --shard-id=$BUILDKITE_PARALLEL_JOB --num-shards=$BUILDKITE_PARALLEL_JOB_COUNT"

Ran in 2m 51s
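
The Kernels Test command above passes `--shard-id` and `--num-shards`, filled from Buildkite's parallel job index and job count, so each parallel job runs only part of the `kernels` suite. The following is a minimal sketch of one way such a split can work, assuming a stable hash of each test ID taken modulo the shard count. The function names and the example test IDs are hypothetical, and the hashing scheme is an assumption for illustration, not necessarily what the sharding plugin used in this pipeline does.

```python
import hashlib


def shard_of(test_id: str, num_shards: int) -> int:
    """Map a test ID to a shard index in [0, num_shards) with a stable hash.

    Assumption: SHA-256 of the ID modulo the shard count. A stable hash
    (unlike Python's built-in hash()) gives the same answer on every machine.
    """
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def select_tests(test_ids, shard_id: int, num_shards: int):
    """Return the subset of test_ids that belongs to the given shard."""
    if num_shards < 1 or not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must satisfy 0 <= shard_id < num_shards")
    return [t for t in test_ids if shard_of(t, num_shards) == shard_id]


if __name__ == "__main__":
    # Hypothetical test IDs, for illustration only.
    tests = [f"kernels/test_{i}.py::test_case" for i in range(8)]
    for shard in range(2):
        print(shard, select_tests(tests, shard, 2))
```

Because the assignment depends only on the test ID and the shard count, every job can compute its own subset independently, and the subsets together cover each test exactly once. Shard sizes are only roughly balanced under this scheme.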

Intel CPU Test

bash .buildkite/run-cpu-test.sh

Ran in 34m 16s

Total Job Run Time: 11h 22m