🧪

CI

Public

CI Testing of the vLLM Repo

Queue Paused

Kevin Luu•Fri 11th Apr at 6:51 PM

update docstring

#12063 / heheda12345:v1_kv_init / 2aa75096d3 (#11960)

Failed in 1h 44m

Documentation Build

Async Engine, Inputs, Utils, Worker Test

Run Python-only Installation Test

Python-only Installation Test

Basic Correctness Test

Chunked Prefill Test

Run Core Test

Entrypoints Test

Run Distributed Tests (4 GPUs)

Distributed Tests (4 GPUs)

Metrics, Tracing Test

Regression Test

Run Examples Test

Prefix Caching Test

Run Samplers Test

Run LogitsProcessor Test

LogitsProcessor Test

Run Speculative decoding tests

Speculative decoding tests

Run LoRA Test %N

PyTorch Fullgraph Smoke Test

PyTorch Fullgraph Test

Run Tensorizer Test

Tensorizer Test

Run Benchmarks

Run Quantization Test

Quantization Test

Run LM Eval Small Models

LM Eval Small Models

Encoder Decoder tests

OpenAI-Compatible Tool Use

Basic Models Test

Language Models Test (Standard)

Run Language Models Test (Extended)

Language Models Test (Extended)

Multi-Modal Models Test (Standard)

Run Multi-Modal Models Test (Extended) 1

Multi-Modal Models Test (Extended) 1

Run Multi-Modal Models Test (Extended) 2

Multi-Modal Models Test (Extended) 2

Run Custom Models Test

Custom Models Test

Run Distributed Comm Ops Test

Distributed Comm Ops Test

Run 2 Node Tests (4 GPUs in total)

2 Node Tests (4 GPUs in total)

Run Distributed Tests (2 GPUs)

Distributed Tests (2 GPUs)

Run Plugin Tests (2 GPUs)

Plugin Tests (2 GPUs)

Run Multi-step Tests (4 GPUs)

Multi-step Tests (4 GPUs)

Run Pipeline Parallelism Test

Pipeline Parallelism Test

Run LoRA TP Test (Distributed)

LoRA TP Test (Distributed)

Weight Loading Multiple GPU Test

Run Weight Loading Multiple GPU Test - Large Models

Weight Loading Multiple GPU Test - Large Models

Run Distributed Tests (A100)

Distributed Tests (A100)

Run LM Eval Large Models

LM Eval Large Models

IBM Power(ppc64le) CPU Test

Chen Zhang

Created Thu 16th Jan at 5:21 PM

Triggered from Webhook

V1 Test

Ran in 8m 27s

AMD: Entrypoints Test

bash .buildkite/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s entrypoints/llm --ignore=entrypoints/llm/test_lazy_outlines.py --ignore=entrypoints/llm/test_generate.py --ignore=entrypoints/llm/test_generate_multiple_loras.py --ignore=entrypoints/llm/test_guided_generate.py && pytest -v -s entrypoints/llm/test_lazy_outlines.py && pytest -v -s entrypoints/llm/test_generate.py && pytest -v -s entrypoints/llm/test_generate_multiple_loras.py && pytest -v -s entrypoints/llm/test_guided_generate.py && pytest -v -s entrypoints/openai --ignore=entrypoints/openai/test_oot_registration.py && pytest -v -s entrypoints/test_chat_utils.py && pytest -v -s entrypoints/offline_mode"

Ran in 3m 25s

AMD: Kernels Test %N

bash .buildkite/run-amd-test.sh "(command rocm-smi || true) && export VLLM_LOGGING_LEVEL=DEBUG && export VLLM_ALLOW_DEPRECATED_BEAM_SEARCH=1 && cd /vllm-workspace/tests ; pytest -v -s kernels --shard-id=$BUILDKITE_PARALLEL_JOB --num-shards=$BUILDKITE_PARALLEL_JOB_COUNT"

Ran in 2m 51s
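
The AMD: Kernels Test command passes `--shard-id=$BUILDKITE_PARALLEL_JOB` and `--num-shards=$BUILDKITE_PARALLEL_JOB_COUNT` to pytest, so each parallel job runs only its own slice of the kernel tests. The page does not say how the slice is chosen; the sketch below shows one plausible scheme (a stable hash of the test ID, modulo the shard count) and is not necessarily what the CI's pytest plugin does. The helper names `shard_of` and `select_for_shard` and the sample test IDs are hypothetical.

```python
import hashlib
import os


def shard_of(test_id: str, num_shards: int) -> int:
    """Map a test ID to a shard index in [0, num_shards) with a stable hash."""
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def select_for_shard(test_ids, shard_id: int, num_shards: int):
    """Return the subset of test_ids assigned to shard_id (hypothetical helper)."""
    if num_shards < 1 or not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must satisfy 0 <= shard_id < num_shards")
    return [t for t in test_ids if shard_of(t, num_shards) == shard_id]


if __name__ == "__main__":
    # Buildkite sets these for parallel steps; the job index is zero-based.
    # The defaults make the script run unsharded outside CI.
    shard_id = int(os.environ.get("BUILDKITE_PARALLEL_JOB", "0"))
    num_shards = int(os.environ.get("BUILDKITE_PARALLEL_JOB_COUNT", "1"))

    # Illustrative test IDs only; not taken from the vLLM repository.
    tests = [f"kernels/test_example_{i}.py::test_case" for i in range(8)]
    for test_id in select_for_shard(tests, shard_id, num_shards):
        print(test_id)
```

Because the hash depends only on the test ID, every job computes the same partition independently, and each test lands in exactly one shard. The trade-off is that shards are balanced by test count only on average, not by runtime.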

Neuron Test

bash .buildkite/run-neuron-test.sh

Ran in 5s

Intel CPU Test

bash .buildkite/run-cpu-test.sh

Ran in 9m 40s

Total Job Run Time: 14h 50m