Nightly

Public

Tests that are too slow or non-deterministic for the regular Test pipeline

Scheduled build

#11818

main/0261db8e3c

Failed in 7h 36m

Miri test (full)

Extended SSH connection tests

CRDB rolling restarts

PubSub disruption

Test for incident 70

Tests for balancerd

CRDB / Persist backup and restore

Postgres / Persist backup and restore

Created Sun 13th Apr at 11:30 PM

Triggered from Pipeline Schedule

Feature benchmark against merge base or 'latest' 2 failed, main history:

Unknown error in Scenario 'ManyKafkaSourcesOnSameCluster':

New regression against v0.141.2

NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
ManyKafkaSourcesOnSameCluster       | wallclock       |          26.855 |          26.192 |   s    |    10%     |      no       | worse:   2.5% slower
ManyKafkaSourcesOnSameCluster       | memory_mz       |        2561.569 |        2535.820 |   MB   |    20%     |      no       | worse:   1.0% more
ManyKafkaSourcesOnSameCluster       | memory_clusterd |          70.353 |          34.466 |   MB   |    50%     |    !!YES!!    | worse:   2.0 TIMES more
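
The Regression? column can be read as a ratio test: a row is flagged when THIS is worse than OTHER by more than the THRESHOLD. A minimal sketch of that reading follows. The function name `is_regression` is hypothetical and not taken from the feature-benchmark framework, and the rule itself is an assumption inferred from the table. The printed values are rounded, so recomputed percentages may differ from the table in the last digit.

```python
def is_regression(this: float, other: float, threshold: float) -> bool:
    """Return True when `this` is worse than `other` by more than `threshold`.

    Assumes lower is better (seconds, MB) and that `threshold` is a
    fraction, e.g. 0.10 for the 10% wallclock threshold.
    """
    return this / other - 1.0 > threshold


# Rows copied from the ManyKafkaSourcesOnSameCluster table above:
# (measurement type, THIS, OTHER, threshold as a fraction)
rows = [
    ("wallclock", 26.855, 26.192, 0.10),
    ("memory_mz", 2561.569, 2535.820, 0.20),
    ("memory_clusterd", 70.353, 34.466, 0.50),
]

flags = {name: is_regression(this, other, thr) for name, this, other, thr in rows}
for name, flagged in flags.items():
    print(f"{name}: {'regression' if flagged else 'ok'}")
```

Under this assumption only memory_clusterd is flagged, which agrees with the table: roughly twice the memory of the baseline against a 50% threshold.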

Test details & reproducer

Simple benchmark of mostly individual queries using testdrive. Can find wallclock/memory regressions in single-connection query executions; not suitable for concurrency testing.

BUILDKITE_PARALLEL_JOB=1 BUILDKITE_PARALLEL_JOB_COUNT=8 bin/mzcompose --find feature-benchmark run default --other-tag common-ancestor

Platform checks upgrade in Cloudtest/K8s failed, main history:

Unknown error in journalctl-merge.log:

Apr 14 00:00:01 ip-10-61-119-168.ec2.internal kernel: Memory cgroup out of memory: Killed process 108444 (clusterd) total-vm:22466392kB, anon-rss:4097280kB, file-rss:7212kB, shmem-rss:0kB, UID:999 pgtables:28588kB oom_score_adj:870
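
To make the kernel line easier to read, the sketch below extracts the killed process and its resident anonymous memory. The regular expression and the helper name `parse_oom_kill` are illustrative assumptions that cover only the fields visible in this log line; they are not part of the test tooling.

```python
import re

# Matches the "Killed process ..." portion of a kernel OOM message.
OOM_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, anon-rss:(?P<anon_rss_kb>\d+)kB"
)


def parse_oom_kill(line: str):
    """Return pid, process name, and memory figures, or None if no match."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "name": m["name"],
        "total_vm_kb": int(m["total_vm_kb"]),
        "anon_rss_mib": int(m["anon_rss_kb"]) / 1024,  # kB -> MiB
    }


line = (
    "Apr 14 00:00:01 ip-10-61-119-168.ec2.internal kernel: Memory cgroup out of "
    "memory: Killed process 108444 (clusterd) total-vm:22466392kB, "
    "anon-rss:4097280kB, file-rss:7212kB, shmem-rss:0kB, UID:999 "
    "pgtables:28588kB oom_score_adj:870"
)
info = parse_oom_kill(line)
print(info)
```

For this line the anonymous RSS works out to 4001.25 MiB, so clusterd held about 3.9 GiB of anonymous memory when the cgroup killed it.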

Test details & reproducer

bin/pytest --junitxml=junit_cloudtest_0196317f-6800-4e12-bf3c-fbd87a0153df.xml -m=long test/cloudtest/test_upgrade.py --splits=1 --group=1

Feature benchmark against merge base or 'latest' 6 failed, main history:

Unknown error in Scenario 'PgCdcInitialLoad':

New regression against v0.141.2

NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
PgCdcInitialLoad                    | wallclock       |           1.290 |           1.145 |   s    |    10%     |    !!YES!!    | worse:  12.6% slower
PgCdcInitialLoad                    | memory_mz       |         772.285 |         751.781 |   MB   |    20%     |      no       | worse:   2.7% more
PgCdcInitialLoad                    | memory_clusterd |          64.106 |          63.686 |   MB   |    50%     |      no       | worse:   0.7% more

Test details & reproducer

Simple benchmark of mostly individual queries using testdrive. Can find wallclock/memory regressions in single-connection query executions; not suitable for concurrency testing.

BUILDKITE_PARALLEL_JOB=5 BUILDKITE_PARALLEL_JOB_COUNT=8 bin/mzcompose --find feature-benchmark run default --other-tag common-ancestor

🏎️ testdrive with SIZE 8 and blob store failed, main history:

Unknown error in introspection-sources.td:

introspection-sources.td:120:1: non-matching rows: expected:
[["<TIMESTAMP>", "1", "true"]]
got:
[]
Poor diff:
- <TIMESTAMP> 1 true

     |
  21 | $ postgres-execute c ... [rest of line truncated for security]
 119 |   WHERE f.export_id = s.id AND time > 0)
 120 | > FETCH 1 c WITH (timeout='20s')
     | ^

Test details & reproducer

Testdrive is the basic framework and language for defining product tests under the expected-result/actual-result (aka golden testing) paradigm. A query is retried until it produces the desired result.
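
The retry-until-match behavior described above can be sketched as a small polling loop. This illustrates the idea only and is not testdrive's implementation: `retry_until_match`, its parameters, and the fake query are all hypothetical.

```python
import time


def retry_until_match(run_query, expected, timeout_s=20.0, interval_s=0.5):
    """Call `run_query` until it returns `expected` or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        rows = run_query()
        if rows == expected:
            return rows
        if time.monotonic() >= deadline:
            # Mirrors the shape of the failure shown in the log above.
            raise AssertionError(
                f"non-matching rows: expected {expected!r}, got {rows!r}"
            )
        time.sleep(interval_s)


# Fake query: returns no rows twice, then the expected row.
attempts = iter([[], [], [["1", "true"]]])
result = retry_until_match(
    lambda: next(attempts), [["1", "true"]], timeout_s=5.0, interval_s=0.01
)
print(result)
```

A failure like the one in introspection-sources.td corresponds to the timeout branch: the query kept returning no rows until the 20s budget ran out.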

bin/mzcompose --find testdrive run default --default-size=8 --azurite

Parallel Workload (0dt deploy) succeeded with known error logs, main history:

Known issue parallel-workload: 0dt: thread 'coordinator' panicked at src/storage-controller/src/lib.rs:703:17: dependency since has advanced past dependent (u417) upper (#8425) in services.log:

parallel-workload-materialized2-1    | 2025-04-13T23:54:11.995635Z  thread 'coordinator' panicked at src/storage-controller/src/lib.rs:974:17: dependency since has advanced past dependent (u357) upper
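
The known-issue title cites lib.rs:703:17 and u417, while the logged panic is at lib.rs:974:17 with u357, so the match apparently does not depend on the line number or the object ID. A minimal sketch of such pattern-based matching follows. The regular expression is an assumption written for illustration, not the pattern the CI tooling uses.

```python
import re

# Hypothetical pattern: wildcard the source position and the object ID.
KNOWN_ISSUE_RE = re.compile(
    r"thread 'coordinator' panicked at src/storage-controller/src/lib\.rs:\d+:\d+: "
    r"dependency since has advanced past dependent \(u\d+\) upper"
)

issue_title = (
    "thread 'coordinator' panicked at src/storage-controller/src/lib.rs:703:17: "
    "dependency since has advanced past dependent (u417) upper"
)
log_line = (
    "parallel-workload-materialized2-1    | 2025-04-13T23:54:11.995635Z  "
    "thread 'coordinator' panicked at src/storage-controller/src/lib.rs:974:17: "
    "dependency since has advanced past dependent (u357) upper"
)

matches_title = KNOWN_ISSUE_RE.search(issue_title) is not None
matches_log = KNOWN_ISSUE_RE.search(log_line) is not None
print(matches_title, matches_log)
```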

Test details & reproducer

Runs a randomized parallel workload that stresses all parts of Materialize; it mostly finds panics and unexpected errors. See zippy for a sequential randomized test that can verify correctness.

bin/mzcompose --find parallel-workload run default --runtime=1500 --scenario=0dt-deploy --threads=16

Checks 0dt upgrade across four versions 2 succeeded with known error logs, main history:

Known issue parallel-workload: 0dt: thread 'coordinator' panicked at src/storage-controller/src/lib.rs:703:17: dependency since has advanced past dependent (u417) upper (#8425) in services.log:

platform-checks-mz_4-1              | 2025-04-13T23:50:47.947402Z  thread 'coordinator' panicked at src/storage-controller/src/lib.rs:974:17: dependency since has advanced past dependent (u407) upper

Test details & reproducer

Write a single set of .td fragments for a particular feature or functionality and then have Zippy execute them in upgrade, 0dt-upgrade, restart, recovery and failure contexts.

BUILDKITE_PARALLEL_JOB=1 BUILDKITE_PARALLEL_JOB_COUNT=2 bin/mzcompose --find platform-checks run default --scenario=ZeroDowntimeUpgradeEntireMzFourVersions --seed=0196317d-36a1-4763-b5c3-216fb0acb809

Checks 0dt upgrade across two versions 2 succeeded with known error logs, main history:

Known issue parallel-workload: 0dt: thread 'coordinator' panicked at src/storage-controller/src/lib.rs:703:17: dependency since has advanced past dependent (u417) upper (#8425) in services.log:

platform-checks-mz_4-1              | 2025-04-13T23:51:24.083458Z  thread 'coordinator' panicked at src/storage-controller/src/lib.rs:974:17: dependency since has advanced past dependent (u407) upper

Test details & reproducer

Write a single set of .td fragments for a particular feature or functionality and then have Zippy execute them in upgrade, 0dt-upgrade, restart, recovery and failure contexts.

BUILDKITE_PARALLEL_JOB=1 BUILDKITE_PARALLEL_JOB_COUNT=2 bin/mzcompose --find platform-checks run default --scenario=ZeroDowntimeUpgradeEntireMzTwoVersions --seed=0196317d-36a1-4763-b5c3-216fb0acb809

Checks 0dt restart of the entire Mz with forced migrations 2 succeeded with known error logs, main history:

Known issue parallel-workload: 0dt: thread 'coordinator' panicked at src/storage-controller/src/lib.rs:703:17: dependency since has advanced past dependent (u417) upper (#8425) in services.log:

platform-checks-mz_4-1              | 2025-04-13T23:50:39.994394Z  thread 'coordinator' panicked at src/storage-controller/src/lib.rs:974:17: dependency since has advanced past dependent (u422) upper

Test details & reproducer

Write a single set of .td fragments for a particular feature or functionality and then have Zippy execute them in upgrade, 0dt-upgrade, restart, recovery and failure contexts.

BUILDKITE_PARALLEL_JOB=1 BUILDKITE_PARALLEL_JOB_COUNT=2 bin/mzcompose --find platform-checks run default --scenario=ZeroDowntimeRestartEntireMzForcedMigrations --seed=0196317d-36a1-4763-b5c3-216fb0acb809

Checks 0dt upgrade, whole-Mz restart 2 succeeded with known error logs, main history:

Known issue parallel-workload: 0dt: thread 'coordinator' panicked at src/storage-controller/src/lib.rs:703:17: dependency since has advanced past dependent (u417) upper (#8425) in services.log:

platform-checks-mz_3-1              | 2025-04-13T23:50:14.662706Z  thread 'coordinator' panicked at src/storage-controller/src/lib.rs:974:17: dependency since has advanced past dependent (u425) upper

Test details & reproducer

Write a single set of .td fragments for a particular feature or functionality and then have Zippy execute them in upgrade, 0dt-upgrade, restart, recovery and failure contexts.

BUILDKITE_PARALLEL_JOB=1 BUILDKITE_PARALLEL_JOB_COUNT=2 bin/mzcompose --find platform-checks run default --scenario=ZeroDowntimeUpgradeEntireMz --seed=0196317d-36a1-4763-b5c3-216fb0acb809

2/8

Feature benchmark against merge base or 'latest' 2

Ran in 5h 40m

Platform checks upgrade in Cloudtest/K8s

Ran in 4h 3m

Total Job Run Time: 5d 22h