Engineering Council Test Reliability Report

Executive Snapshot

Daily Runs: 7
Daily Green: 0/7
Avg Daily Runtime: 27m 38s
Smoke Attempts: 16
Smoke Green: 15/16
Avg Smoke Runtime: 4m 11s
Median Smoke Time: 4m 34s
Current Green Streak: 0

Executive Analysis

Bottom line: the regression system is informative but not calm. The data suggest repeatable problem areas rather than random breakage, which means focused ownership should move the needle quickly.

What Matters

Daily regression passed 0 of 7 runs (0.0%), with a current green streak of 0 and a best streak of 0 in this window. The latest daily run (pipeline 162155) failed, so the week ends in a red state rather than a clean one. Two failed runs (pipelines 162153 and 162155) never reached complete daily-suite counts, which points to infrastructure or setup noise mixed into the product signal.
Smoke passed 15 of 16 attempts (93.8%) across 12 production pipelines. The one failed attempt (Frontend smoke, pipeline 162039) never reached test execution and reported no pass or fail counts.
Failure concentration is not random: Frontend has both the highest strict failure ratio (3.29%) and the broadest non-pass footprint (4.33%).
Frontend is the weakest smoke surface in this window at 11/12 green (91.7%).
Daily-suite runtime averaged 27m 38s, while observed daily test volume fell from 520 in the first run to 456 in the last two runs, both of which reported incomplete suite counts.
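
The pass rates and streak figures above can be reproduced from the run statuses alone. The following is a minimal sketch, assuming statuses are listed oldest first and that a "green" run is one whose status is PASSED; the function names are illustrative and are not taken from the reporting tool.

```python
from itertools import groupby


def pass_rate(statuses):
    """Share of runs with status PASSED, as a percentage."""
    if not statuses:
        return 0.0
    return 100.0 * sum(s == "PASSED" for s in statuses) / len(statuses)


def current_green_streak(statuses):
    """Number of consecutive PASSED runs at the end of the window."""
    streak = 0
    for status in reversed(statuses):
        if status != "PASSED":
            break
        streak += 1
    return streak


def best_green_streak(statuses):
    """Length of the longest unbroken run of PASSED statuses."""
    return max(
        (sum(1 for _ in group)
         for status, group in groupby(statuses)
         if status == "PASSED"),
        default=0,
    )


# Statuses from this window, oldest first.
daily = ["FAILED"] * 7
smoke = ["PASSED"] * 15 + ["FAILED"]

print(f"daily: {pass_rate(daily):.1f}% green, "
      f"current streak {current_green_streak(daily)}, "
      f"best streak {best_green_streak(daily)}")
print(f"smoke: {pass_rate(smoke):.1f}% green")
```

With this window's data the sketch gives 0.0% for daily with both streaks at 0, and 93.8% for smoke.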

Engineering Analysis

A release gate should fail loudly for product regressions and quietly for infrastructure noise. Rerun recoveries plus incomplete daily or smoke attempts suggest those two failure modes are still partially mixed together.
The failure profile is concentrated enough to act on. Frontend is carrying the strongest signal on both the strict failure and non-pass ratios, which means reliability work should be assigned by category ownership instead of treating the suite as one undifferentiated problem.
The broader daily suite is carrying more instability than smoke, which usually means product regressions are escaping into wider coverage areas even when the narrow deploy gate looks acceptable.
The daily suite is now large enough that runtime itself is becoming a management variable at 27m 38s average duration. At that size, every additional flaky or redundant test has a measurable cost on feedback speed.

Recommended Actions

Split incomplete execution failures from real assertion failures in the report narrative. Setup breakage should stay visible, but it should not look identical to a product regression in the executive readout.
Assign one owner to Frontend for the next cycle and expect a short written burn-down: top failing tests, suspected root causes, flake versus regression breakdown, and what gets fixed or quarantined first.
Treat the daily regression suite like an operations queue until it is calm again: triage failures after each red run, close known-noise items fast, and avoid letting multiple unrelated red signals pile up between runs.
Put Frontend smoke under closer guardrails for the next release cycle. It is the best place to improve first-pass deploy confidence quickly.
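
One way to approach the first action is to classify each red attempt before it reaches the narrative. The sketch below is an assumption about how that split could look, not the report generator's logic: the class labels ("no-execution", "incomplete-with-failures", "assertion-failure") and the `Attempt` fields are hypothetical names, and the only inputs used are those the report already shows (status, pass and fail counts, and the incomplete flag).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attempt:
    pipeline: int
    status: str                # "PASSED" or "FAILED", as shown in the report
    passed: Optional[int]      # None when the report shows "n/a"
    failed: Optional[int]      # None when the report shows "n/a"
    incomplete: bool = False   # True when flagged "Incomplete suite counts"


def classify(attempt: Attempt) -> str:
    """Return a hypothetical failure class for the report narrative."""
    if attempt.status == "PASSED":
        return "green"
    if attempt.passed is None or attempt.failed is None:
        # The job ended before any test reported a result.
        return "no-execution"
    if attempt.incomplete:
        # Tests failed, but part of the suite never reported counts.
        return "incomplete-with-failures"
    return "assertion-failure"


# Three red attempts taken from this window.
examples = [
    Attempt(161855, "FAILED", 488, 30),
    Attempt(162153, "FAILED", 425, 30, incomplete=True),
    Attempt(162039, "FAILED", None, None),
]
for attempt in examples:
    print(attempt.pipeline, classify(attempt))
```

Keeping "incomplete-with-failures" as its own class keeps setup breakage visible without letting it pass as a pure product regression.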

Improvement Ideas

Introduce a small reliability budget for tests: every flaky or quarantined case needs an owner and an expiry, and the team should review that budget weekly the same way it reviews bugs or incidents.
Track first-fail to root-cause time as a core metric. Fast diagnosis is as important as raw pass rate because the practical value of a test gate depends on how quickly it helps the team recover.
Define a runtime budget per suite and require justification when test count or duration grows. Reliable feedback systems stay trusted when they remain both stable and proportionate.
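
The reliability budget idea can be checked mechanically in the weekly review. This is a minimal sketch under stated assumptions: the `QuarantinedTest` record, the test names, the owners, and the dates are all hypothetical placeholders, and the rule applied is only the one stated above (every quarantined case needs an owner and an unexpired expiry date).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class QuarantinedTest:
    name: str
    owner: Optional[str]   # None when nobody has claimed the test
    expires: date          # date by which it must be fixed or removed


def budget_violations(entries, today):
    """Return (name, reason) pairs for entries that break the budget rule."""
    violations = []
    for entry in entries:
        if not entry.owner:
            violations.append((entry.name, "no owner"))
        if entry.expires < today:
            violations.append((entry.name, "expired"))
    return violations


# Hypothetical quarantine list; none of these names come from the report.
entries = [
    QuarantinedTest("checkout_totals_spec", "alice", date(2026, 7, 10)),
    QuarantinedTest("invoice_pdf_spec", None, date(2026, 7, 3)),
    QuarantinedTest("login_redirect_spec", "bob", date(2026, 6, 20)),
]
for name, reason in budget_violations(entries, today=date(2026, 6, 27)):
    print(f"{name}: {reason}")
```

A check of this kind could run alongside the report so that an expired quarantine surfaces as a review item.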

Category Execution Ratios

How computed

Category total executions means the sum of that category's observed test executions across every daily-suite run in the selected window.

Strict Failure Ratio = failed executions for that category divided by total executions for that category across the window.

Non-pass Ratio = (failed + pending + skipped) executions for that category divided by total executions for that category across the window.

Example: if Billing executed 800 times across the week and 2 of those executions failed, Billing strict failure ratio is 0.25%. That does not mean 0.25% of pipelines failed; it means 0.25% of observed Billing executions ended in failed.
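
The two definitions above can be written as a short sketch. The function names are illustrative, and the zero-total guard is an assumption about how an empty category should be reported. The inputs are the category totals from this window.

```python
def strict_failure_ratio(failed, total):
    """Failed executions as a percentage of all executions in the category."""
    return 100.0 * failed / total if total else 0.0


def non_pass_ratio(failed, pending, skipped, total):
    """Failed, pending, and skipped executions as a percentage of the total."""
    return 100.0 * (failed + pending + skipped) / total if total else 0.0


# Category aggregates for this window.
categories = {
    "Billing": {"total": 796, "failed": 20, "pending": 0, "skipped": 0},
    "Web": {"total": 70, "failed": 0, "pending": 0, "skipped": 0},
    "Frontend": {"total": 2218, "failed": 73, "pending": 0, "skipped": 23},
    "Library": {"total": 432, "failed": 2, "pending": 0, "skipped": 0},
}

for name, c in categories.items():
    strict = strict_failure_ratio(c["failed"], c["total"])
    non_pass = non_pass_ratio(c["failed"], c["pending"], c["skipped"], c["total"])
    print(f"{name}: strict {strict:.2f}%, non-pass {non_pass:.2f}%")
```

The two ratios differ only for Frontend, because it is the only category with skipped executions in this window.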

Strict Failure Ratio

Share of category executions that ended in failed across all daily runs in this window.

Billing: 2.51%
Web: 0.00%
Frontend: 3.29%
Library: 0.46%

Non-pass Ratio

Share of category executions that ended in failed, pending, or skipped across all daily runs in this window.

Billing: 2.51%
Web: 0.00%
Frontend: 4.33%
Library: 0.46%

Category Aggregate Table

Category	Total	Failed	Pending	Skipped	Strict Failure Ratio	Non-pass Ratio	Runs With Failures
Billing	796	20	0	0	2.51%	2.51%	2
Web	70	0	0	0	0.00%	0.00%	0
Frontend	2218	73	0	23	3.29%	4.33%	7
Library	432	2	0	0	0.46%	0.46%	2

Recent Runs

Recent Daily Suite Runs

Date	Pipeline	Suites	Status	Summary
2026-06-21 18:14	161024	Billing, Web, Frontend, Library	FAILED	Total 520 | Passed 509 | Failed 11
2026-06-22 18:13	161190	Billing, Web, Frontend, Library	FAILED	Total 521 | Passed 517 | Failed 2
2026-06-23 18:14	161325	Billing, Web, Frontend, Library	FAILED	Total 521 | Passed 516 | Failed 4
2026-06-24 20:08	161648	Billing, Web, Frontend, Library	FAILED	Total 521 | Passed 492 | Failed 14
2026-06-25 18:14	161855	Billing, Web, Frontend, Library	FAILED	Total 521 | Passed 488 | Failed 30
2026-06-26 18:16	162153	Billing, Web, Frontend, Library	FAILED	Total 456 | Passed 425 | Failed 30 | Incomplete suite counts
2026-06-27 18:14	162155	Billing, Web, Frontend, Library	FAILED	Total 456 | Passed 451 | Failed 4 | Incomplete suite counts
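
The per-run summaries can be cross-checked against the category aggregates, since category totals are defined as sums across every daily-suite run. The sketch below is one way to do that. Reading the gap between a run's total and its passed plus failed counts as pending or skipped executions is an assumption, not something the report states.

```python
# (pipeline, total, passed, failed) for each daily run in this window.
runs = [
    (161024, 520, 509, 11),
    (161190, 521, 517, 2),
    (161325, 521, 516, 4),
    (161648, 521, 492, 14),
    (161855, 521, 488, 30),
    (162153, 456, 425, 30),
    (162155, 456, 451, 4),
]

# Category aggregates reported for the same window.
category_totals = {"Billing": 796, "Web": 70, "Frontend": 2218, "Library": 432}
category_failed = {"Billing": 20, "Web": 0, "Frontend": 73, "Library": 2}

total_executions = sum(total for _, total, _, _ in runs)
total_failed = sum(failed for _, _, _, failed in runs)
# Executions that neither passed nor failed (assumed pending or skipped).
unaccounted = sum(total - passed - failed for _, total, passed, failed in runs)

print("executions:", total_executions, "vs", sum(category_totals.values()))
print("failed:", total_failed, "vs", sum(category_failed.values()))
print("neither passed nor failed:", unaccounted)
```

For this window both sides agree at 3516 executions and 95 failures, and the 23 executions that neither passed nor failed match the 23 skipped executions reported for Frontend.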

Recent Smoke Attempts

Date	Suite	Pipeline	Job	Status	Passed	Failed	Duration
2026-06-22 11:28	University	161065	University smoke	PASSED	60	0	3m 58s
2026-06-22 11:32	Frontend	161065	Frontend smoke	PASSED	110	0	4m 35s
2026-06-22 13:50	Frontend	161134	Frontend smoke	PASSED	110	0	4m 42s
2026-06-24 11:16	Frontend	161380	Frontend smoke	PASSED	110	0	4m 42s
2026-06-24 15:00	Frontend	161543	Frontend smoke	PASSED	110	0	4m 49s
2026-06-25 08:06	Frontend	161611	Frontend smoke	PASSED	110	0	4m 53s
2026-06-25 10:50	Frontend	161673	Frontend smoke	PASSED	110	0	4m 34s
2026-06-25 14:43	Frontend	161779	Frontend smoke	PASSED	110	0	4m 41s
2026-06-25 16:36	University	161825	University smoke	PASSED	60	0	3m 46s
2026-06-25 16:40	Frontend	161825	Frontend smoke	PASSED	110	0	4m 32s
2026-06-25 17:08	University	161842	University smoke	PASSED	60	0	3m 33s
2026-06-25 17:12	Frontend	161842	Frontend smoke	PASSED	110	0	4m 45s
2026-06-25 22:41	University	161863	University smoke	PASSED	60	0	3m 24s
2026-06-25 22:44	Frontend	161863	Frontend smoke	PASSED	110	0	4m 34s
2026-06-26 10:29	Frontend	161923	Frontend smoke	PASSED	110	0	4m 30s
2026-06-26 14:10	Frontend	162039	Frontend smoke	FAILED	n/a	n/a	1m 01s
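
The smoke runtime figures in the snapshot can be recomputed from the Duration column. This is a minimal sketch, assuming durations always follow the "Xm YYs" form shown in the table and that results are rounded to the nearest second; the reporting tool may round differently or hold sub-second precision, so per-suite figures can differ by a second.

```python
import re
from statistics import mean, median


def parse_duration(text):
    """Convert a string such as '4m 35s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m (\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def format_duration(seconds):
    """Format seconds as 'Xm YYs', rounded to the nearest second."""
    total = int(round(seconds))
    return f"{total // 60}m {total % 60:02d}s"


# Duration column of the 16 smoke attempts, in table order.
durations = [
    "3m 58s", "4m 35s", "4m 42s", "4m 42s", "4m 49s", "4m 53s",
    "4m 34s", "4m 41s", "3m 46s", "4m 32s", "3m 33s", "4m 45s",
    "3m 24s", "4m 34s", "4m 30s", "1m 01s",
]
seconds = [parse_duration(d) for d in durations]

print("avg smoke runtime:", format_duration(mean(seconds)))
print("median smoke time:", format_duration(median(seconds)))
```

The average includes the 1m 01s failed attempt, which pulls it below the median.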

Smoke Suite Breakdown

Frontend

12 attempts across 12 pipelines, 92% green

Passed: 11
Failed: 1
Incomplete: 1
Avg runtime: 4m 22s
Median passing runtime: 4m 41s
Pipelines: 12

University

4 attempts across 4 pipelines, 100% green

Passed: 4
Failed: 0
Incomplete: 0
Avg runtime: 3m 40s
Median passing runtime: 3m 39s
Pipelines: 4