Engineering Council Test Reliability Report

Scope aligned with Slack channel #dezvoltare, covering 2026-05-23 07:00 to 2026-05-30 07:00. Metrics and timings are sourced from GitLab pipelines, jobs, and test-report artifacts for the daily 6 PM regression suite and the production smoke suite. Trend charts use daily buckets across this window.

Executive Snapshot

Daily Runs: 7
Daily Green: 0/7
Avg Daily Runtime: 20m 40s
Smoke Attempts: 26
Smoke Green: 9/26
Avg Smoke Runtime: 2m 17s
Median Smoke Time: 4m 34s
Current Green Streak: 0

Executive Analysis

Bottom line: release confidence is unstable in both the broad regression path and the deploy smoke path. The immediate job is to separate real product regressions from execution noise, then burn down the concentrated failure clusters.

What Matters

  • Daily regression passed 0 of 7 runs (0.0%), with a current green streak of 0 and a best streak of 0 in this window. The latest daily run (158268) failed, so the week ends red rather than in a clean state. 5 of the 7 failed runs never reached complete daily-suite counts, which points to infrastructure or setup noise mixed into the product signal.
  • Smoke passed 9 of 26 attempts (34.6%) across 14 production pipelines. 3 pipelines recovered on rerun, which is useful for continuity but also a sign that first-pass deploy signal is noisier than it should be. 15 of the 17 failed attempts never reached test execution counts at all.
  • Failure concentration is not random: Frontend has the highest strict failure ratio at 1.09%, while Social has the broadest non-pass footprint at 3.17%.
  • University is the weakest smoke surface in this window at 1/10 green (10.0%).
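  • Daily-suite runtime averaged 20m 40s. Observed daily test volume was 450 in the first run and 520 in the latest, but it swung between 223 and 1275 in between, so run-to-run totals are not directly comparable.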
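
The pass rates and streak figures quoted above can be sketched as follows. This is a minimal illustration, not the report generator's actual code: the function names and the "PASSED"/"FAILED" status strings are assumptions modeled on the status labels shown in this report.

```python
from typing import Iterable, Tuple


def green_streaks(statuses: Iterable[str]) -> Tuple[int, int]:
    """Return (current_streak, best_streak) of consecutive PASSED runs.

    Statuses must be ordered oldest to newest.
    """
    best = current = 0
    for status in statuses:
        if status == "PASSED":
            current += 1
            best = max(best, current)
        else:
            # Any non-green run resets the running streak.
            current = 0
    return current, best


def pass_rate(passed: int, total: int) -> float:
    """Pass rate as a percentage rounded to one decimal place."""
    return round(100.0 * passed / total, 1) if total else 0.0


# Daily suite in this window: 7 runs, all FAILED.
daily_statuses = ["FAILED"] * 7
current_streak, best_streak = green_streaks(daily_statuses)

daily_rate = pass_rate(0, 7)        # daily regression
smoke_rate = pass_rate(9, 26)       # all smoke attempts
university_rate = pass_rate(1, 10)  # University smoke only
```

With the numbers from this window, both streak values come out as 0 and the smoke pass rate as 34.6.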

Engineering Analysis

  • A release gate should fail loudly for product regressions and quietly for infrastructure noise. Rerun recoveries plus incomplete daily or smoke attempts suggest those two failure modes are still partially mixed together.
  • The failure profile is concentrated enough to act on. Frontend and Social are carrying the strongest signal, which means reliability work should be assigned by category ownership instead of treating the suite as one undifferentiated problem.
  • By pass rate, the broader daily suite is carrying more instability than smoke (0 of 7 green versus 9 of 26). That pattern usually means product regressions are escaping into wider coverage areas, although in this window the narrow deploy gate is not clean either.

Recommended Actions

  • Split incomplete execution failures from real assertion failures in the report narrative. Setup breakage should stay visible, but it should not look identical to a product regression in the executive readout.
  • Assign one owner to Frontend for the next cycle and expect a short written burn-down: top failing tests, suspected root causes, flake versus regression breakdown, and what gets fixed or quarantined first.
  • Treat the daily regression suite like an operations queue until it is calm again: triage failures after each red run, close known-noise items fast, and avoid letting multiple unrelated red signals pile up between runs.
  • Put University smoke under closer guardrails for the next release cycle. It is the best place to improve first-pass deploy confidence quickly.
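
One way to make the first recommendation concrete is to classify each attempt before it reaches the narrative. The sketch below assumes an attempt carries a status plus passed and failed counts, with missing counts represented as None (the report shows them as n/a). The bucket names are hypothetical labels, not fields of the existing report.

```python
from collections import Counter
from typing import Optional


def classify_attempt(status: str, passed: Optional[int], failed: Optional[int]) -> str:
    """Separate execution noise from real assertion failures."""
    if status == "PASSED":
        return "green"
    if passed is None or failed is None:
        # The job died before any test counts were reported.
        return "incomplete_execution"
    if failed > 0:
        return "assertion_failure"
    # Red job with counts but zero failed tests: needs manual triage.
    return "failed_without_test_failures"


# Smoke attempts in this window, reduced to (status, passed, failed).
smoke_attempts = (
    [("FAILED", None, None)] * 15  # no test execution counts
    + [("FAILED", 57, 3)] * 2      # University runs with 3 failed tests
    + [("PASSED", 110, 0)] * 8     # Frontend green runs
    + [("PASSED", 60, 0)]          # University green run
)

breakdown = Counter(classify_attempt(*attempt) for attempt in smoke_attempts)
```

Applied to this window's 26 smoke attempts, the breakdown is 15 incomplete executions, 2 assertion failures, and 9 green attempts, which is the split the executive readout would need to show separately.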

Improvement Ideas

  • Introduce a small reliability budget for tests: every flaky or quarantined case needs an owner and an expiry, and the team should review that budget weekly the same way it reviews bugs or incidents.
  • Track first-fail to root-cause time as a core metric. Fast diagnosis is as important as raw pass rate because the practical value of a test gate depends on how quickly it helps the team recover.
  • Define a runtime budget per suite and require justification when test count or duration grows. Reliable feedback systems stay trusted when they remain both stable and proportionate.
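
The reliability budget idea could be tracked with a very small data structure. The following is a sketch under stated assumptions: the QuarantinedTest record, the review_budget helper, the test names, the owners, and the limit of 5 are all hypothetical and exist only to illustrate the owner-plus-expiry rule.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class QuarantinedTest:
    name: str      # hypothetical test identifier
    owner: str     # person or team responsible for the fix
    expires: date  # date after which the quarantine must be revisited


def review_budget(entries: List[QuarantinedTest], today: date, limit: int) -> Dict[str, object]:
    """Weekly review: flag expired or unowned entries and budget overruns."""
    return {
        "over_budget": len(entries) > limit,
        "expired": [e.name for e in entries if e.expires < today],
        "unowned": [e.name for e in entries if not e.owner.strip()],
    }


# Illustrative entries only; none of these come from the report data.
entries = [
    QuarantinedTest("frontend/login_redirect", "frontend-team", date(2026, 6, 5)),
    QuarantinedTest("social/feed_pagination", "", date(2026, 5, 28)),
]
review = review_budget(entries, today=date(2026, 5, 30), limit=5)
```

In this example the second entry would be flagged twice, once for having no owner and once for being past its expiry.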

Category Execution Ratios

How computed

Category total executions means the sum of that category's observed test executions across every daily-suite run in the selected window.

Strict Failure Ratio = failed executions for that category divided by total executions for that category across the window.

Non-pass Ratio = (failed + pending + skipped) executions for that category divided by total executions for that category across the window.

Example: if Billing executed 800 times across the week and 2 of those executions failed, Billing strict failure ratio is 0.25%. That does not mean 0.25% of pipelines failed; it means 0.25% of observed Billing executions ended in failed.
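
The two ratio definitions above can be checked with a short sketch. The function name and the rounding to two decimal places are assumptions chosen to match how the percentages are displayed in this report.

```python
from typing import Tuple


def category_ratios(total: int, failed: int, pending: int = 0, skipped: int = 0) -> Tuple[float, float]:
    """Return (strict_failure_ratio, non_pass_ratio) as percentages."""
    if total == 0:
        # No observed executions: report zero instead of dividing by zero.
        return 0.0, 0.0
    strict = 100.0 * failed / total
    non_pass = 100.0 * (failed + pending + skipped) / total
    return round(strict, 2), round(non_pass, 2)


billing_example = category_ratios(total=800, failed=2)          # worked example above
frontend_window = category_ratios(total=1656, failed=18)        # Frontend in this window
social_window = category_ratios(total=126, failed=0, pending=4) # Social in this window
```

The Social case shows why the two ratios are reported separately: a category can have a strict failure ratio of zero while pending executions still give it a non-zero non-pass ratio.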

Chart: Daily Suite Status (daily buckets, 05-23 to 05-29)
Chart: Daily Smoke Attempts (daily buckets, 05-23 to 05-29; axis 0 to 9)
Chart: Average Daily Suite Runtime (daily buckets, 05-23 to 05-29; axis 9m 24s to 31m 52s)
Chart: Average Smoke Runtime (daily buckets, 05-23 to 05-29; axis 0m 00s to 4m 11s)
Chart: Daily Suite Total Test Growth, Recent 7 Runs (05-23 to 05-29; axis 223 to 1275)
Chart: Smoke Suite Total Test Growth, Latest Run Per Day (series: Frontend, University; axis 0 to 110)
  Frontend: 05-25: 0, 05-26: 110, 05-27: 0, 05-28: 110, 05-29: 110
  University: 05-25: 0, 05-26: 60, 05-27: 0, 05-28: 0, 05-29: 60

Category Aggregate Table

Category        Total  Failed  Pending  Skipped  Failure Ratio  Non-pass Ratio  Runs With Failures
Billing           756       1        0        0          0.13%           0.13%                   1
Web              1592       0        0        0          0.00%           0.00%                   0
Frontend         1656      18        0        0          1.09%           1.09%                   4
Library           602       0        0        0          0.00%           0.00%                   0
University          6       0        0        0          0.00%           0.00%                   5
Subscriptions       4       0        0        0          0.00%           0.00%                   3
Admission           4       0        0        0          0.00%           0.00%                   3
Social            126       0        4        0          0.00%           3.17%                   0

Recent Runs

Recent Daily Suite Runs

Suites in every run: Billing, Web, Frontend, Library, University, Subscriptions, Admission, Social

Date              Pipeline  Status  Summary
2026-05-23 18:24  157201    FAILED  Total 450 | Passed 448 | Failed 3 | Pending 1 | Incomplete suite counts
2026-05-24 18:25  157212    FAILED  Total 1273 | Passed 1272 | Failed 0 | Pending 1 | Incomplete suite counts
2026-05-25 18:35  157402    FAILED  Total 497 | Passed 496 | Failed 0 | Pending 1 | Incomplete suite counts
2026-05-26 18:29  157644    FAILED  Total 1275 | Passed 1270 | Failed 4 | Pending 1 | Incomplete suite counts
2026-05-27 18:27  157886    FAILED  Total 223 | Passed 223 | Failed 0 | Incomplete suite counts
2026-05-28 18:12  158062    FAILED  Total 508 | Passed 502 | Failed 6
2026-05-29 18:12  158268    FAILED  Total 520 | Passed 514 | Failed 6

Recent Smoke Attempts

Date              Suite       Pipeline  Job               Status  Passed  Failed  Duration
2026-05-25 12:53  University  157293    University smoke  FAILED  n/a     n/a     0m 13s
2026-05-25 12:57  Frontend    157293    Frontend smoke    FAILED  n/a     n/a     0m 19s
2026-05-25 17:17  Frontend    157392    Frontend smoke    FAILED  n/a     n/a     0m 14s
2026-05-25 17:23  University  157392    University smoke  FAILED  n/a     n/a     0m 14s
2026-05-26 13:57  Frontend    157552    Frontend smoke    FAILED  n/a     n/a     0m 14s
2026-05-26 14:10  University  157558    University smoke  FAILED  n/a     n/a     0m 22s
2026-05-26 14:12  Frontend    157558    Frontend smoke    FAILED  n/a     n/a     0m 14s
2026-05-26 18:32  University  157637    University smoke  FAILED  57      3       3m 41s
2026-05-26 18:36  Frontend    157637    Frontend smoke    PASSED  110     0       3m 05s
2026-05-27 12:16  University  157763    University smoke  FAILED  57      3       3m 35s
2026-05-27 12:19  Frontend    157763    Frontend smoke    PASSED  110     0       3m 11s
2026-05-27 15:31  University  157848    University smoke  FAILED  n/a     n/a     1m 33s
2026-05-27 15:34  Frontend    157848    Frontend smoke    FAILED  n/a     n/a     1m 30s
2026-05-27 16:21  Frontend    157848    Frontend smoke    FAILED  n/a     n/a     1m 27s
2026-05-27 16:21  University  157848    University smoke  FAILED  n/a     n/a     1m 33s
2026-05-27 16:37  Frontend    157848    Frontend smoke    FAILED  n/a     n/a     1m 24s
2026-05-27 16:37  University  157848    University smoke  FAILED  n/a     n/a     1m 24s
2026-05-27 19:16  Frontend    157887    Frontend smoke    FAILED  n/a     n/a     1m 39s
2026-05-28 15:15  Frontend    158021    Frontend smoke    PASSED  110     0       5m 23s
2026-05-28 15:49  Frontend    158040    Frontend smoke    PASSED  110     0       5m 12s
2026-05-28 21:22  University  158085    University smoke  FAILED  n/a     n/a     0m 08s
2026-05-28 21:28  Frontend    158085    Frontend smoke    PASSED  110     0       4m 36s
2026-05-28 23:31  Frontend    158101    Frontend smoke    PASSED  110     0       5m 31s
2026-05-29 15:12  Frontend    158231    Frontend smoke    PASSED  110     0       4m 33s
2026-05-29 16:41  Frontend    158247    Frontend smoke    PASSED  110     0       4m 34s
2026-05-29 17:53  University  158247    University smoke  PASSED  60      0       3m 26s
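
The per-suite runtime figures in this report appear to be recomputable from the attempt rows above: the average over all attempts and the median over passing attempts only. The sketch below does this for the Frontend suite. The helper names are hypothetical, and rounding to the nearest second is an assumption about how the report formats durations.

```python
from statistics import mean, median


def to_seconds(text: str) -> int:
    """Parse a duration such as '3m 05s' into seconds."""
    minutes, seconds = text.split()
    return int(minutes.rstrip("m")) * 60 + int(seconds.rstrip("s"))


def fmt(seconds: float) -> str:
    """Format seconds as 'Xm YYs', rounding half up to the nearest second."""
    total = int(seconds + 0.5)
    return f"{total // 60}m {total % 60:02d}s"


# Frontend smoke attempts from the table above, as (status, duration).
frontend = [
    ("FAILED", "0m 19s"), ("FAILED", "0m 14s"), ("FAILED", "0m 14s"),
    ("FAILED", "0m 14s"), ("PASSED", "3m 05s"), ("PASSED", "3m 11s"),
    ("FAILED", "1m 30s"), ("FAILED", "1m 27s"), ("FAILED", "1m 24s"),
    ("FAILED", "1m 39s"), ("PASSED", "5m 23s"), ("PASSED", "5m 12s"),
    ("PASSED", "4m 36s"), ("PASSED", "5m 31s"), ("PASSED", "4m 33s"),
    ("PASSED", "4m 34s"),
]

all_durations = [to_seconds(d) for _, d in frontend]
passing_durations = [to_seconds(d) for s, d in frontend if s == "PASSED"]

avg_runtime = fmt(mean(all_durations))                  # average over every attempt
median_passing_runtime = fmt(median(passing_durations)) # median over green attempts only
```

Because short-lived incomplete attempts pull the average down, the average runtime sits well below the median passing runtime. That gap is itself a rough indicator of how many attempts ended before tests ran.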

Smoke Suite Breakdown

Frontend
16 attempts across 14 pipelines
50% green
Passed: 8
Failed: 8
Incomplete: 8
Avg runtime: 2m 42s
Median passing runtime: 4m 35s
Pipelines: 14

University
10 attempts across 8 pipelines
10% green
Passed: 1
Failed: 9
Incomplete: 7
Avg runtime: 1m 37s
Median passing runtime: 3m 26s
Pipelines: 8

Generated from GitLab project adservio/helm2. Times are shown in Europe/Bucharest. Daily-suite runtime is measured from GitLab pipeline and job timestamps. Category counts come from GitLab test-report JSON artifacts, with job-trace fallback when older artifacts have expired.