Engineering Council Test Reliability Report

Scope aligned with Slack channel #dezvoltare, covering 2026-05-16 07:00 to 2026-05-23 07:00. Metrics and timings are sourced from GitLab pipelines, jobs, and test-report artifacts for the daily 6 PM regression suite and the production smoke suite. Trend charts use daily buckets across this window.

Executive Snapshot

Daily Runs: 7
Daily Green: 0/7
Avg Daily Runtime: 21m 42s
Smoke Attempts: 16
Smoke Green: 14/16
Avg Smoke Runtime: 3m 14s
Median Smoke Time: 3m 09s
Current Green Streak: 0

Executive Analysis

Bottom line: the regression system is informative but not calm. The data suggest repeatable problem areas rather than random breakage, which means focused ownership should move the needle quickly.

What Matters

  • Daily regression passed 0 of 7 runs (0.0%), with a current green streak of 0 and a best streak of 0 in this window. The latest daily run (157199) failed, so the system is ending the week under tension rather than in a clean state. None of the 7 failed runs reached complete daily-suite counts, which points to infrastructure or setup noise mixed into the product signal.
  • Smoke passed 14 of 16 attempts (87.5%) across 9 production pipelines. 2 pipelines (156608 and 157198) recovered on rerun, which is useful for continuity but also a sign that first-pass deploy signal is noisier than it should be.
  • Failure concentration is not random: Frontend has the highest strict failure ratio at 4.75%, while Social has the broadest non-pass footprint at 16.67%.
  • University is the weakest smoke surface in this window at 4/5 green (80.0%).
  • Daily-suite runtime averaged 21m 42s, while observed daily test volume moved from 1,273 to 450.
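
The pass-rate and streak figures in the first bullet can be reproduced with a few lines. The sketch below is illustrative only: the function name `green_streaks` and the status strings are assumptions, not the report generator's actual code.

```python
from typing import Iterable, Tuple


def green_streaks(statuses: Iterable[str]) -> Tuple[int, int]:
    """Return (current_streak, best_streak) of consecutive PASSED runs.

    `statuses` must be ordered oldest to newest.
    """
    current = best = 0
    for status in statuses:
        if status == "PASSED":
            current += 1
            best = max(best, current)
        else:
            # Any non-green run resets the running streak.
            current = 0
    return current, best


# Daily-suite statuses for this window, oldest first (all seven runs failed).
daily = ["FAILED"] * 7
passed = sum(status == "PASSED" for status in daily)
pass_rate = 100.0 * passed / len(daily)
current, best = green_streaks(daily)
print(f"{passed}/{len(daily)} green ({pass_rate:.1f}%), current {current}, best {best}")
# prints: 0/7 green (0.0%), current 0, best 0
```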

Engineering Analysis

  • A release gate should fail loudly for product regressions and quietly for infrastructure noise. Rerun recoveries plus incomplete daily or smoke attempts suggest those two failure modes are still partially mixed together.
  • The failure profile is concentrated enough to act on. Frontend and Social are carrying the strongest signal, which means reliability work should be assigned by category ownership instead of treating the suite as one undifferentiated problem.
  • The broader daily suite is carrying more instability than smoke, which usually means product regressions are escaping into wider coverage areas even when the narrow deploy gate looks acceptable.

Recommended Actions

  • Split incomplete execution failures from real assertion failures in the report narrative. Setup breakage should stay visible, but it should not look identical to a product regression in the executive readout.
  • Assign one owner to Frontend for the next cycle and expect a short written burn-down: top failing tests, suspected root causes, flake versus regression breakdown, and what gets fixed or quarantined first.
  • Treat the daily regression suite like an operations queue until it is calm again: triage failures after each red run, close known-noise items fast, and avoid letting multiple unrelated red signals pile up between runs.
  • Put University smoke under closer guardrails for the next release cycle. It is the best place to improve first-pass deploy confidence quickly.
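
One way to approach the first action is to tag each red run by failure mode before it reaches the narrative. The sketch below is a minimal illustration under stated assumptions: `DailyRun`, `classify`, and the four labels are hypothetical, and the only inputs are the failed count and the "Incomplete suite counts" flag shown under Recent Daily Suite Runs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyRun:
    pipeline: int
    failed: int        # failed test executions reported for the run
    incomplete: bool   # True when the run is flagged "Incomplete suite counts"


def classify(run: DailyRun) -> str:
    """Label a run by failure mode (hypothetical labels, not the report's)."""
    if run.failed > 0 and run.incomplete:
        return "mixed"        # assertion failures and incomplete execution together
    if run.failed > 0:
        return "assertion"    # complete run with real test failures
    if run.incomplete:
        return "incomplete"   # setup or infrastructure breakage only
    return "green"


# Failed counts and flags copied from the seven daily runs in this window.
runs = [
    DailyRun(156178, 4, True),
    DailyRun(156183, 3, True),
    DailyRun(156473, 5, True),
    DailyRun(156720, 5, True),
    DailyRun(156907, 3, True),
    DailyRun(157062, 2, True),
    DailyRun(157199, 106, True),
]

breakdown = Counter(classify(run) for run in runs)
print(dict(breakdown))  # prints: {'mixed': 7}
```

Under this labeling every run in the window lands in the mixed bucket, which is the overlap the first action asks to separate.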

Improvement Ideas

  • Introduce a small reliability budget for tests: every flaky or quarantined case needs an owner and an expiry, and the team should review that budget weekly the same way it reviews bugs or incidents.
  • Track first-fail to root-cause time as a core metric. Fast diagnosis is as important as raw pass rate because the practical value of a test gate depends on how quickly it helps the team recover.
  • Define a runtime budget per suite and require justification when test count or duration grows. Reliable feedback systems stay trusted when they remain both stable and proportionate.

Category Execution Ratios

How computed

Category total executions means the sum of that category's observed test executions across every daily-suite run in the selected window.

Strict Failure Ratio = failed executions for that category divided by total executions for that category across the window.

Non-pass Ratio = (failed + pending + skipped) executions for that category divided by total executions for that category across the window.

Example: if Billing executed 800 times across the week and 2 of those executions failed, the Billing strict failure ratio is 2 / 800 = 0.25%. That does not mean 0.25% of pipelines failed; it means 0.25% of observed Billing executions ended in a failed state.
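
The two ratio definitions can be written as a short sketch. The class and method names are assumptions introduced for illustration, and returning 0.0 for a category with zero executions is also an assumption, chosen because the aggregate table shows 0.00% for categories whose total is 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryCounts:
    total: int    # observed executions across every daily-suite run in the window
    failed: int
    pending: int
    skipped: int

    def strict_failure_ratio(self) -> float:
        """Failed executions as a percentage of total executions."""
        if self.total == 0:
            return 0.0  # assumption: empty categories report 0.00%
        return 100.0 * self.failed / self.total

    def non_pass_ratio(self) -> float:
        """Failed, pending, and skipped executions as a percentage of total."""
        if self.total == 0:
            return 0.0  # assumption: empty categories report 0.00%
        return 100.0 * (self.failed + self.pending + self.skipped) / self.total


# The worked Billing example from the text: 2 failures in 800 executions.
example = CategoryCounts(total=800, failed=2, pending=0, skipped=0)
print(f"{example.strict_failure_ratio():.2f}%")  # prints: 0.25%

# Frontend counts as reported in the category aggregate table.
frontend = CategoryCounts(total=1873, failed=89, pending=0, skipped=169)
print(f"{frontend.strict_failure_ratio():.2f}%")  # prints: 4.75%
print(f"{frontend.non_pass_ratio():.2f}%")        # prints: 13.77%
```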

Trend charts (daily buckets, x-axis 05-16 to 05-22 unless noted):

  • Daily Suite Status (y-axis 0 to 1)
  • Smoke Attempts (y-axis 0 to 5)
  • Average Daily Suite Runtime (y-axis 8m 20s to 25m 17s)
  • Average Smoke Runtime (y-axis 0m 00s to 4m 31s)
  • Daily Suite Total Test Growth, Recent 7 Runs (y-axis 450 to 1,273)
  • Smoke Suite Total Test Growth, Latest Run Per Day (x-axis 05-18 to 05-22, y-axis 60 to 110): Frontend stayed at 110 tests from 05-18 through 05-22; University stayed at 60 tests on 05-21 and 05-22.

Category Aggregate Table

Category | Total | Failed | Pending | Skipped | Failure Ratio | Non-pass Ratio | Runs With Failures
Billing | 756 | 10 | 0 | 98 | 1.32% | 14.29% | 1
Web | 2356 | 1 | 0 | 3 | 0.04% | 0.17% | 1
Frontend | 1873 | 89 | 0 | 169 | 4.75% | 13.77% | 7
Library | 602 | 23 | 0 | 63 | 3.82% | 14.29% | 1
University | 0 | 0 | 0 | 0 | 0.00% | 0.00% | 7
Subscriptions | 0 | 0 | 0 | 0 | 0.00% | 0.00% | 7
Admission | 0 | 0 | 0 | 0 | 0.00% | 0.00% | 7
Social | 126 | 5 | 6 | 10 | 3.97% | 16.67% | 1

Recent Runs

Recent Daily Suite Runs

Suites in every run: Billing, Web, Frontend, Library, University, Subscriptions, Admission, Social.

Date | Pipeline | Status | Total | Passed | Failed | Pending | Note
2026-05-16 18:27 | 156178 | FAILED | 1273 | 1268 | 4 | 1 | Incomplete suite counts
2026-05-17 18:27 | 156183 | FAILED | 1273 | 1269 | 3 | 1 | Incomplete suite counts
2026-05-18 18:26 | 156473 | FAILED | 497 | 491 | 5 | 1 | Incomplete suite counts
2026-05-19 18:28 | 156720 | FAILED | 1273 | 1267 | 5 | 1 | Incomplete suite counts
2026-05-20 18:28 | 156907 | FAILED | 497 | 493 | 3 | 1 | Incomplete suite counts
2026-05-21 18:24 | 157062 | FAILED | 450 | 449 | 2 | 1 | Incomplete suite counts
2026-05-22 18:11 | 157199 | FAILED | 450 | 3 | 106 | 0 | Incomplete suite counts

Recent Smoke Attempts

Date | Suite | Pipeline | Job | Status | Passed | Failed | Duration
2026-05-18 15:02 | Frontend | 156306 | Frontend smoke | PASSED | 110 | 0 | 3m 08s
2026-05-18 17:08 | Frontend | 156455 | Frontend smoke | PASSED | 110 | 0 | 3m 12s
2026-05-18 22:21 | Frontend | 156479 | Frontend smoke | PASSED | 110 | 0 | 3m 09s
2026-05-18 23:10 | Frontend | 156481 | Frontend smoke | PASSED | 110 | 0 | 3m 20s
2026-05-19 12:54 | Frontend | 156608 | Frontend smoke | FAILED | 80 | 1 | 5m 01s
2026-05-19 12:59 | Frontend | 156608 | Frontend smoke | PASSED | 110 | 0 | 4m 02s
2026-05-20 23:05 | Frontend | 156608 | Frontend smoke | PASSED | 110 | 0 | 3m 31s
2026-05-21 14:43 | University | 157003 | University smoke | PASSED | 60 | 0 | 2m 13s
2026-05-21 14:50 | Frontend | 157003 | Frontend smoke | PASSED | 110 | 0 | 3m 49s
2026-05-21 16:13 | University | 157030 | University smoke | PASSED | 60 | 0 | 2m 13s
2026-05-21 16:19 | Frontend | 157030 | Frontend smoke | PASSED | 110 | 0 | 3m 06s
2026-05-22 15:37 | University | 157139 | University smoke | PASSED | 60 | 0 | 2m 10s
2026-05-22 15:41 | Frontend | 157139 | Frontend smoke | PASSED | 110 | 0 | 3m 08s
2026-05-22 16:56 | University | 157198 | University smoke | FAILED | 59 | 1 | 3m 22s
2026-05-22 16:56 | Frontend | 157198 | Frontend smoke | PASSED | 110 | 0 | 3m 53s
2026-05-22 16:59 | University | 157198 | University smoke | PASSED | 60 | 0 | 2m 25s
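
The headline smoke figures (14 of 16 green, 9 pipelines, 2 rerun recoveries) follow directly from the attempts listed above. The sketch below shows one possible derivation; the helper names are hypothetical, and "recovered on rerun" is assumed to mean that a suite failed in a pipeline and a later attempt of the same suite in that pipeline passed.

```python
# (pipeline, suite, status) for each smoke attempt, in chronological order.
attempts = [
    (156306, "Frontend", "PASSED"), (156455, "Frontend", "PASSED"),
    (156479, "Frontend", "PASSED"), (156481, "Frontend", "PASSED"),
    (156608, "Frontend", "FAILED"), (156608, "Frontend", "PASSED"),
    (156608, "Frontend", "PASSED"), (157003, "University", "PASSED"),
    (157003, "Frontend", "PASSED"), (157030, "University", "PASSED"),
    (157030, "Frontend", "PASSED"), (157139, "University", "PASSED"),
    (157139, "Frontend", "PASSED"), (157198, "University", "FAILED"),
    (157198, "Frontend", "PASSED"), (157198, "University", "PASSED"),
]


def recovered_on_rerun(rows):
    """Pipelines where a suite failed and a later attempt of that suite passed."""
    failed_seen = set()
    recovered = set()
    for pipeline, suite, status in rows:
        key = (pipeline, suite)
        if status == "FAILED":
            failed_seen.add(key)
        elif status == "PASSED" and key in failed_seen:
            recovered.add(pipeline)
    return recovered


def first_pass_counts(rows):
    """Return (green_first_attempts, total_first_attempts) per pipeline and suite."""
    first = {}
    for pipeline, suite, status in rows:
        first.setdefault((pipeline, suite), status)  # keep only the first attempt
    green_first = sum(status == "PASSED" for status in first.values())
    return green_first, len(first)


green = sum(status == "PASSED" for _, _, status in attempts)
pipelines = {pipeline for pipeline, _, _ in attempts}
print(f"{green}/{len(attempts)} green across {len(pipelines)} pipelines")
# prints: 14/16 green across 9 pipelines
print(sorted(recovered_on_rerun(attempts)))  # prints: [156608, 157198]
```

`first_pass_counts` is an illustrative extra rather than a metric from this report: it counts only the first attempt of each suite in each pipeline, which is the signal that rerun recoveries tend to hide.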

Smoke Suite Breakdown

Frontend: 11 attempts across 9 pipelines, 91% green
  • Passed: 10
  • Failed: 1
  • Incomplete: 0
  • Avg runtime: 3m 34s
  • Median passing runtime: 3m 16s
  • Pipelines: 9

University: 5 attempts across 4 pipelines, 80% green
  • Passed: 4
  • Failed: 1
  • Incomplete: 0
  • Avg runtime: 2m 28s
  • Median passing runtime: 2m 13s
  • Pipelines: 4
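
The runtime figures for a suite can be rechecked from the displayed attempt durations. The sketch below does this for University; the helper names are assumptions, and truncating to whole seconds is also an assumption. Because the displayed durations are already whole seconds, a recomputed value may differ from the report by about a second.

```python
import re
from statistics import mean, median


def parse_duration(text: str) -> int:
    """Convert a duration such as '2m 13s' into seconds."""
    match = re.fullmatch(r"(\d+)m (\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def format_duration(seconds: float) -> str:
    """Format seconds as 'Xm YYs', truncating any fractional part (assumption)."""
    whole = int(seconds)
    return f"{whole // 60}m {whole % 60:02d}s"


# University smoke attempts in this window: (status, displayed duration).
university = [
    ("PASSED", "2m 13s"),
    ("PASSED", "2m 13s"),
    ("PASSED", "2m 10s"),
    ("FAILED", "3m 22s"),
    ("PASSED", "2m 25s"),
]

all_seconds = [parse_duration(duration) for _, duration in university]
passing_seconds = [
    parse_duration(duration) for status, duration in university if status == "PASSED"
]

print(format_duration(mean(all_seconds)))        # prints: 2m 28s
print(format_duration(median(passing_seconds)))  # prints: 2m 13s
```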
Generated from GitLab project adservio/helm2. Times are shown in Europe/Bucharest. Daily-suite runtime is measured from GitLab pipeline and job timestamps. Category counts come from GitLab test-report JSON artifacts, with job-trace fallback when older artifacts have expired.