Engineering Council Test Reliability Report

Scope aligned with Slack channel #dezvoltare, covering 2026-06-13 07:00 to 2026-06-20 07:00. Metrics and timings are sourced from GitLab pipelines, jobs, and test-report artifacts for the daily 6 PM regression suite and the production smoke suite. Trend charts use daily buckets across this window.

Executive Snapshot

Daily Runs: 7
Daily Green: 0/7
Avg Daily Runtime: 10m 11s
Smoke Attempts: 25
Smoke Green: 19/25
Avg Smoke Runtime: 5m 08s
Median Smoke Time: 4m 40s
Current Green Streak: 0

Executive Analysis

Bottom line: release confidence is unstable in both the broad regression path and the deploy smoke path. The immediate job is to separate real product regressions from execution noise, then burn down the concentrated failure clusters.

What Matters

  • Daily regression passed 0 of 7 runs (0.0%), with a current green streak of 0 and a best streak of 0 in this window. The latest daily run (161005) failed, so the week ends red rather than in a clean state.
  • Smoke passed 19 of 25 attempts (76.0%) across 22 production pipelines. One pipeline recovered on rerun, which is useful for continuity but also a sign that first-pass deploy signal is noisier than it should be.
  • Failure concentration is not random: Social has both the highest strict failure ratio and the highest non-pass ratio, at 8.73% each, and all 11 of its failed executions came from a single run.
  • Frontend is the weakest smoke surface in this window at 16/22 green (72.7%).
  • Daily-suite runtime averaged 10m 11s, while observed daily test volume moved from 1,281 to 1,304.
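
The pass rates and streak quoted above follow from simple counts. The sketch below shows one way to derive them; the function names are illustrative and are not taken from the report generator, and it assumes statuses are listed in chronological order.

```python
from typing import Sequence


def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage rounded to one decimal place."""
    if total == 0:
        return 0.0
    return round(100.0 * passed / total, 1)


def current_green_streak(statuses: Sequence[str]) -> int:
    """Count consecutive PASSED runs at the end of a chronological status list."""
    streak = 0
    for status in reversed(statuses):
        if status != "PASSED":
            break
        streak += 1
    return streak


# Counts taken from this report: seven daily runs, all FAILED.
daily_statuses = ["FAILED"] * 7

print(pass_rate(0, 7))                       # daily regression -> 0.0
print(pass_rate(19, 25))                     # smoke attempts   -> 76.0
print(pass_rate(16, 22))                     # Frontend smoke   -> 72.7
print(current_green_streak(daily_statuses))  # -> 0
```

A streak counted this way resets to zero on any non-passing run, which is why a single red run at the end of the window is enough to report a current streak of 0.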

Engineering Analysis

  • A release gate should fail loudly for product regressions and quietly for infrastructure noise. Rerun recoveries plus incomplete daily or smoke attempts suggest those two failure modes are still partially mixed together.
  • The failure profile is concentrated enough to act on. Social carries the strongest signal on both the strict failure ratio and the non-pass ratio, which means reliability work should be assigned by category ownership instead of treating the suite as one undifferentiated problem.
  • The broader daily suite is carrying more instability than smoke, which usually means product regressions are escaping into wider coverage areas even when the narrow deploy gate looks acceptable.
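
One way to start separating the two failure modes is to label each pipeline by how its attempts played out. The following is a minimal sketch, assuming attempts are available as chronological (pipeline, suite, status) tuples; the labels and the sample pipeline IDs are hypothetical and do not come from this report.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (pipeline_id, suite, status), supplied in chronological order.
Attempt = Tuple[int, str, str]


def classify_first_pass(attempts: Iterable[Attempt]) -> Dict[Tuple[int, str], str]:
    """Label each (pipeline, suite) pair by first-attempt and rerun outcome."""
    grouped: Dict[Tuple[int, str], List[str]] = defaultdict(list)
    for pipeline_id, suite, status in attempts:
        grouped[(pipeline_id, suite)].append(status)

    labels: Dict[Tuple[int, str], str] = {}
    for key, statuses in grouped.items():
        if statuses[0] == "PASSED":
            labels[key] = "first-pass green"
        elif "PASSED" in statuses[1:]:
            # Failed first, passed later without a code change: likely noise.
            labels[key] = "recovered on rerun"
        else:
            labels[key] = "failed"
    return labels


# Hypothetical sample data, for illustration only.
sample = [
    (1, "Frontend", "PASSED"),
    (2, "Frontend", "FAILED"),
    (2, "Frontend", "PASSED"),
    (3, "Frontend", "FAILED"),
]
print(classify_first_pass(sample))
```

Pipelines labelled "recovered on rerun" are candidates for the infrastructure-noise bucket, while those labelled "failed" deserve a product-regression triage first. This is a heuristic, not a proof of root cause.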

Recommended Actions

  • Assign one owner to Social for the next cycle and expect a short written burn-down: top failing tests, suspected root causes, flake versus regression breakdown, and what gets fixed or quarantined first.
  • Treat the daily regression suite like an operations queue until it is calm again: triage failures after each red run, close known-noise items fast, and avoid letting multiple unrelated red signals pile up between runs.
  • Put Frontend smoke under closer guardrails for the next release cycle. It is the best place to improve first-pass deploy confidence quickly.

Improvement Ideas

  • Introduce a small reliability budget for tests: every flaky or quarantined case needs an owner and an expiry, and the team should review that budget weekly the same way it reviews bugs or incidents.
  • Track first-fail to root-cause time as a core metric. Fast diagnosis is as important as raw pass rate because the practical value of a test gate depends on how quickly it helps the team recover.
  • Define a runtime budget per suite and require justification when test count or duration grows. Reliable feedback systems stay trusted when they remain both stable and proportionate.
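
The reliability budget idea can be made concrete with a small record per quarantined test. The sketch below assumes a hypothetical `QuarantineEntry` structure with an owner and an expiry date; the test identifiers, owner names, and dates are placeholders, not real entries from this project.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class QuarantineEntry:
    test_id: str   # hypothetical identifier format
    owner: str     # person or team accountable for the fix
    expires: date  # date by which the test must be fixed or removed


def overdue(entries: List[QuarantineEntry], today: date) -> List[QuarantineEntry]:
    """Return entries whose expiry date has passed, oldest first."""
    return sorted((e for e in entries if e.expires < today), key=lambda e: e.expires)


# Placeholder entries, for illustration only.
budget = [
    QuarantineEntry("example/test_a", "team-x", date(2026, 6, 10)),
    QuarantineEntry("example/test_b", "team-y", date(2026, 7, 1)),
    QuarantineEntry("example/test_c", "team-x", date(2026, 6, 1)),
]

for entry in overdue(budget, today=date(2026, 6, 20)):
    print(entry.test_id, entry.owner, entry.expires.isoformat())
```

Reviewing the overdue list weekly keeps quarantine from becoming a permanent parking lot: every entry either gets fixed, gets a renewed expiry with a reason, or gets deleted.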

Category Execution Ratios

How computed

Category total executions means the sum of that category's observed test executions across every daily-suite run in the selected window.

Strict Failure Ratio = failed executions for that category divided by total executions for that category across the window.

Non-pass Ratio = (failed + pending + skipped) executions for that category divided by total executions for that category across the window.

Example: if Billing executed 800 times across the week and 2 of those executions failed, Billing strict failure ratio is 0.25%. That does not mean 0.25% of pipelines failed; it means 0.25% of observed Billing executions ended in failed.
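
As a worked check of the two definitions, the sketch below applies them to the Billing example above and to the Frontend counts from the aggregate table (2143 executions, 37 failed, 0 pending, 18 skipped). The function names are illustrative, and rounding to two decimal places is an assumption chosen to match the report's display.

```python
def strict_failure_ratio(failed: int, total: int) -> float:
    """Failed executions as a percentage of total executions."""
    if total == 0:
        return 0.0
    return round(100.0 * failed / total, 2)


def non_pass_ratio(failed: int, pending: int, skipped: int, total: int) -> float:
    """Failed, pending, and skipped executions as a percentage of total."""
    if total == 0:
        return 0.0
    return round(100.0 * (failed + pending + skipped) / total, 2)


# Billing example from the text: 2 failures in 800 executions.
print(strict_failure_ratio(2, 800))        # -> 0.25

# Frontend counts from the aggregate table.
print(strict_failure_ratio(37, 2143))      # -> 1.73
print(non_pass_ratio(37, 0, 18, 2143))     # -> 2.57
```

The two ratios differ only when a category has pending or skipped executions, which in this window applies to Frontend alone.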

Chart: Daily Suite Status (daily buckets, 06-13 to 06-19)
Chart: Smoke Attempts (daily buckets, 06-13 to 06-19; y-axis 0 to 10)
Chart: Average Daily Suite Runtime (daily buckets, 06-13 to 06-19; y-axis 9m 24s to 11m 21s)
Chart: Average Smoke Runtime (daily buckets, 06-13 to 06-19; y-axis 0m 00s to 7m 24s)
Chart: Daily Suite Total Test Growth, Recent 7 Runs (06-13 to 06-19; y-axis 1281 to 1304)
Chart: Smoke Suite Total Test Growth, Latest Run Per Day (06-15 to 06-19; series Frontend and University). Frontend: 110 tests on each day from 06-15 to 06-19. University: 60 tests on 06-15 and 06-16.

Category Aggregate Table

Category       Total  Failed  Pending  Skipped  Failure Ratio  Non-pass Ratio  Runs With Failures
Billing          756       3        0        0          0.40%           0.40%                   3
Web             5397      14        0        0          0.26%           0.26%                   5
Frontend        2143      37        0       18          1.73%           2.57%                   6
Library          602       0        0        0          0.00%           0.00%                   0
University        21       0        0        0          0.00%           0.00%                   0
Subscriptions      7       0        0        0          0.00%           0.00%                   0
Admission          7       0        0        0          0.00%           0.00%                   0
Social           126      11        0        0          8.73%           8.73%                   1

Recent Runs

Recent Daily Suite Runs

Date              Pipeline  Status  Summary
2026-06-13 18:12  160133    FAILED  Total 1281 | Passed 1279 | Failed 2
2026-06-14 18:12  160138    FAILED  Total 1281 | Passed 1280 | Failed 1
2026-06-15 18:12  160308    FAILED  Total 1281 | Passed 1276 | Failed 5
2026-06-16 18:13  160489    FAILED  Total 1304 | Passed 1285 | Failed 12
2026-06-17 18:13  160727    FAILED  Total 1304 | Passed 1288 | Failed 11
2026-06-18 18:14  160902    FAILED  Total 1304 | Passed 1295 | Failed 9
2026-06-19 18:14  161005    FAILED  Total 1304 | Passed 1273 | Failed 25

Suites in every run: Billing, Web, Frontend, Library, University, Subscriptions, Admission, Social.

Recent Smoke Attempts

Date              Suite       Pipeline  Job               Status  Passed  Failed  Duration
2026-06-15 13:04  Frontend    160207    Frontend smoke    FAILED      85      25  13m 19s
2026-06-15 17:38  University  160307    University smoke  PASSED      60       0  3m 57s
2026-06-15 17:42  Frontend    160307    Frontend smoke    PASSED     110       0  4m 57s
2026-06-16 10:59  Frontend    160350    Frontend smoke    PASSED     110       0  4m 34s
2026-06-16 13:00  Frontend    160382    Frontend smoke    FAILED      30       3  5m 10s
2026-06-16 14:25  University  160415    University smoke  PASSED      60       0  3m 12s
2026-06-16 14:28  Frontend    160415    Frontend smoke    FAILED     108       2  5m 15s
2026-06-16 14:41  Frontend    160418    Frontend smoke    FAILED     108       2  5m 38s
2026-06-16 15:15  Frontend    160437    Frontend smoke    FAILED     108       2  5m 29s
2026-06-16 16:40  Frontend    160472    Frontend smoke    PASSED     110       0  4m 34s
2026-06-16 17:37  Frontend    160487    Frontend smoke    PASSED     110       0  4m 30s
2026-06-16 17:41  University  160487    University smoke  PASSED      60       0  3m 30s
2026-06-16 18:43  Frontend    160491    Frontend smoke    PASSED     110       0  4m 54s
2026-06-17 14:53  Frontend    160654    Frontend smoke    PASSED     110       0  4m 47s
2026-06-17 16:30  Frontend    160702    Frontend smoke    PASSED     110       0  4m 36s
2026-06-17 23:51  Frontend    160732    Frontend smoke    PASSED     110       0  4m 46s
2026-06-18 16:14  Frontend    160868    Frontend smoke    FAILED     101       9  7m 44s
2026-06-18 16:27  Frontend    160875    Frontend smoke    PASSED     110       0  4m 32s
2026-06-18 17:11  Frontend    160882    Frontend smoke    PASSED     110       0  4m 40s
2026-06-18 18:19  Frontend    160900    Frontend smoke    PASSED     110       0  4m 55s
2026-06-18 18:31  Frontend    160904    Frontend smoke    PASSED     110       0  4m 40s
2026-06-18 20:46  Frontend    160908    Frontend smoke    PASSED     110       0  4m 35s
2026-06-19 09:40  Frontend    160911    Frontend smoke    PASSED     110       0  4m 42s
2026-06-19 16:51  Frontend    161003    Frontend smoke    PASSED     110       0  4m 41s
2026-06-19 21:05  Frontend    161014    Frontend smoke    PASSED     110       0  4m 45s

Smoke Suite Breakdown

Frontend
22 attempts across 22 pipelines, 73% green
Passed: 16
Failed: 6
Incomplete: 0
Avg runtime: 5m 21s
Median passing runtime: 4m 41s

University
3 attempts across 3 pipelines, 100% green
Passed: 3
Failed: 0
Incomplete: 0
Avg runtime: 3m 33s
Median passing runtime: 3m 30s
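
The runtime figures in this breakdown can be reproduced from the Duration column of the smoke attempts table. The sketch below does so for the three University attempts (3m 57s, 3m 12s, 3m 30s, all passed); the helper names are illustrative, and rounding half up to whole seconds is an assumption about how the report formats its values.

```python
import re
from statistics import mean, median
from typing import List


def parse_duration(text: str) -> int:
    """Convert a string such as '3m 57s' into seconds."""
    match = re.fullmatch(r"(\d+)m (\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


def format_duration(seconds: float) -> str:
    """Format seconds as 'Xm YYs', rounding half up to whole seconds."""
    total = int(seconds + 0.5)
    return f"{total // 60}m {total % 60:02d}s"


# University smoke durations from the attempts table; all three passed.
university: List[int] = [parse_duration(d) for d in ("3m 57s", "3m 12s", "3m 30s")]

print(format_duration(mean(university)))    # -> 3m 33s
print(format_duration(median(university)))  # -> 3m 30s
```

Because every University attempt passed, the median over passing attempts equals the median over all attempts. For Frontend the two differ, since failed attempts are excluded from the median passing runtime but included in the average.
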
Generated from GitLab project adservio/helm2. Times are shown in Europe/Bucharest. Daily-suite runtime is measured from GitLab pipeline and job timestamps. Category counts come from GitLab test-report JSON artifacts, with job-trace fallback when older artifacts have expired.