Pingdom customer impact
External signal
1 active item(s) in this window.
Generated 2026-06-08 02:19 for 2026-05-30 07:00 to 2026-06-08 02:18 from Pingdom checks, Slack #_alerts_prod, and AWS SNS alerts.
Some enrichments were unavailable in this run; drilldowns below stay focused on captured evidence.
Bottom line: Pingdom observed recent customer-facing glitches (not confirmed by email), and application-level critical alerts are present.
1 active item(s) in this window.
4 critical, 11 non-critical active item(s).
3 active item(s) in this window.
| Pingdom Check | Status | Events | Downtime | Last Seen | Likely Services | Correlated Evidence |
|---|---|---|---|---|---|---|
| Adservio Ro | Recovered recently | 5 | 23m | 2026-06-05 18:34 | unclassified | Pingdom-only evidence so far |
| https://www.adservio.ro/api/v2/status | No recent customer-visible issue | 0 | 0m | | unclassified | Pingdom-only evidence so far |
Pingdom rows show externally visible signal first. The correlated evidence column helps tie the failing check back to services, Slack alert families, or AWS alarms when those links exist.
This view attributes alerts to the workload or resource named in the alert text. Grafana, Loki, and Tempo are treated as observability components and are excluded when a more specific impacted target is also present.
| Impacted Service / Resource | Highest Severity | Count | Last Seen | Status | Top Alert Types | Discussion Signal | Latest Thread Note |
|---|---|---|---|---|---|---|---|
| uni-api | Critical | 2 | 2026-06-05 09:40 | Recent (72h) | TraefikServiceHighErrorRate (1), TraefikServiceHighLatency (1) | General investigation | @U09JYAWCGLB \| uni-api on TUIASI is returning HTTP 500 from StudentServiceImpl.getCatalogCloseStatus and updateCatalogCloseStatus because the JPA query fo… |
| web-80 | Critical | 8 | 2026-06-03 23:38 | Seen this week | TraefikServiceHighErrorRate (1), TraefikServiceHighLatency (7) | None | No thread note |
| core-grafana-80 | Critical | 3 | 2026-06-03 18:04 | Seen this week | TraefikServiceHighErrorRate (3) | None | No thread note |
| accommodations-api | Critical | 2 | 2026-06-04 14:40 | Seen this week | TraefikServiceHighErrorRate (2) | None | No thread note |
| metrics-server | Warning | 27 | 2026-06-08 00:20 | Seen today | KubeAggregatedAPIDown (27) | None | No thread note |
| admission-api | Warning | 6 | 2026-06-07 21:40 | Seen today | TraefikServiceHighLatency (6) | None | No thread note |
| ai-api | Warning | 11 | 2026-06-05 17:46 | Recent (72h) | TraefikServiceHighLatency (11) | None | No thread note |
| grafana | Warning | 24 | 2026-06-04 20:50 | Seen this week | KubeMemoryOvercommit (7), KubeNodeEviction (4), NodeSystemSaturation (3), KubeHpaMaxedOut (2), AlertmanagerFailedToSendAlerts (1) | General investigation | Brief node-exporter target gap caused by Cluster Autoscaler scaling down worker node 10.66.126.144; the node was cordoned, drained, and remo… |
| social-api | Warning | 8 | 2026-06-02 14:16 | Seen this week | TraefikServiceHighLatency (8) | None | No thread note |
| docgen2-api | Warning | 6 | 2026-06-02 16:17 | Seen this week | KubeHpaMaxedOut (6) | None | No thread note |
| subscriptions-api | Warning | 2 | 2026-06-02 12:56 | Seen this week | TraefikServiceHighLatency (2) | None | No thread note |
| accommodations-sync-users | Warning | 1 | 2026-06-03 14:10 | Seen this week | KubeJobFailed (1) | None | No thread note |
| colecteaza-sms-note-abs | Warning | 1 | 2026-06-03 14:10 | Seen this week | KubeJobFailed (1) | None | No thread note |
| minicrm-sync | Warning | 1 | 2026-06-03 14:10 | Seen this week | KubeJobFailed (1) | None | No thread note |
| notifications-event-manager | Warning | 1 | 2026-06-02 12:34 | Seen this week | KubeHpaMaxedOut (1) | None | No thread note |
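
The attribution rule described for the table above can be sketched in Python as follows. This is a minimal illustration, not the report generator's actual code: the function names, the token-based name matching, and the fallback of keeping observability components when no more specific target is named are all assumptions.

```python
import re

# Components the report treats as observability tooling (from the text above).
OBSERVABILITY_COMPONENTS = {"grafana", "loki", "tempo"}


def is_observability(target: str) -> bool:
    """Return True if any name token matches an observability component.

    Assumption: names are split on '-', '_' and '.', so 'core-grafana-80'
    matches 'grafana' while an unrelated name such as 'temporal-worker' does not.
    """
    tokens = re.split(r"[-_.]", target.lower())
    return any(token in OBSERVABILITY_COMPONENTS for token in tokens)


def attribute_targets(named_targets: list[str]) -> list[str]:
    """Pick the targets an alert should be attributed to.

    Observability components are dropped only when a more specific
    target is also named; otherwise they are kept as-is (assumed fallback).
    """
    specific = [t for t in named_targets if not is_observability(t)]
    return specific if specific else list(named_targets)
```

Under these assumptions, an alert naming both `grafana` and `uni-api` is attributed to `uni-api` only, while an alert naming only `grafana` stays attributed to `grafana`, which matches how `grafana` and `core-grafana-80` still appear as rows in the table.
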
| Alert | Severity | Count | Last Seen | Status | Threads | Top Impacted Services | Discussion Signal | Latest Thread Note |
|---|---|---|---|---|---|---|---|---|
| TraefikServiceHighErrorRate | Critical | 7 | 2026-06-05 09:40 | Recent (72h) | 1 | core-grafana-80 (3), accommodations-api (2), web-80 (1), uni-api (1) | General investigation | @U09JYAWCGLB \| uni-api on TUIASI is returning HTTP 500 from StudentServiceImpl.getCatalogCloseStatus and updateCatalogCloseStatus because the JPA query fo… |
| KubeAggregatedAPIDown | Warning | 27 | 2026-06-08 00:20 | Seen today | 0 | metrics-server (27) | None | |
| TraefikServiceHighLatency | Warning | 21 | 2026-06-07 21:40 | Seen today | 0 | ai-api (11), social-api (8), web-80 (7), admission-api (6), subscriptions-api (2) | None | |
| KubeHpaMaxedOut | Warning | 9 | 2026-06-03 16:34 | Seen this week | 0 | docgen2-api (6), grafana (2), notifications-event-manager (1) | None | |
| KubeMemoryOvercommit | Warning | 7 | 2026-06-04 19:08 | Seen this week | 0 | grafana (7) | None | |
| KubeNodeEviction | Warning | 4 | 2026-06-04 20:50 | Seen this week | 0 | grafana (4) | None | |
| NodeSystemSaturation | Warning | 3 | 2026-06-02 12:28 | Seen this week | 0 | grafana (3) | None | |
| KubeJobFailed | Warning | 1 | 2026-06-03 14:10 | Seen this week | 0 | accommodations-sync-users (1), colecteaza-sms-note-abs (1), minicrm-sync (1) | None | |
| AlertmanagerFailedToSendAlerts | Warning | 1 | 2026-06-03 14:10 | Seen this week | 0 | grafana (1) | None | |
| TargetDown | Warning | 1 | 2026-06-01 11:34 | Seen this week | 1 | grafana (1) | General investigation | Brief node-exporter target gap caused by Cluster Autoscaler scaling down worker node 10.66.126.144; the node was cordoned, drained, and remo… |
| KubeletServerCertificateExpiration | Warning | 6 | 2026-05-31 11:22 | No recent signal | 1 | grafana (6) | General investigation | Cert auto-rotated over the weekend |
Status is heuristic. Slack rarely posts explicit resolutions, so “Seen today” or “Recent” means the alert family still appeared in production recently, not that it is definitely unresolved.
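
As a rough illustration of this heuristic, the status labels could be derived from the age of the last sighting. The cut-offs below (24 hours, 72 hours, 7 days) are assumptions: they are consistent with the rows in this report when measured from the generation time 2026-06-08 02:19, but the generator's real thresholds and boundary handling are not documented here, and `status_label` is a hypothetical name.

```python
from datetime import datetime, timedelta


def status_label(last_seen: datetime, now: datetime) -> str:
    """Map the age of the last sighting to a status label.

    All thresholds are assumed, not taken from the generator.
    The label says how recently the alert family appeared,
    not whether the underlying problem is resolved.
    """
    age = now - last_seen
    if age <= timedelta(hours=24):
        return "Seen today"
    if age <= timedelta(hours=72):
        return "Recent (72h)"
    if age <= timedelta(days=7):
        return "Seen this week"
    return "No recent signal"


generated_at = datetime(2026, 6, 8, 2, 19)
print(status_label(datetime(2026, 6, 7, 21, 40), generated_at))
```
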
| AWS Alarm | Emails | ALARM | OK | State Flips | First Seen | Last Seen | Latest State | Status |
|---|---|---|---|---|---|---|---|---|
| adservio-rds-postgres-billing-cpu-high | 6 | 3 | 3 | 5 | 2026-06-01 10:11 | 2026-06-03 11:53 | OK | Latest OK |
| adservio-rds-mysql-master-disk-queue-high | 2 | 1 | 1 | 1 | 2026-06-06 04:22 | 2026-06-06 04:23 | OK | Latest OK |
| adservio-rds-postgres-billing-disk-queue-high | 2 | 1 | 1 | 1 | 2026-06-01 10:09 | 2026-06-01 10:13 | OK | Latest OK |
“Flapping, latest OK” means the most recent email was an OK, but the alarm toggled repeatedly and is still a reliability concern.
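
The flapping idea can be sketched as counting state transitions in the sequence of alarm emails. This is only a sketch under stated assumptions: `count_state_flips`, `alarm_status`, and the `flap_threshold` default of 6 are hypothetical. The report does not state the real cut-off; the default was chosen only so that an alarm with 5 flips, like the first row above, still reads "Latest OK". The real rule may also depend on a time window.

```python
def count_state_flips(states: list[str]) -> int:
    """Count transitions between consecutive states, e.g. ALARM -> OK."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def alarm_status(states: list[str], flap_threshold: int = 6) -> str:
    """Summarize an alarm from its chronological list of states.

    flap_threshold is an assumed value, not taken from the report generator.
    """
    if not states:
        return "No data"
    latest = states[-1]
    if count_state_flips(states) >= flap_threshold:
        return f"Flapping, latest {latest}"
    return f"Latest {latest}"
```

For example, three ALARM/OK pairs in alternation give 5 flips and "Latest OK" under this default, while four such pairs give 7 flips and "Flapping, latest OK".
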
| Thread Date | Alert | Severity | Services | Signal | Key Notes |
|---|---|---|---|---|---|
| 2026-06-05 09:40 | TraefikServiceHighErrorRate | Critical | uni-api | General investigation | @U09JYAWCGLB \| uni-api on TUIASI is returning HTTP 500 from StudentServiceImpl.getCatalogCloseStatus and updateCatalogCloseStatus because the JPA query fo… |
| 2026-06-01 11:34 | TargetDown | Warning | grafana | General investigation | Brief node-exporter target gap caused by Cluster Autoscaler scaling down worker node 10.66.126.144; the node was cordoned, drained, and remo… |
| 2026-05-31 11:22 | KubeletServerCertificateExpiration | Warning | grafana | General investigation | Cert auto-rotated over the weekend |