Pingdom customer impact
External signal: No Pingdom checks were available in this window.
No active issue listed in this category.
Generated 2026-04-25 16:19 for 2026-04-18 07:00 to 2026-04-25 07:00 from Pingdom checks, Slack #_alerts_prod, and AWS SNS alerts.
Bottom line: application-level critical paths are present.
3 critical, 10 non-critical active item(s).
3 active item(s) in this window.
| Pingdom Check | Status | Events | Downtime | Last Seen | Likely Services | Correlated Evidence |
|---|---|---|---|---|---|---|
Pingdom rows show externally visible signal first. The correlated evidence column helps tie the failing check back to services, Slack alert families, or AWS alarms when those links exist.
This view attributes alerts to the workload or resource named in the alert text. Grafana, Loki, and Tempo are treated as observability components and are excluded when a more specific impacted target is also present.
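The attribution rule above can be sketched roughly as follows. This is a minimal illustration, not the report generator's actual code: the function name `attribute_targets` and the set `OBSERVABILITY_COMPONENTS` are hypothetical, and the sketch assumes the targets named in an alert have already been extracted from its text.

```python
# Hypothetical sketch of the attribution rule; names are illustrative only.
OBSERVABILITY_COMPONENTS = {"grafana", "loki", "tempo"}


def attribute_targets(named_targets):
    """Return the targets an alert is attributed to.

    Observability components are dropped only when at least one more
    specific target is also named; otherwise they are kept as-is.
    """
    specific = [
        t for t in named_targets
        if t.lower() not in OBSERVABILITY_COMPONENTS
    ]
    return specific if specific else list(named_targets)


# An alert naming both grafana and ai-api is attributed to ai-api only.
print(attribute_targets(["grafana", "ai-api"]))
# An alert naming only grafana stays attributed to grafana.
print(attribute_targets(["grafana"]))
```

Under this reading, a `grafana` row in the table below means the alert text named no more specific workload.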
| Impacted Service / Resource | Highest Severity | Count | Last Seen | Status | Top Alert Types | Discussion Signal | Latest Thread Note |
|---|---|---|---|---|---|---|---|
| admission-api | Critical | 3 | 2026-04-21 09:33 | Seen this week | TraefikServiceHighErrorRate (3) | General investigation | @U09JYAWCGLB we know about this, discussing in daily with the devs \| Hi Raul Popovici sure thanks |
| core-grafana-80 | Critical | 3 | 2026-04-21 00:41 | Seen this week | TraefikServiceHighErrorRate (3) | None | No thread note |
| uni-api | Critical | 1 | 2026-04-21 12:41 | Seen this week | TraefikServiceHighErrorRate (1) | None | No thread note |
| grafana | Warning | 26 | 2026-04-24 12:29 | Seen today | KubeCPUOvercommit (24), KubeContainerWaiting (2) | None | No thread note |
| ai-api | Warning | 4 | 2026-04-24 17:35 | Seen today | TraefikServiceHighLatency (4) | General investigation | some POSTs to ai summaries seem to take 5s or more. But we don't have traces in ai-api pod to see if it is normal (summaries could indeed t… |
| social-api | Warning | 3 | 2026-04-24 15:07 | Seen today | TraefikServiceHighLatency (3) | Resource limits | *Root cause summary* \| This looks like a slow successful `social-api` news detail path, not a crash, 5xx, or resource saturation issue. The… |
| update-recurenta (grouped: 4 variants, 28 variant mentions, 4 active variants) | Warning | 28 | 2026-04-22 13:48 | Recent (72h) | KubeJobFailed (27), KubePodCrashLooping (1) | None | No thread note |
| docgen2-api | Warning | 4 | 2026-04-22 09:47 | Recent (72h) | KubeHpaMaxedOut (4) | None | No thread note |
| colecteaza-sms-note-abs | Warning | 14 | 2026-04-20 11:53 | Seen this week | KubeJobFailed (14) | None | No thread note |
| download-album | Warning | 14 | 2026-04-20 11:53 | Seen this week | KubeJobFailed (14) | None | No thread note |
| rezumat | Warning | 14 | 2026-04-20 11:53 | Seen this week | KubeJobFailed (14) | None | No thread note |
| core-getresponse-events-worker | Warning | 2 | 2026-04-20 11:48 | Seen this week | KubeDeploymentReplicasMismatch (2) | None | No thread note |
| admission-migration | Warning | 1 | 2026-04-21 13:18 | Seen this week | KubeJobFailed (1) | None | No thread note |
| forms-api (grouped: 2 variants, 2 variant mentions, 2 active variants) | Warning | 4 | 2026-04-20 17:27 | Likely noise / resolved | KubePodCrashLooping (2), KubeDeploymentRolloutStuck (2) | General investigation, Alert tuning / noise | Investigation summary for `forms-api` \| *What happened* \| - The `forms-api` rollout stalled and triggered `KubeDeploymentRolloutStuck`. \|… \| <!subteam^S0A2M3GD3CN>, @U09JYAWCGLB @U633D9JBW I made this skill that can run automatically as soon as an alert appears in Slack, so it ca… \| we can improve it further based o… |
| Alert | Severity | Count | Last Seen | Status | Threads | Top Impacted Services | Discussion Signal | Latest Thread Note |
|---|---|---|---|---|---|---|---|---|
| TraefikServiceHighErrorRate | Critical | 7 | 2026-04-21 12:41 | Seen this week | 1 | core-grafana-80 (3), admission-api (3), uni-api (1) | General investigation | @U09JYAWCGLB we know about this, discussing in daily with the devs \| Hi Raul Popovici sure thanks |
| KubeCPUOvercommit | Warning | 24 | 2026-04-24 12:29 | Seen today | 0 | grafana (24) | None | |
| TraefikServiceHighLatency | Warning | 7 | 2026-04-24 17:35 | Seen today | 2 | ai-api (4), social-api (3) | Resource limits, General investigation | some POSTs to ai summaries seem to take 5s or more. But we don't have traces in ai-api pod to see if it is normal (summaries could indeed t… |
| KubeJobFailed | Warning | 27 | 2026-04-22 13:48 | Recent (72h) | 0 | update-recurenta (27), colecteaza-sms-note-abs (14), download-album (14), rezumat (14), admission-migration (1) | None | |
| KubeHpaMaxedOut | Warning | 4 | 2026-04-22 09:47 | Recent (72h) | 0 | docgen2-api (4) | None | |
| KubePodCrashLooping | Warning | 3 | 2026-04-20 17:24 | Seen this week | 1 | forms-api (2), update-recurenta (1) | General investigation | Warning Unhealthy 80s (x19 over 3m22s) kubelet spec.containers{forms-api-container}: Readiness probe failed: Get "": dial tcp 10.66.111.240… \| one of the pod is healthy to keep the service alive . \| Thats right |
| KubeContainerWaiting | Warning | 2 | 2026-04-20 12:38 | Seen this week | 0 | grafana (2) | None | |
| KubeDeploymentReplicasMismatch | Warning | 2 | 2026-04-20 11:48 | Seen this week | 0 | core-getresponse-events-worker (2) | None | |
| KubeDeploymentRolloutStuck | Warning | 2 | 2026-04-20 17:27 | Likely noise / resolved | 1 | forms-api (2) | Alert tuning / noise | Investigation summary for `forms-api` \| *What happened* \| - The `forms-api` rollout stalled and triggered `KubeDeploymentRolloutStuck`. \|… \| <!subteam^S0A2M3GD3CN>, @U09JYAWCGLB @U633D9JBW I made this skill that can run automatically as soon as an alert appears in Slack, so it ca… \| we can improve it further based o… |
Status is heuristic. Slack rarely posts explicit resolutions, so “Seen today” or “Recent” means the alert family still appeared in production recently, not that it is definitely unresolved.
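One way the recency labels could be derived is sketched below. The 24-hour and 72-hour cut-offs, and measuring age against the end of the reporting window (2026-04-25 07:00) rather than the generation time, are assumptions inferred from the rows in this report, not confirmed behaviour. The function name `recency_status` is hypothetical, and the “Likely noise / resolved” label, which appears to depend on thread discussion, is not modelled.

```python
from datetime import datetime, timedelta


def recency_status(last_seen, window_end):
    """Map an alert's last-seen time to a recency label.

    Thresholds are assumed (24h and 72h before the window end); they
    match the rows in this report but are not taken from the generator.
    """
    age = window_end - last_seen
    if age <= timedelta(hours=24):
        return "Seen today"
    if age <= timedelta(hours=72):
        return "Recent (72h)"
    return "Seen this week"


window_end = datetime(2026, 4, 25, 7, 0)
# ai-api, last seen 2026-04-24 17:35
print(recency_status(datetime(2026, 4, 24, 17, 35), window_end))
# docgen2-api, last seen 2026-04-22 09:47
print(recency_status(datetime(2026, 4, 22, 9, 47), window_end))
# admission-api, last seen 2026-04-21 09:33
print(recency_status(datetime(2026, 4, 21, 9, 33), window_end))
```

None of these labels implies that the alert is still firing; they only describe how recently the alert family was posted.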
| AWS Alarm | Emails | ALARM | OK | State Flips | First Seen | Last Seen | Latest State | Status |
|---|---|---|---|---|---|---|---|---|
| adservio-root-account-usage | 4 | 2 | 2 | 3 | 2026-04-23 14:38 | 2026-04-23 15:32 | OK | Latest OK |
| adservio-rds-mysql-catalog2-storage-low | 2 | 1 | 1 | 1 | 2026-04-20 13:48 | 2026-04-20 13:52 | OK | Latest OK |
| adservio-rds-mysql-catalog-disk-queue-high | 2 | 1 | 1 | 1 | 2026-04-20 10:13 | 2026-04-20 10:14 | OK | Latest OK |
“Flapping, latest OK” means the most recent email was an OK, but the alarm toggled repeatedly and is still a reliability concern.
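The State Flips column is consistent with counting transitions between consecutive alarm emails, for example ALARM, OK, ALARM, OK gives 3 flips from 4 emails. A minimal sketch of that count and of a flapping label follows. The function names and the `flap_threshold` default of 4 are hypothetical; the report does not state the actual threshold, only that an alarm with 3 flips was still labelled “Latest OK”.

```python
def count_state_flips(states):
    """Count transitions between consecutive alarm states."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def alarm_status(states, flap_threshold=4):
    """Label an alarm from its chronological list of email states.

    flap_threshold is an assumed value, not the generator's setting.
    """
    if not states:
        raise ValueError("at least one state is required")
    latest = states[-1]
    if count_state_flips(states) >= flap_threshold:
        return f"Flapping, latest {latest}"
    return f"Latest {latest}"


# Four emails alternating ALARM/OK, as for adservio-root-account-usage.
root_usage = ["ALARM", "OK", "ALARM", "OK"]
print(count_state_flips(root_usage))
print(alarm_status(root_usage))
```

Keeping the threshold as a parameter makes it easy to tighten or loosen the definition of flapping without changing how flips are counted.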
| Thread Date | Alert | Severity | Services | Signal | Key Notes |
|---|---|---|---|---|---|
| 2026-04-24 17:35 | TraefikServiceHighLatency | Warning | ai-api | General investigation | some POSTs to ai summaries seem to take 5s or more. But we don't have traces in ai-api pod to see if it is normal (summaries could indeed t… |
| 2026-04-21 10:19 | TraefikServiceHighLatency | Warning | social-api | Resource limits | *Root cause summary* \| This looks like a slow successful `social-api` news detail path, not a crash, 5xx, or resource saturation issue. The… |
| 2026-04-21 09:33 | TraefikServiceHighErrorRate | Critical | admission-api | General investigation | @U09JYAWCGLB we know about this, discussing in daily with the devs \| Hi Raul Popovici sure thanks |
| 2026-04-20 17:27 | KubeDeploymentRolloutStuck | Warning | forms-api | Alert tuning / noise | Investigation summary for `forms-api` \| *What happened* \| - The `forms-api` rollout stalled and triggered `KubeDeploymentRolloutStuck`. \|… \| <!subteam^S0A2M3GD3CN>, @U09JYAWCGLB @U633D9JBW I made this skill that can run automatically as soon as an alert appears in Slack, so it ca… \| we can improve it further based o… |
| 2026-04-20 16:46 | KubePodCrashLooping | Warning | forms-api | General investigation | Warning Unhealthy 80s (x19 over 3m22s) kubelet spec.containers{forms-api-container}: Readiness probe failed: Get "": dial tcp 10.66.111.240… \| one of the pod is healthy to keep the service alive . \| Thats right |