Resolved -
This incident is now marked resolved, as all affected control planes have returned to and stayed in a stable state.
A Root Cause Analysis (RCA) is underway and will be published here once finalized.
Mar 17, 17:04 UTC
Monitoring -
Final changes have been applied to all affected clusters, resolving the issue. We are now monitoring the progress.
Mar 16, 19:59 UTC
Update -
Final changes have been applied to all affected clusters, resolving the issue. We are now monitoring the progress.
Mar 16, 19:58 UTC
Update -
The changes applied to some control plane clusters have had positive effects. The team is continuing the rollout to other affected clusters.
Mar 16, 19:34 UTC
Update -
We are marking DBaaS as recovered.
Our Kubernetes Team is currently working on stabilizing the Kubernetes Control Plane. We are focusing on mitigating recurring load spikes that affect its stability.
Mar 16, 13:56 UTC
Update -
We are marking the Container Registry Service as recovered.
Mar 16, 12:57 UTC
Update -
We are closing the incident for the AI Model Hub. All metrics have recovered and the service should be operating normally again.
Mar 16, 12:22 UTC
Update -
We are adding the Container Registry as an affected service. Customers may currently experience issues pulling images from and pushing images to the Registry.
Mar 16, 11:57 UTC
Update -
Our Kubernetes Team has deployed a fix for the affected AI Model Hub Database Services. We currently see metrics improving and are monitoring the situation closely.
Mar 16, 11:37 UTC
Update -
We are expanding the scope of this incident to include DBaaS and AI Model Hub. We have observed an increased error count originating from PostgresDB on Kubernetes. Additionally, to improve transparency, the previously reported separate incident regarding the AI Model Hub (https://status.ionos.cloud/incidents/rmgs845klm32) is being merged into this primary incident.
Mar 16, 11:05 UTC
Identified -
The team has identified the root cause as a resource constraint within the etcd database. Mitigation efforts are currently underway.
Mar 16, 10:04 UTC
Investigating -
Some customers may experience connection problems to the control plane and degraded functionality of Kubernetes.
Our teams are investigating and working on a resolution.
Mar 16, 08:24 UTC