What happened: On November 26, 2025, at 12:19 UTC, we observed a decline in usage-related metrics in our AI Model Hub. This was a result of connectivity issues in the secure connection to our backend inference engines.
What we did: After resolving the configuration issue the environment recovered.
What we are doing to avoid recurrence:
To spot a decline in usage metrics more quickly, we have updated our alerting thresholds.
Automatic health checks will be expanded.
Posted Nov 26, 2025 - 14:24 UTC
Resolved
All metrics have returned to baseline. We are marking this incident as resolved. We will publish a Root Cause Analysis here shortly.
Posted Nov 26, 2025 - 13:46 UTC
Monitoring
The Infrastructure Team has deployed the fix, and we are seeing improvements in availability. We are monitoring the situation until all metrics return to expected baselines.
Posted Nov 26, 2025 - 13:28 UTC
Identified
We have identified a connectivity issue preventing communication to the inferencing backends. The Infrastructure Team is currently working on a fix. We will provide the next update by 13:45 UTC.
Posted Nov 26, 2025 - 13:15 UTC
Update
We are updating the impact to outage. The AI Model Hub Team is currently investigating and is working to restore the service as quickly as possible. We will post the next update here 13:15 UTC latest.
Posted Nov 26, 2025 - 12:43 UTC
Update
We are continuing to investigate this issue.
Posted Nov 26, 2025 - 12:40 UTC
Investigating
We are currently investigating increased error rate in our AI Model Hub.
Posted Nov 26, 2025 - 12:40 UTC
This incident affected: Global Services (AI Model Hub).