Resolved
We are marking this incident as resolved. The incident was caused by capacity constraints following a hardware failure. While capacity has been restored, we are still seeing some usage‑specific constraints with the Llama 3.1 405B Instruct model. Our AI Model Hub team will deploy optimizations to improve the model's performance and reliability. We recommend that users still experiencing issues with the model consider GPT‑OSS 120B as a temporary replacement.
Posted Mar 11, 2026 - 19:20 UTC
Monitoring
Our AI Model Hub team has mitigated the incident. While the underlying root cause is not yet fully established or resolved, the model service should be stable. We are monitoring the situation while the investigation continues.
Posted Mar 09, 2026 - 18:53 UTC
Identified
The team has identified the root cause: hardware degradation affecting this model's hosting environment is causing backend instability. We are currently implementing a fix.
Posted Mar 09, 2026 - 11:52 UTC
Investigating
Our AI Model Hub team is currently working to resolve errors affecting an instance running the Llama 3.1 405B Instruct model.
Posted Mar 09, 2026 - 08:26 UTC
This incident affected: Global Services (AI Model Hub).