What happened?
On 15 January 2026, we experienced a network incident that caused a loss of connectivity to key management devices in Berlin and disrupted both Private Cloud management and production traffic.
How did this happen? (Technical Root Cause)
The incident occurred during a planned maintenance window intended to introduce a new VLAN to the management devices.
Human error during command execution led to unexpected behavior: instead of appending the new VLAN to the trunk's allowed list, the command replaced the existing VLANs on the trunk between the core and the management devices. Consequently, only the new VLAN remained active.
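For illustration: on many switch platforms, the command that sets a trunk's allowed-VLAN list has both a replace form and an append form, and the replace form silently discards the existing list. A minimal sketch of the two semantics, with hypothetical VLAN IDs rather than the production values:

```python
# Sketch of replace-vs-append trunk semantics; all VLAN IDs are hypothetical.

existing_vlans = {100, 110, 120}  # VLANs already allowed on the trunk
new_vlan = 200                    # VLAN the maintenance intended to add

# Intended operation: append, preserving the existing allowed list.
appended = existing_vlans | {new_vlan}
assert appended == {100, 110, 120, 200}

# Executed operation: replace, overwriting the allowed list entirely.
overwritten = {new_vlan}
assert overwritten == {200}  # only the new VLAN remained active, as in the incident
```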
The removal of these VLANs led to the following cascading outages:
- Management Loss: Loss of access to multiple management interfaces in the Berlin region.
- Storage Disruption: The affected management devices hosted interfaces for internal storage clusters, so when the associated VLANs became unreachable, the storage clusters were disconnected. Restarting the affected services during the final stages of recovery also caused a brief secondary outage.
- Data Integrity Protection: This disconnection forced several VMs and Edge Nodes into read-only mode to prevent data corruption.
- Routing Issues: Because the NSX Edge Nodes (which bridge the virtual networks with the physical environment) were impacted, external routing was lost, cutting connectivity between the virtual machines and the internet.
What are we doing to prevent this from happening again?
To improve network resilience and prevent a recurrence, we have identified the following action items:
- Architecture Review: We will conduct a comprehensive audit of the management device architecture to ensure better segmentation. (Target: Q3 2026)
- VLAN Configuration Mitigation: We will evaluate and implement safeguards against unintended VLAN changes, such as stricter configuration guardrails or automation scripts; a sketch of one possible safeguard follows this list. (Target: Q1 2026)
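As an illustration of the second item, the sketch below shows one possible shape for such a safeguard: it diffs a proposed allowed-VLAN list against the running configuration and blocks any change that would silently remove VLANs. The function name, values, and workflow here are assumptions for this sketch, not our final implementation.

```python
# Minimal sketch of a pre-change VLAN safeguard; names and values are
# hypothetical placeholders, not our actual tooling or configuration.

def check_trunk_change(current_vlans: set, proposed_vlans: set) -> None:
    """Refuse a trunk update that would silently drop VLANs."""
    dropped = current_vlans - proposed_vlans
    if dropped:
        raise RuntimeError(
            f"Proposed change would remove VLANs {sorted(dropped)}; "
            "use an explicit removal workflow instead."
        )

# The January incident pattern would be caught before deployment:
current = {100, 110, 120}   # illustrative allowed list on the core trunk
proposed = {200}            # 'replace' form that keeps only the new VLAN

try:
    check_trunk_change(current, proposed)
except RuntimeError as exc:
    print(f"Change blocked: {exc}")

# The intended 'append' form passes the check.
check_trunk_change(current, current | {200})
```

Requiring removals to go through a separate, explicit workflow means an operator cannot reproduce the replace-for-append mistake with a single command.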