How to Fix SDDC Manager Upgrade Failures Caused by Deployment Drift

May 30, 2025

Upgrading your VMware Cloud Foundation (VCF) environment is a critical maintenance task, but it doesn’t always go as planned. A particularly frustrating issue many administrators encounter is the failure of an SDDC Manager upgrade, where key services like domainmanager and operationsmanager refuse to start. The culprit is often a subtle problem known as “SDDC Manager Deployment Drift.”

If you’re staring at a failed upgrade and pulling your hair out, you’re in the right place. This guide will walk you through the exact steps to diagnose and resolve this common issue, getting your VCF environment back on the path to a successful upgrade.

Understanding the Problem: Why Does the Upgrade Fail?

During the SDDC Manager upgrade process, the system performs a series of checks and updates. As part of this workflow, the SDDC Manager appliance is often rebooted. The “Deployment Drift” issue typically arises after this reboot when certain services fail to start correctly.

When you investigate, you’ll likely see the domainmanager and operationsmanager services in a stopped state. Digging into the logs, specifically /var/log/vmware/vcf/commonsvcs/commonsvcs.log, will reveal errors pointing to a failure to start these services, often linked to file permission problems. This prevents the upgrade from continuing, leaving you in a failed state.
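A quick way to pull the relevant lines out of that log is sketched below. The helper name show_recent_errors and the search patterns are illustrative assumptions, not strings taken from a real SDDC Manager log, so adjust them to match what you actually see.

```shell
#!/usr/bin/env bash
# Print the last N log lines that mention an error or an access problem.
# The grep patterns are guesses for illustration; tune them for your log.
show_recent_errors() {
    local logfile="$1"
    local count="${2:-20}"
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    grep -iE 'error|permission denied|access denied' "$logfile" | tail -n "$count"
}

# Example call on the appliance (log path as given in this article):
# show_recent_errors /var/log/vmware/vcf/commonsvcs/commonsvcs.log 20
```

Filtering first and then taking the tail keeps the output short enough to read in an SSH session.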

The Solution: A Step-by-Step Resolution Guide

Here is a clear, step-by-step process to fix the deployment drift issue and get your upgrade running successfully.

Step 1: Safety First – Take a Snapshot

Before you begin any troubleshooting, the most important first step is to take a snapshot of the SDDC Manager virtual machine. This is your safety net. If any step goes wrong, you can quickly revert to the pre-troubleshooting state without causing further issues.

Step 2: Set the Correct File Ownership and Validate

The core of the problem is that the service accounts that run domainmanager and operationsmanager have lost the necessary read/write access to their own configuration files.

To fix this, SSH to the SDDC Manager appliance as the vcf user, switch to root with su, and run the following commands to restore the correct ownership:
chown vcf_domainmanager:vcf /etc/vmware/vcf/domainmanager/application.properties
chown vcf_operationsmanager:vcf /etc/vmware/vcf/operationsmanager/application.properties

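These commands set the owner of each file to its service account (vcf_domainmanager or vcf_operationsmanager) and the group to vcf, which is the expected configuration. Note that chown changes ownership only; it does not alter the file's read/write permission bits.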

Validate that the ownership has changed:

ls -lrt /etc/vmware/vcf/domainmanager
ls -lrt /etc/vmware/vcf/operationsmanager
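If you would rather not read the ls output by eye, the check can be scripted. This is a minimal sketch: check_owner is a hypothetical helper, and it assumes GNU stat (the -c option) is available on the appliance.

```shell
#!/usr/bin/env bash
# Compare a file's owner:group with an expected value.
# Returns 0 on match, 1 on mismatch, 2 if stat cannot read the file.
check_owner() {
    local file="$1"
    local expected="$2"
    local actual
    actual=$(stat -c '%U:%G' "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file ($actual)"
    else
        echo "MISMATCH $file (found $actual, expected $expected)"
        return 1
    fi
}

# Example calls on the appliance (owners taken from the chown commands in this step):
# check_owner /etc/vmware/vcf/domainmanager/application.properties vcf_domainmanager:vcf
# check_owner /etc/vmware/vcf/operationsmanager/application.properties vcf_operationsmanager:vcf
```

A non-zero return code makes the check easy to chain, for example with && before restarting the services.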

Step 3: Restart Services and Upgrade the SDDC Manager

With the permissions corrected, you can now restart the services manually to confirm they start correctly:
systemctl restart domainmanager
systemctl restart operationsmanager
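One way to confirm that both services came up is to ask systemd directly. The sketch below uses systemctl is-active; the helper name services_active is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Report whether each named systemd unit is active.
# Returns non-zero if at least one unit is not active.
services_active() {
    local rc=0
    local svc
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            rc=1
        fi
    done
    return "$rc"
}

# Example call on the appliance:
# services_active domainmanager operationsmanager
```

If either service reports NOT active, systemctl status or journalctl -u with the service name should show why before you retry the upgrade.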

After confirming the services are running, navigate back to the SDDC Manager UI and click UPGRADE. The upgrade process should now proceed past the point of failure.

Step 4: Clean Up Any Stale Backup Folders Left Behind by Failed Backup Operations

To prevent this issue in future SDDC Manager updates, clean up any stale backup folders left behind by failed backup operations. Type the path up to backup- and press Tab to complete the folder name:

rm -rf /var/log/vmware/vcf/sddc-support/backup-<folder name>
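Because rm -rf cannot be undone, it can help to list the candidate folders first and delete them one at a time. The following is a small sketch; list_backup_dirs is a hypothetical helper, and it only prints directories whose names start with backup- directly under the given path.

```shell
#!/usr/bin/env bash
# List backup-* directories directly under a base directory, sorted by name.
# Nothing is deleted; this only prints candidates for manual review.
list_backup_dirs() {
    local base="$1"
    find "$base" -maxdepth 1 -type d -name 'backup-*' -print | sort
}

# Example call on the appliance (path as given in this article):
# list_backup_dirs /var/log/vmware/vcf/sddc-support
```

Review the list and remove only the folders you are sure came from failed backup operations.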

Conclusion

While a “Deployment Drift” failure during an SDDC Manager upgrade can be alarming, the fix is usually straightforward. By carefully resetting the file permissions, you can overcome this hurdle and complete your VCF upgrade successfully.

Always remember to start with a snapshot, and you can troubleshoot with confidence!

Published On: May 30, 2025. Categories: VMware Cloud Foundation.