intro

When using Aria Automation (formerly known as vRealize Automation) to create virtual machines, you might choose to include a cloud-init (Linux) or cloudbase-init (Windows) script in the blueprint to set up the guest operating system(s) following a successful deployment. These scripts are triggered when the VM first boots and can run any system command in the guest's default shell (bash, PowerShell, zsh, etc.). Sadly, this can cause some very unexpected issues…

1: problem

When working on a customer site recently, I was faced with two challenges related to the use of cloud-init and cloudbase-init in the blueprint:

  1. When using an event subscription in Aria Automation to detect when a deployment is complete and trigger an Orchestrator workflow, the runtime of cloud-init and cloudbase-init is not accounted for. This means that post-deployment automation might fail if it relies on guest customization done by these scripts. For example, Orchestrator might fail to add the newly deployed system to Active Directory if cloud-init hasn't yet configured an IP and hostname on the guest OS.
  2. Both cloud-init and cloudbase-init rely on custom properties attached to the VMs on deployment. These custom properties (user-data) look like harmless hashes at first glance, but they actually contain the entire cloud-init/cloudbase-init script encoded with base64. Base64 is an encoding, not encryption, so it offers no protection if, for example, the scripts contain passwords or private IPs/ports. Anyone with vCenter access could copy the user-data string and paste it into any base64 decoder to recover the source cloud-init/cloudbase-init script.

Below is an example of what that looks like in vCenter (the original user-data was covered up for privacy reasons): [image: user-data]
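To make the second challenge concrete, here is a minimal sketch of how little effort it takes to reverse base64. It runs in Node.js rather than in Orchestrator (it relies on Node's `Buffer`), and the sample script, its password and the helper names `encodeUserData` and `decodeUserData` are made up for illustration:

```javascript
// Hypothetical helpers: base64 is reversible by anyone, with no key involved.
function encodeUserData(script) {
    return Buffer.from(script, 'utf8').toString('base64');
}

function decodeUserData(encoded) {
    return Buffer.from(encoded, 'base64').toString('utf8');
}

// Made-up cloud-init content, not taken from a real deployment
var sampleScript = '#cloud-config\npassword: NotARealPassword\n';

var encoded = encodeUserData(sampleScript);
console.log(encoded);                 // what the user-data property would hold
console.log(decodeUserData(encoded)); // the original script, recovered in one call
```

The encoded string is what a vCenter user would see in the property; the second call shows that recovering the script needs nothing beyond the string itself.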

2: solution

Thankfully, both of these issues could be tackled simultaneously by a single scripting object in the post-deployment Orchestrator workflow:

  1. To address challenge #1 and detect when the cloud-init/cloudbase-init run is complete, we are going to make the scripts shut down the guest operating system as their very last step. Then, by using a simple while loop, the workflow can poll the VM's power state and wait before proceeding.
  2. Luckily, we also need the VMs to be off to remove the custom property and resolve issue #2. By using the vCenter integration, we can modify the VM object’s vApp spec to remove the custom properties.
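The waiting step from point 1 can be sketched as a small polling helper. This is an illustrative variant, not the code used on the customer site: it adds a timeout so the workflow cannot wait forever if the guest never shuts down, and the helper name `waitForPowerOff` and its parameters are hypothetical. The power-state reader and the sleep function are passed in so the sketch also runs outside Orchestrator:

```javascript
// Hypothetical helper: polls until the VM is no longer powered on, or gives up.
// Returns true if the VM left the 'poweredOn' state, false on timeout.
function waitForPowerOff(getPowerState, sleep, pollMs, timeoutMs) {
    var waited = 0;
    while (getPowerState() === 'poweredOn') {
        if (waited >= timeoutMs) {
            return false; // still powered on after the allowed time
        }
        sleep(pollMs);
        waited += pollMs;
    }
    return true;
}

// Stand-ins for a VM that reports 'poweredOff' on the fourth poll
var polls = 0;
function fakePowerState() {
    polls += 1;
    return polls > 3 ? 'poweredOff' : 'poweredOn';
}
function fakeSleep(ms) { /* no real waiting in this demo */ }

var finished = waitForPowerOff(fakePowerState, fakeSleep, 2000, 600000);
console.log(finished, polls);
```

Inside an Orchestrator scripting object, the two stand-ins would presumably be replaced by a function returning `vm.runtime.powerState.value` and one calling `System.sleep(ms)`.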

3: code

At first, I developed the code below and inserted it as a JavaScript scripting object into the post-deployment workflow:

vmName = inputProperties.resourceNames[0]
// inputProperties are passed on to Orchestrator from the event subscription in Aria Automation

for (var i = 0; i < vms.length; i++) {
    if (vms[i].name == vmName){
      vm = vms[i] // match the VM name against the VMs discovered in the inventory and retrieve the VM object
    }
}

var vm_status = vm.runtime.powerState.value
// the loop below pauses the workflow until the VM's power state changes
while (vm_status == 'poweredOn') {
    System.log(vm_status)
    System.sleep(2000)
    vm_status = vm.runtime.powerState.value
}

var allProps = vm.config.vAppConfig.property
var vmSpec = new VcVirtualMachineConfigSpec()
vmSpec.vAppConfig = new VcVmConfigSpec()
// retrieve the current vApp properties from the VM object and create a new blank spec to apply our changes

for (var i = 0; i < allProps.length; i++) {
     var dellIndex = allProps[i].key
     var property = new Array()
     property[i] = new VcVAppPropertySpec()
     property[i].operation = VcArrayUpdateOperation.remove
     property[i].removeKey_IntValue = dellIndex
     // the above flags properties for removal
    }

// lastly, run a reconfig task and power on the VM
vmSpec.vAppConfig.property = property
var vmReconfigTask = vm.reconfigVM_Task(vmSpec)
vm.powerOnVM_Task()

Unfortunately, while the power-state check worked fine, we discovered an issue when the target VM had more than one custom property. For example, some VMs had not only the user-data property originating from cloud-init/cloudbase-init but also administrator and password properties added by other automation tooling.

When a target VM had more than one custom property, the code above was only able to remove one of them and then failed on the second iteration of the for loop. After some troubleshooting and debugging, I created an improved version of the script which correctly iterates over all custom properties and removes them one by one:

vmName = inputProperties.resourceNames[0]
for (var i = 0; i < vms.length; i++) {
    if (vms[i].name == vmName){
      vm = vms[i] 
    }
}

var vm_status = vm.runtime.powerState.value
// wait until the VM is no longer powered on
while (vm_status == 'poweredOn') {
    System.log(vm_status)
    System.sleep(2000)
    vm_status = vm.runtime.powerState.value
}

var allProps = vm.config.vAppConfig.property
// remove the custom properties one at a time, each with its own spec and reconfig task
for (var i = 0; i < allProps.length; i++) {
    var vmSpec = new VcVirtualMachineConfigSpec()
    vmSpec.vAppConfig = new VcVmConfigSpec()
    var dellIndex = allProps[i].key
    var property = new Array()
    // a single-element array, so index 0 is always populated
    property[0] = new VcVAppPropertySpec()
    property[0].operation = VcArrayUpdateOperation.remove
    property[0].removeKey_IntValue = dellIndex

    vmSpec.vAppConfig.property = property
    var vmReconfigTask = vm.reconfigVM_Task(vmSpec)
}

// lastly, power the VM back on
vm.powerOnVM_Task()
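
One detail of the first version is easy to reproduce in plain JavaScript: because the array is re-created inside the loop and then filled at index `i`, only the last index ends up populated. The sketch below shows that, next to a variant that creates the array once. It runs outside Orchestrator, plain objects stand in for `VcVAppPropertySpec`, and the function names and keys are hypothetical. Whether this is the whole explanation for the failure described above, and whether vCenter accepts all removals in a single reconfig, are assumptions not verified here.

```javascript
// Mirrors the first version: a fresh array on every pass, filled at index i.
function buildSpecsFirstVersion(keys) {
    for (var i = 0; i < keys.length; i++) {
        var property = new Array();
        property[i] = { operation: 'remove', removeKey: keys[i] };
    }
    return property; // only the last index holds a value
}

// Variant: create the array once and append one removal entry per key.
function buildSpecsOnce(keys) {
    var property = [];
    for (var i = 0; i < keys.length; i++) {
        property.push({ operation: 'remove', removeKey: keys[i] });
    }
    return property;
}

var keys = [0, 1, 2]; // hypothetical vApp property keys
var first = buildSpecsFirstVersion(keys);
var once = buildSpecsOnce(keys);

console.log(first.length, (0 in first), (2 in first)); // holes before the last entry
console.log(once.length, once[0].removeKey, once[2].removeKey);
```

The improved script avoids the problem differently, by always writing to index 0 of a one-element array and reconfiguring once per property.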