Refactoring community.vmware.vmware_guest in vmware.vmware

Hello vmware users!

As we migrate modules from community.vmware to vmware.vmware, we have the opportunity to improve upon them. I’d like to discuss improving the vmware_guest module, either by making it easier to maintain, use, or improving the user experience.

The module is very, very long. The code will likely be broken up and simplified, and if the code is being split we may be able to split the module functionally as well. It seems like there are a lot of other modules that already do part of the functionality (vmware_guest_disk for example). The new vmware_guest module could just manage basic state of a VM and then you’d have to use other modules to manage disks, networks, etc. I think @mariolenz mentioned playing around with this idea, and I’d like to know if anyone has already taken this approach…

If you have any thoughts, please share them!

3 Likes

community.vmware.vmware_guest is far too complex. It tries to manage more or less everything that you can do to with a VM. This list isn’t complete, but I would like to give some examples:

  • Creating VMs, cloning from a VM and cloning from a template
  • State: Not only if a VM should exist or not, but also the power state (on, off, suspended… triggering one-off actions like rebbooting and similar)
  • NICs / networking
  • HBAs, disks, CDROMs…
  • OS customization
  • Marking a VM as a template and vice versa

The really hard thing will be to decide what the module should do and what’s out of scope / other modules should do.

I’ve tried to come up with a solution, but this isn’t easy. Although… I’ve tried to remove things. Maybe I should start with an MVP like community.vmware.vm and then add stuff that looks really necessary :thinking:

3 Likes

Just looking at the list of modules we already have (and pretending vmware_guest doesnt exist):

  • Nothing?
  • vcenter_vm_guest_power, vcenter_vm_power
  • vmware_guest_network
  • vmware_guest_disk, vmware_guest_controller, vmware_guest_serial_port
  • vcenter_vm_guest_customization
  • Nothing?

So two pretty obvious gaps in functionality. I think Marking a VM as a template and vice versa is clear enough to create a module. I can create a task in vmware.vmware to make this a thing

Creating VMs, cloning from a VM and cloning from a template definitely needs some fleshing out. I think theres at least two modules there, one for cloning and one for bare bones deployments/vm state management. But like you, Im not sure with the minimum requirements for a deployment will be

My take on this subject is that splitting a module like vmware_guest into multiple, more generic, modules will lead to complexity on user side.

Let me explain on an example. If vmware_guest was to be split into separate module for handling disks, VM cloning (from template or other VM), network attachments, OS customization and other, then instantiation of VM would be a multiple step/multiple task process. Not only that but you would have to keep track of info like object UUIDs that you need to pass from task to task. All of this makes things much harder for users. Now they have to have higher programming skills. Also, in certain cases this multiple step process can be a headache for idempotency. I struggled a lot with ovirt_vm module which does not work with disks and you are required to use another module ovirt_disk. Though, I admit oVirt itself has a lot of it’s own quirks compared to VMware.

A single module that handles all of the complexity of instantiation of a VM in a single step is just much more user friendly.

For the code bloat of the vmware_guest module, I support the idea to split common functionality, shared with other modules, into module_utils, but changing the interface of the vmware_guest module in any significant way, no. I don’t see that as “improving the user experience”.

This module is buggy as hell and pretty much unmaintainable. Please trust me, I’m the maintainer of community.vmware. This might be more user friendly, but it’s also an accident waiting to happen… personally, I value stability over user friendly.

I don’t understand too much of the code which I think is a real problem (reminder: I’m maintaining community.vmware). If you run this in production, you really should have a bad feeling if the maintainer says “I don’t understand half of this”.

Maybe @mikemorency feels happy about the code and just copies it over to vmware.vmware… then I’m out of the discussion because it will be their problem :smirk:

I didn’t have a closer look at this, but maybe we can split up the functionality of the module into several ones and create a role that combines them to achieve what community.vmware.vmware_guestdoes now? This would make the code more maintainable and still keep usability.

1 Like

@mariolenz I completely agree with you about vmware_guest being buggy, about code being hard to understand and hard to maintain. Hell, I even had to locally patch a few things myself in the module or find some workarounds for it’s quirks. My point is that module interface should not change significantly.

In my opinion, a user should be able to tell the module:

  • Name of the VM
  • Location of the VM (datacenter, cluster, folder…)
  • VM template to use
  • Which additional disks to create and attach to the VM, including size and datastore
  • Which network interfaces to create and to which networks to attach them
  • Which OS parameters to set trough OS customization like hostname, networking parameters etc.

And for VM to be provisioned in one go. Not to assemble VM from pieces :).

While this can be achieved with a role, it’s way more practical to implement any complex data processing, data passing and possibly API calls in Python than to do it in Ansible tasks or Jinja.

All this comes from my experience with vmware_guest and similar modules for other virtualization platforms and implementing roles for handling VM provisioning. I had more trouble with modules with split functionality. I may be wrong but…

I think this is a good approach. We also document use cases in the collection README now, which outlines the modules you might use to achieve a goal. That would help new users figure out what they want to do.

When we talk about the “learning curve” required to create a VM, I think the main difference in these two approaches is knowing what module(s) to use. In both cases, the user will be defining the VM as a block of yaml and either passing it into a single module or breaking it up and passing it into multiple modules. A role would significantly reduce the differences for the user since they no longer need to know what module to call.

I had more trouble with modules with split functionality

@bvitnik can you expand on some of the issues?

My preference would be to split the vmware_guest module into smaller modules that, when named well, do discreet things well. I’m not as concerned about getting module results and carrying them through my roles as this is a pretty normal process in other environments like Azure and OCI.

Creating a VM in my job is the culmination of the process, not the first step. There are several verification tasks the preced this step like:

  • The vCenter exists
  • The Content library exists
  • The requested template/vm image exists
  • The requested virtual DC exists
  • The requested VLAN/DVS exists
  • Find some storage
  • Create the VM
  • Customize…

Each of these returns vars that need to be referenced later.

One of the advantages of more tasks with smaller scope is you know exactly what the module is being called for and what the Ansible dev intended by invoking it. If I set up a bunch of vars and then call vmware_guest I have to study the role to see what the effect of the call might be on an existing VM. IMO, more maintainable code.

If you want to clone from another VM or a template, and also want to have OS customization: Those are one-off actions. It’s OK to have modules for this, but I think it might be helpful to separate imperative (like reboot, or cloning) from declarative (like number of vCPUs) things. Just my opinion.

BTW It’s great to see so much feedback, thanks all! I hope the next Bullhorn will mention this thread and we get even more :smiley:

2 Likes

I can agree on this. For example, imperative stuff like power management is already split into separate module vmware_guest_powerstate. The explanation I was given by Abhijeet Kasurde on why power management was also kept in vmware_guest is to not break compatibility for people who use this functionality in vmware_guest.

Here are some scenarios that are problematic. It mostly boils down to being hard to maintain idempotency if vmware_guest is to be replaced by some role.

Scenario 1 - VM provisioning (e.g. from template) and disk management is split into separate modules:

The problematic thing here is that VM comes with at least one disk (possibly multiple) in a template itself. Imagine your module for VM provisioning does not have any disk management at all. Once you provision a VM, you are expected to call another module to add additional disks and/or to manage template provided disk. Imagine you have multiple data stores and you keep the template on one but want some template provisioned disks to be on another data store and additional disks on yet another data store. If you provision a VM in this scenario, template provided disks will have to end up either on the same data store as template (if you have 0 disk management functionality in a module) or they will all end up on some other data store but still the same for all template provided disks. Once disk management module kicks in, it will have to move some template provided disks to desired data stores and, of course, add additional disks. So there could be some unnecessary disk migration involved. I’m not so familiar with vCenter API but it’s reasonable to assume that you can specify up front where each template provided disk will end up during VM provisioning without the need to move the disks afterwords. In other words, to avoid disk migration, you have to have some disk management functionality inside VM provisioning module itself and, if that is the case, why not let the module do all the disk management required?

Scenario 2 - VM provisioning (e.g. from template) and OS customization is split into separate modules:

You want to provision a VM and you rely on OS customization for network configuration (and also reconfiguration). If you provision a VM from scratch then steps are:

  • Provision VM from template (module 1)
  • Apply OS customization (module 2)
  • Boot the VM (module 3)

OS customization will be done on first boot.

Now imagine you want to recustomize your already provisioned VM by just changing default GW of the primary network interface and running the same role/play. Now the steps look like this:

  • Is OS recustomization requested? (some change happened?)
  • Apply OS customization (module 2)
  • Is VM powered on or off?
  • If powered off, should I power on the VM to really apply OS customization and shut it down after? (module 3)
  • If powered on, should I reboot the VM to really apply OS customization? (module 3)
  • Wait for reconfiguration to happen
  • Validate reconfiguration

If you want to idempotently support both cases (VM from scratch + reconfiguration) and always only describe the final desired state of the VM instead of doing imperative actions on a VM, then it becomes so much harder to do in Ansible tasks compared to Python code inside a module. Module can in general track VM state more efficiently compared to Ansible tasks.

What kills these scenarios even more are API quirks that possibly don’t present themselves when using separate modules but require workarounds when multiple modules are called one after the other.

Offtopic:

What happened with the original VMware WG mentioned here:

especially the leads and VMware and RedHat employees? Can they provide any insight into this problem?

1 Like

Good point. A module can be idempotent, but if there’s a role calling half a dozen of modules and one of them fails you have an undefined state. Even if every module this role uses is idempotent, the role itself isn’t.

Balancing usability and maintainability looks like a hard problem :cry:

Well, if it would be easy @mikemorency wouldn’t have started this thread.

1 Like

Yes. My point exactly. Idempotency is very hard to achieve outside the module. If you also want to support check mode, it becomes even worse.