Refactoring community.vmware.vmware_guest in vmware.vmware

mikemorency · January 17, 2025, 6:11pm

Hello vmware users!

As we migrate modules from community.vmware to vmware.vmware, we have the opportunity to improve upon them. I’d like to discuss improving the vmware_guest module, either by making it easier to maintain, use, or improving the user experience.

The module is very, very long. The code will likely be broken up and simplified, and if the code is being split we may be able to split the module functionally as well. It seems like there are a lot of other modules that already do part of the functionality (vmware_guest_disk for example). The new vmware_guest module could just manage basic state of a VM and then you’d have to use other modules to manage disks, networks, etc. I think @mariolenz mentioned playing around with this idea, and I’d like to know if anyone has already taken this approach…

If you have any thoughts, please share them!

mariolenz · January 18, 2025, 1:35pm

community.vmware.vmware_guest is far too complex. It tries to manage more or less everything that you can do to with a VM. This list isn’t complete, but I would like to give some examples:

Creating VMs, cloning from a VM and cloning from a template
State: Not only if a VM should exist or not, but also the power state (on, off, suspended… triggering one-off actions like rebbooting and similar)
NICs / networking
HBAs, disks, CDROMs…
OS customization
Marking a VM as a template and vice versa

The really hard thing will be to decide what the module should do and what’s out of scope / other modules should do.

I’ve tried to come up with a solution, but this isn’t easy. Although… I’ve tried to remove things. Maybe I should start with an MVP like community.vmware.vm and then add stuff that looks really necessary

mikemorency · January 20, 2025, 2:30pm

Just looking at the list of modules we already have (and pretending vmware_guest doesnt exist):

Nothing?
vcenter_vm_guest_power, vcenter_vm_power
vmware_guest_network
vmware_guest_disk, vmware_guest_controller, vmware_guest_serial_port
vcenter_vm_guest_customization
Nothing?

So two pretty obvious gaps in functionality. I think Marking a VM as a template and vice versa is clear enough to create a module. I can create a task in vmware.vmware to make this a thing

Creating VMs, cloning from a VM and cloning from a template definitely needs some fleshing out. I think theres at least two modules there, one for cloning and one for bare bones deployments/vm state management. But like you, Im not sure with the minimum requirements for a deployment will be

bvitnik · January 20, 2025, 7:29pm

My take on this subject is that splitting a module like vmware_guest into multiple, more generic, modules will lead to complexity on user side.

Let me explain on an example. If vmware_guest was to be split into separate module for handling disks, VM cloning (from template or other VM), network attachments, OS customization and other, then instantiation of VM would be a multiple step/multiple task process. Not only that but you would have to keep track of info like object UUIDs that you need to pass from task to task. All of this makes things much harder for users. Now they have to have higher programming skills. Also, in certain cases this multiple step process can be a headache for idempotency. I struggled a lot with ovirt_vm module which does not work with disks and you are required to use another module ovirt_disk. Though, I admit oVirt itself has a lot of it’s own quirks compared to VMware.

A single module that handles all of the complexity of instantiation of a VM in a single step is just much more user friendly.

For the code bloat of the vmware_guest module, I support the idea to split common functionality, shared with other modules, into module_utils, but changing the interface of the vmware_guest module in any significant way, no. I don’t see that as “improving the user experience”.

mariolenz · January 20, 2025, 8:25pm

This module is buggy as hell and pretty much unmaintainable. Please trust me, I’m the maintainer of community.vmware. This might be more user friendly, but it’s also an accident waiting to happen… personally, I value stability over user friendly.

I don’t understand too much of the code which I think is a real problem (reminder: I’m maintaining community.vmware). If you run this in production, you really should have a bad feeling if the maintainer says “I don’t understand half of this”.

Maybe @mikemorency feels happy about the code and just copies it over to vmware.vmware… then I’m out of the discussion because it will be their problem

I didn’t have a closer look at this, but maybe we can split up the functionality of the module into several ones and create a role that combines them to achieve what community.vmware.vmware_guestdoes now? This would make the code more maintainable and still keep usability.

bvitnik · January 20, 2025, 9:15pm

@mariolenz I completely agree with you about vmware_guest being buggy, about code being hard to understand and hard to maintain. Hell, I even had to locally patch a few things myself in the module or find some workarounds for it’s quirks. My point is that module interface should not change significantly.

In my opinion, a user should be able to tell the module:

Name of the VM
Location of the VM (datacenter, cluster, folder…)
VM template to use
Which additional disks to create and attach to the VM, including size and datastore
Which network interfaces to create and to which networks to attach them
Which OS parameters to set trough OS customization like hostname, networking parameters etc.

And for VM to be provisioned in one go. Not to assemble VM from pieces :).

While this can be achieved with a role, it’s way more practical to implement any complex data processing, data passing and possibly API calls in Python than to do it in Ansible tasks or Jinja.

All this comes from my experience with vmware_guest and similar modules for other virtualization platforms and implementing roles for handling VM provisioning. I had more trouble with modules with split functionality. I may be wrong but…

mikemorency · January 21, 2025, 3:27pm

I think this is a good approach. We also document use cases in the collection README now, which outlines the modules you might use to achieve a goal. That would help new users figure out what they want to do.

When we talk about the “learning curve” required to create a VM, I think the main difference in these two approaches is knowing what module(s) to use. In both cases, the user will be defining the VM as a block of yaml and either passing it into a single module or breaking it up and passing it into multiple modules. A role would significantly reduce the differences for the user since they no longer need to know what module to call.

I had more trouble with modules with split functionality

@bvitnik can you expand on some of the issues?

smatterchew · January 21, 2025, 4:16pm

My preference would be to split the vmware_guest module into smaller modules that, when named well, do discreet things well. I’m not as concerned about getting module results and carrying them through my roles as this is a pretty normal process in other environments like Azure and OCI.

Creating a VM in my job is the culmination of the process, not the first step. There are several verification tasks the preced this step like:

The vCenter exists
The Content library exists
The requested template/vm image exists
The requested virtual DC exists
The requested VLAN/DVS exists
Find some storage
Create the VM
Customize…

Each of these returns vars that need to be referenced later.

One of the advantages of more tasks with smaller scope is you know exactly what the module is being called for and what the Ansible dev intended by invoking it. If I set up a bunch of vars and then call vmware_guest I have to study the role to see what the effect of the call might be on an existing VM. IMO, more maintainable code.

mariolenz · January 21, 2025, 6:41pm

If you want to clone from another VM or a template, and also want to have OS customization: Those are one-off actions. It’s OK to have modules for this, but I think it might be helpful to separate imperative (like reboot, or cloning) from declarative (like number of vCPUs) things. Just my opinion.

BTW It’s great to see so much feedback, thanks all! I hope the next Bullhorn will mention this thread and we get even more

bvitnik · January 22, 2025, 11:51pm

I can agree on this. For example, imperative stuff like power management is already split into separate module vmware_guest_powerstate. The explanation I was given by Abhijeet Kasurde on why power management was also kept in vmware_guest is to not break compatibility for people who use this functionality in vmware_guest.

Here are some scenarios that are problematic. It mostly boils down to being hard to maintain idempotency if vmware_guest is to be replaced by some role.

Scenario 1 - VM provisioning (e.g. from template) and disk management is split into separate modules:

The problematic thing here is that VM comes with at least one disk (possibly multiple) in a template itself. Imagine your module for VM provisioning does not have any disk management at all. Once you provision a VM, you are expected to call another module to add additional disks and/or to manage template provided disk. Imagine you have multiple data stores and you keep the template on one but want some template provisioned disks to be on another data store and additional disks on yet another data store. If you provision a VM in this scenario, template provided disks will have to end up either on the same data store as template (if you have 0 disk management functionality in a module) or they will all end up on some other data store but still the same for all template provided disks. Once disk management module kicks in, it will have to move some template provided disks to desired data stores and, of course, add additional disks. So there could be some unnecessary disk migration involved. I’m not so familiar with vCenter API but it’s reasonable to assume that you can specify up front where each template provided disk will end up during VM provisioning without the need to move the disks afterwords. In other words, to avoid disk migration, you have to have some disk management functionality inside VM provisioning module itself and, if that is the case, why not let the module do all the disk management required?

Scenario 2 - VM provisioning (e.g. from template) and OS customization is split into separate modules:

You want to provision a VM and you rely on OS customization for network configuration (and also reconfiguration). If you provision a VM from scratch then steps are:

Provision VM from template (module 1)
Apply OS customization (module 2)
Boot the VM (module 3)

OS customization will be done on first boot.

Now imagine you want to recustomize your already provisioned VM by just changing default GW of the primary network interface and running the same role/play. Now the steps look like this:

Is OS recustomization requested? (some change happened?)
Apply OS customization (module 2)
Is VM powered on or off?
If powered off, should I power on the VM to really apply OS customization and shut it down after? (module 3)
If powered on, should I reboot the VM to really apply OS customization? (module 3)
Wait for reconfiguration to happen
Validate reconfiguration

If you want to idempotently support both cases (VM from scratch + reconfiguration) and always only describe the final desired state of the VM instead of doing imperative actions on a VM, then it becomes so much harder to do in Ansible tasks compared to Python code inside a module. Module can in general track VM state more efficiently compared to Ansible tasks.

What kills these scenarios even more are API quirks that possibly don’t present themselves when using separate modules but require workarounds when multiple modules are called one after the other.

Offtopic:

What happened with the original VMware WG mentioned here:

especially the leads and VMware and RedHat employees? Can they provide any insight into this problem?

mariolenz · January 23, 2025, 5:27pm

Good point. A module can be idempotent, but if there’s a role calling half a dozen of modules and one of them fails you have an undefined state. Even if every module this role uses is idempotent, the role itself isn’t.

Balancing usability and maintainability looks like a hard problem

Well, if it would be easy @mikemorency wouldn’t have started this thread.

bvitnik · January 23, 2025, 6:02pm

Yes. My point exactly. Idempotency is very hard to achieve outside the module. If you also want to support check mode, it becomes even worse.

mikemorency · January 27, 2025, 2:20pm

I am not sure I agree idempotency is hard to achieve in Ansible roles/playbooks.
For example…

deploy vm. module is changed if the dest vm doesnt already exist
run customization when deploy is changed or when a force flag is set
ensure vm is powered on

I was looking at some content library “deployment” modules:
https://docs.ansible.com/ansible/latest/collections/community/vmware/vmware_content_deploy_ovf_template_module.html
https://docs.ansible.com/ansible/latest/collections/community/vmware/vmware_content_deploy_template_module.html#ansible-collections-community-vmware-vmware-content-deploy-template-module

They allow you to specify the datastore that the VM is deployed to, which I think is a good idea and would satisfy some of the concerns raise by Bojan.

mikemorency · February 25, 2025, 4:52pm

I wanted to mention that the latest version of vmware.vmware includes modules that deploy vms from different sources (ovf, templates).

(docs site hasnt been updated yet, so github links)

github.com/ansible-collections/vmware.vmware

plugins/modules/deploy_content_library_ovf.py

main

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Copyright: (c) 2019, Ansible Project
# Copyright: (c) 2019, Pavan Bidkar <pbidkar@vmware.com>
# GNU General Public License v3.0+ (see LICENSES/GPL-3.0-or-later.txt or https://www.gnu.org/licenses/gpl-3.0.txt)
# SPDX-License-Identifier: GPL-3.0-or-later

from __future__ import absolute_import, division, print_function
__metaclass__ = type


DOCUMENTATION = r'''
---
module: deploy_content_library_ovf
short_description: Deploy a virtual machine from an OVF in a content library.
description:
    - Create a VM based on an OVF in a content library.
    - The module basis idempotentency on if the deployed VM exists or not, not the storage or deployment spec applied at deployment time.
author:

This file has been truncated. show original

github.com/ansible-collections/vmware.vmware

plugins/modules/deploy_content_library_template.py

main

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Copyright: (c) 2019, Ansible Project
# Copyright: (c) 2019, Pavan Bidkar <pbidkar@vmware.com>
# GNU General Public License v3.0+ (see LICENSES/GPL-3.0-or-later.txt or https://www.gnu.org/licenses/gpl-3.0.txt)
# SPDX-License-Identifier: GPL-3.0-or-later

from __future__ import absolute_import, division, print_function
__metaclass__ = type


DOCUMENTATION = r'''
---
module: deploy_content_library_template
short_description: Deploy a virtual machine from a template in a content library.
description:
    - Create a VM based on a template in a content library.
    - The module basis idempotentency on if the deployed VM exists or not, not the storage or power settings applied at deployment time.
author:

This file has been truncated. show original

github.com/ansible-collections/vmware.vmware

plugins/modules/deploy_folder_template.py

main

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Copyright: (c) 2023, Ansible Cloud Team (@ansible-collections)
# GNU General Public License v3.0+ (see LICENSES/GPL-3.0-or-later.txt or https://www.gnu.org/licenses/gpl-3.0.txt)
# SPDX-License-Identifier: GPL-3.0-or-later

from __future__ import absolute_import, division, print_function
__metaclass__ = type

DOCUMENTATION = r'''
---
module: deploy_folder_template
short_description: Deploy a VM from a template located in a folder
description:
    - Create a VM from a template that is located in vCenter. The template must be found in a folder, as opposed to being in a content library.
    - This module manages the deployed VM, not the template. The template will be unchanged.
    - To further configure the VM, ensure it is deployed in the powered off state and then use other modules to configure it.
    - The module basis idempotentency on if the deployed VM exists or not, not the storage or power settings applied at deployment time.

This file has been truncated. show original

These modules all allow you to specify storage backing (which doesn’t affect impotence) and placement (which does affect idempotent in some cases, like if you specify a folder).
So a user can

call deploy_content_library_ovf to ensure a VM exists and if it doesnt, deploy it from the specified OVF
run other modules to modify networks, devices, customize if the deploy changed etc
ensure the vm is powered on

I think that follows the approach we explored in this thread. I wanted to mention it so people could better visualize a future state for vmware_guest and provide feedback. I may try creating an example role (or a role like set of tasks) in the upcoming weeks too

mariolenz · February 25, 2025, 5:32pm

I think that’s because the latest vmware.vmware release isn’t part of the Ansible Community Package yet. They should show up after we’ve released a new ACP version, which will hopefully be today.

BTW the documentation is also available directly on galaxy:

Might be easier to read than the source code

Topic		Replies	Views
vmware/esx/vsphere module - planning Ansible Project vmware	35	10	February 25, 2015
vsphere_guest Ansible Developer galaxy-ng , vmware	9	1	December 21, 2014
Ansible VmWare module for Users, Group, Roles, Privileges Ansible Project vmware	0	6	January 10, 2018
vsphere_guest feature Ansible Project	2	2	May 9, 2014
Unable to use vmware_guest Module AWX Project awx , ubuntu , vmware	12	8	November 30, 2017

Refactoring community.vmware.vmware_guest in vmware.vmware

Related topics