Do not fear, friends: automating Windows Server is not that hard!! Well, it is not that hard once you get past the hard part. We know, we know, that is a typical Windows administrator response, but it is true. Let's just say Windows Server has made many an administrator cry late at night trying to automate super simple tasks across dozens or thousands of hosts. But all the pain and frustration of automating Windows Server will subside once you understand and implement a few foundational things that will set you and your organization up for long-term success. Let's get into it.
What In The WinRM?
WinRM is short for Windows Remote Management. Cool! The expanded acronym explains Microsoft's intent for the tool, which is to manage remote Windows hosts. WinRM debuted around the Windows XP and Server 2003 days as Microsoft's implementation of Web Services for Management (WS-Management). In short, WS-Management is a specification for exchanging administrative information over SOAP (Simple Object Access Protocol). What all of this means is that WinRM was developed so IT professionals can manage hosts from a remote workstation or server. This sounds awesome, and it is! However, there are some basics that you need to be aware of so that you can securely scale Ansible within your organization. If you are interested in the fun low-level tech, check out this quick read on the architecture of WinRM.
Below is a standard-looking WinRM config. Let's review it and go through what you want to change.
PS C:\Users\****\****> winrm get winrm/config
Config
    MaxEnvelopeSizekb = 500
    MaxTimeoutms = 60000
    MaxBatchItems = 32000
    MaxProviderRequests = 4294967295
    Client
        NetworkDelayms = 5000
        URLPrefix = wsman
        AllowUnencrypted = true [Source="GPO"] --> Set this to false
        Auth
            Basic = true --> Change this to false
            Digest = true
            Kerberos = true
            Negotiate = true
            Certificate = true
            CredSSP = false
        DefaultPorts
            HTTP = 5985 --> Do your best to disable this and use 5986 exclusively
            HTTPS = 5986
        TrustedHosts = * [Source="GPO"] --> Change this to a small set of Ansible hosts. This can be Ansible Automation Platform (AAP) running on OpenShift like we usually implement for clients, or Ansible Tower, or Azure Kubernetes Service (AKS) CIDR blocks. Basically, this IP list should be your most protected command and control systems that are allowed to connect to your Windows hosts on WinRM.
    Service
        RootSDDL = O:NSG:BAD:P(A;;GA;;;BA)(A;;GR;;;IU)S:P(AU;FA;GA;;;WD)(AU;SA;GXGW;;;WD)
        MaxConcurrentOperations = 4294967295
        MaxConcurrentOperationsPerUser = 1500
        EnumerationTimeoutms = 240000
        MaxConnections = 300 --> Do your due diligence to determine whether you can lower this value or need to increase it. Usually you can lower it.
        MaxPacketRetrievalTimeSeconds = 120
        AllowUnencrypted = true [Source="GPO"] --> Change to false
        Auth
            Basic = false [Source="GPO"]
            Kerberos = true
            Negotiate = true
            Certificate = false --> Think about enabling this. We use a combination of certs with NTLM and encrypted port 5986. Security is about layers, not individual settings and tools.
            CredSSP = false --> Think about enabling this if you need credential delegation (the double-hop problem), but weigh the risk first, since it hands the user's credentials to the remote host.
            CbtHardeningLevel = Relaxed
        DefaultPorts
            HTTP = 5985 --> Disable if possible
            HTTPS = 5986
        IPv4Filter = * [Source="GPO"]
        IPv6Filter = * [Source="GPO"]
        EnableCompatibilityHttpListener = false
        EnableCompatibilityHttpsListener = false
        CertificateThumbprint
        AllowRemoteAccess = true [Source="GPO"]
    Winrs
        AllowRemoteShellAccess = true
        IdleTimeout = 7200000
        MaxConcurrentUsers = 100 [Source="GPO"] --> You can probably lower this, but do your research. Most systems need fewer than a handful of concurrent users, and these are usually core infrastructure nodes like ADDS/DNS/DHCP, etc.
        MaxShellRunTime = 2147483647
        MaxProcessesPerShell = 150 [Source="GPO"]
        MaxMemoryPerShellMB = 2147483647
        MaxShellsPerUser = 50 [Source="GPO"] --> Do some research on your organization, but realistically most systems do NOT need 50 shells per user.
There is a lot to process in the above example. So we will summarize here:
Disable 5985. Just stop using 5985. Just because it is easy does NOT make it right.
Only use 5986. Just do this. It is the encrypted port. Yes it is harder to get working, but that is why we are here. :)
Disable Basic Auth. There are many tutorials online about getting started with Ansible and the vast majority use Basic Auth as an example. Just do NOT use it. Leaving this enabled and using 5985 is like permanently turning on an evil version of the Bat Signal. Hackers will see it. Your system will be compromised.
Use Kerberos where possible. Let's face it, no enterprise has a uniform implementation of Active Directory with a singular and perfect forest and a singular domain. There are always caveats: there are usually multiple forests and multiple domains, and there are usually black-box areas where systems may be joined to a specialized domain or not connected to a domain at all. For those hosts, fall back to NTLM or certificates over 5986.
Use GPOs to set your default config. This is generally a good recommendation for hosts that are domain joined.
TrustedHosts update. Get a list of only the hosts that your organization wants to connect to all other Windows systems and add this to your config. Keep in mind that TrustedHosts is a client-side setting that controls which remote machines the local WinRM client will trust, so pair it with firewall rules that allow inbound 5986 only from that same list if you want all other WinRM traffic blocked.
AllowUnencrypted. Set this to false. This should ONLY be set to true if you are a developer, and you are working in a feature branch, and you are in debug mode.
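If you want to check a host against the summary above without eyeballing the output, something like the following Python sketch could help. It assumes you have saved the text of winrm get winrm/config with its original indentation intact, and it only looks at the handful of settings called out above. The function names, the RECOMMENDED table, and the sample text are ours for illustration; they are not part of WinRM or Ansible.

```python
# Settings from the summary above, keyed by (section path, setting name).
# The expected values mirror this post's recommendations; adjust to taste.
RECOMMENDED = {
    ("Client", "AllowUnencrypted"): "false",
    ("Client/Auth", "Basic"): "false",
    ("Service", "AllowUnencrypted"): "false",
    ("Service/Auth", "Basic"): "false",
}


def parse_winrm_config(text):
    """Return {(section_path, name): value} from indented winrm config output."""
    settings = {}
    stack = []  # (indent, header) pairs for the sections enclosing the current line
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        line = raw.strip()
        # Leave any section that is not a parent of this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if "=" in line:
            name, _, value = line.partition("=")
            # Drop trailing annotations such as [Source="GPO"].
            value = value.split("[", 1)[0].strip()
            path = "/".join(h for _, h in stack if h != "Config")
            settings[(path, name.strip())] = value
        else:
            stack.append((indent, line))
    return settings


def audit(settings):
    """Return human-readable findings for settings that differ from RECOMMENDED."""
    findings = []
    for key, wanted in RECOMMENDED.items():
        actual = settings.get(key)
        if actual is not None and actual.lower() != wanted:
            findings.append(f"{key[0]}/{key[1]} is {actual}, recommended {wanted}")
    if settings.get(("Client", "TrustedHosts")) == "*":
        findings.append("Client/TrustedHosts is *, recommended an explicit host list")
    return findings


# A shortened, made-up sample in the same shape as the config shown earlier.
SAMPLE = """\
Config
    MaxTimeoutms = 60000
    Client
        AllowUnencrypted = true [Source="GPO"]
        Auth
            Basic = true
            Kerberos = true
        TrustedHosts = * [Source="GPO"]
    Service
        AllowUnencrypted = true [Source="GPO"]
        Auth
            Basic = false [Source="GPO"]
"""

if __name__ == "__main__":
    for finding in audit(parse_winrm_config(SAMPLE)):
        print(finding)
```

The parser leans on indentation to work out which section a setting belongs to, so it will not work on output that has been flattened by copy and paste. Port and listener checks are left out on purpose; those are easier to confirm with the listener command later in this post.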
To start your WinRM journey you will need to deploy a few test machines and get connected via RDP. Yes, we said it! Manually connect with RDP. One of the biggest stumbling blocks for new DevOps recruits we have seen over the years is the false belief that all of automation is automated. Not only is manual effort required, manual work is the starting point for automation. The reason for this is simple. Most of the time when DevOps engineers are asked to automate a system or process, it is a whole new world - no, not that new world with Aladdin and Jasmine! So the first place you should start when entering a whole new world is manual trial and error combined with documentation. You need to manually log into your new hosts via RDP so that you can start manually setting and removing WinRM configurations. Doing this will teach you more about how WinRM actually works than a blog post ever could.
Once you are connected to your new host, use the command below to display and review the WinRM configuration:
winrm get winrm/config
The command above will display a WinRM config that looks very similar to the config we reviewed earlier in this blog. Next you will need to get a cert set up on your host, which you will need to request from your Windows administrative team. Once you receive the cert, you will need to:
Click Start and select Run
Type MMC, which is short for Microsoft Management Console
Click File from the menu options
Select Add or Remove Snap-ins
Select Certificates and click Add
Click Computer Account
Install the cert under Certificates (Local Computer) > Personal > Certificates
Open PowerShell and run --> winrm quickconfig -transport:https
Now you should have a good, simple, and more secure WinRM config rocking on your host. This is more secure than simply using the plain winrm quickconfig command, which gives you all of the defaults that you do NOT want.
Next you will want to make sure your host is running a WinRM listener. Use the command below to check the status.
winrm enumerate winrm/config/listener
Check to see if your certs have been set up correctly:
winrm get http://schemas.microsoft.com/wbem/wsman/1/config
For more information on WinRM you can run the following command:
winrm help config
Lastly, here is a nice simple command you should test manually, and then incorporate into your automation framework either in Packer or as a bootstrap script at build time.
winrm s winrm/config/client '@{TrustedHosts="server1,server2,server3"}'
This was one of the recommendations previously mentioned. The single quotes keep PowerShell from parsing @{...} as a hashtable; drop them if you run the command from cmd.exe.
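When the host list lives in an inventory or a build variable, you may want to generate that command rather than hand-edit it. Here is a minimal Python sketch of the idea. The helper name trusted_hosts_command is hypothetical, and the single quotes in its output assume the command will be pasted into PowerShell.

```python
def trusted_hosts_command(hosts):
    """Build the winrm command that sets the client TrustedHosts list."""
    cleaned = [h.strip() for h in hosts if h.strip()]
    if not cleaned:
        raise ValueError("refusing to build an empty TrustedHosts list")
    if "*" in cleaned:
        # A wildcard would undo the whole point of restricting the list.
        raise ValueError("wildcard TrustedHosts is not allowed here")
    joined = ",".join(cleaned)
    # Doubled braces are literal braces inside an f-string.
    return f"winrm s winrm/config/client '@{{TrustedHosts=\"{joined}\"}}'"


if __name__ == "__main__":
    print(trusted_hosts_command(["server1", "server2", "server3"]))
```

Rejecting the wildcard up front is a design choice on our part: it makes it harder for a bootstrap script to quietly reintroduce the TrustedHosts = * setting this post recommends removing.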
Here is a good starter Ansible secure_vars.yml file that will get you started with some of the recommendations above.
ansible_connection: winrm
ansible_winrm_transport: ntlm
ansible_port: 5986
ansible_winrm_message_encryption: always
ansible_winrm_kerberos_delegation: True
ansible_become_method: runas
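As a rough guard against someone later flipping these values back to the easy defaults, a small check like the Python sketch below could run in CI before a playbook does. It is only a sketch: lint_winrm_vars is a hypothetical helper, it reads flat key: value lines by hand rather than using a real YAML parser, and the rules it enforces are just the recommendations from this post.

```python
def lint_winrm_vars(text):
    """Return warnings for a flat `key: value` Ansible connection vars file."""
    values = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        values[key.strip()] = value.strip().strip("'\"")

    warnings = []
    if values.get("ansible_connection") != "winrm":
        warnings.append("ansible_connection should be winrm")
    if values.get("ansible_port") != "5986":
        warnings.append("ansible_port should be 5986 (HTTPS)")
    if values.get("ansible_winrm_transport", "").lower() == "basic":
        warnings.append("ansible_winrm_transport should not be basic")
    if values.get("ansible_winrm_message_encryption", "").lower() == "never":
        warnings.append("ansible_winrm_message_encryption should not be never")
    return warnings


# The starter vars from this post, used here as the known-good case.
STARTER_VARS = """\
ansible_connection: winrm
ansible_winrm_transport: ntlm
ansible_port: 5986
ansible_winrm_message_encryption: always
ansible_winrm_kerberos_delegation: True
ansible_become_method: runas
"""

if __name__ == "__main__":
    for warning in lint_winrm_vars(STARTER_VARS):
        print(warning)
```

If your vars files use nested YAML or lists of transports, swap the hand-rolled parsing for a proper YAML library; the checks themselves stay the same.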
If you are very interested in learning everything there is to know about WinRM Ansible settings, check out this page provided by Ansible.
We will give you some time to work on WinRM and the Ansible config above.
Excellent! At this point you should know a bit more about WinRM and be generally more comfortable with using and updating it. Congrats! Now you may be asking, "How do we make this work at scale and securely?" Great question, and a perfect transition into the next section. 🤓
Simple, Secure, and Scalable Ansible With Windows
The good and bad dichotomy of information technology is that there are usually many ways to get to the same desired outcome. This post will show one of the many ways you can create a simple, secure, and scalable Windows automation environment with Ansible.
Our approach is to shift as much of the hard stuff left into builds and bake as much as we can into the foundational starting point of all future builds. This has a number of benefits, but the biggest one is time: the harder automation bits usually take the longest to run, so while the initial time commitment to get things working is very high, you will save hundreds or thousands of hours once your system is functional with scheduled builds. Practically speaking, we set up our simple, secure, and scalable Packer platform.
The diagram above is a loose approximation of our Packer platform.
Microsoft Azure DevOps = Orchestration Platform
Microsoft Azure Key Vault = Secrets Management
Packer = Imaging
Ansible = Configuration Management
Azure/Azure-Gov/AWS/VMware = Compute Platforms
Unfortunately, we cannot go into all of the details about how to make this work; however, we will write about a few of the key components.
First, you will need to get set up with an autounattend.xml file. If you have never used one before, you should check it out. It is a simple-to-understand XML file that is injected into the imaging process. One of the cool blocks of code you can modify is:
<settings pass="oobeSystem">
    ...
    ...
    <component xmlns:wcm="http://schemas.microsoft.com/WMIConfig/2002/State"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
        <FirstLogonCommands>
            <SynchronousCommand wcm:action="add">
                <CommandLine>%SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe -File a:\windows-cert-setup.ps1</CommandLine>
                <Description>Set Execution Policy 64-Bit</Description>
                <Order>2</Order>
                <RequiresUserInput>true</RequiresUserInput>
            </SynchronousCommand>
        </FirstLogonCommands>
    </component>
</settings>
So we see that under the oobeSystem pass we can execute any number of PowerShell scripts at image configuration time. We have a few scripts here that configure WinRM to our specifications. Based on the WinRM commands outlined previously, you should be able to create a simple script that meets the demands of your organization. Just remember the points above.
Here is a hint of what your finished Packer directory will look like:
To run this in Packer we first create a variable:
variable "vm_floppy_files_server_dc_dexp" {
  type        = list(string)
  description = "Used for Server with Desktop Experience. The list of files or directories to be added to the virtual floppy device. Used for unattended installation."
  default = [
    "../../../configs/windows/windows-server-2022/autounattend.xml",
    "../../../scripts/windows/",
    "../../../drivers/windows",
  ]
}
Then we call the var in our source like the example below:
source "vsphere-iso" "windows-server-2022-cis" {
  vcenter_server = var.vcenter_server
  username = var.vcenter_username
  password = var.vcenter_password
  datacenter = var.vcenter_datacenter
  cluster = var.vcenter_cluster
  datastore = var.vcenter_datastore
  folder = var.vcenter_folder
  insecure_connection = var.vcenter_insecure_connection
  tools_upgrade_policy = true
  tools_sync_time = true
  remove_cdrom = false
  convert_to_template = false
  guest_os_type = var.vm_guest_os_type
  vm_version = var.vm_version
  notes = "Built by Mentat-Packer on ${local.buildtime}."
  vm_name = local.vm_name
  firmware = var.vm_firmware
  CPUs = var.vm_cpu_sockets
  cpu_cores = var.vm_cpu_cores
  CPU_hot_plug = false
  RAM = var.vm_mem_size
  RAM_hot_plug = false
  boot_wait = var.vm_boot_wait
  boot_command = var.vm_boot_command
  boot_order = "disk,cdrom"
  cdrom_type = var.vm_cdrom_type
  disk_controller_type = var.vm_disk_controller_type
  storage {
    disk_size = var.vm_disk_size
    disk_controller_index = 0
    disk_thin_provisioned = true
  }
  network_adapters {
    network = var.vcenter_network
    network_card = var.vm_network_card
  }
  floppy_files = var.vm_floppy_files_server_dc_dexp
  iso_paths = [
    "${var.iso_datastore}${var.iso_path}/${var.iso_file}.iso",
    "${var.iso_datastore}${var.iso_path}/vmware-tools.iso"
  ]
  iso_checksum = "none"
  ip_wait_timeout = "10m"
  communicator = "winrm"
  winrm_username = var.build_username
  winrm_password = var.build_password
  winrm_port = 5986
  winrm_timeout = "15m"
  winrm_use_ssl = true
  winrm_insecure = true
  shutdown_command = var.vm_shutdown_command
  shutdown_timeout = "5m"
  content_library_destination {
    library = var.vcenter_content_library
    name = "cis-windows-server-2022-${local.buildtime}"
    ovf = false
    destroy = true
  }
}
Now there is a LOT more code that goes into this, but the above Packer source is enough to get you started. You will notice that the WinRM port in the source config is 5986. That means that even the initial connection to the ephemeral Packer instance is made on WinRM 5986! This is AwesomeOps at work.
The End
As always, we hope this blog helps a few people out! If you are interested in learning more about how to create a simple, secure, and scalable Packer platform within your organization, please reach out to us.