How to deploy Azure Stack HCI part 1 (Manual)
Before we start, we need to know a few things about Azure Stack HCI (Hyper-Converged Infrastructure). Basically, it lets us deploy physical hardware on-premises that uses the servers' internal hardware to create a redundant setup when multiple nodes are deployed. When you deploy Azure Stack HCI on hardware, you get a special Hyper-V installation with preconfigured Storage Spaces Direct. On top of Hyper-V you can deploy virtual machines and Kubernetes-based clusters. In addition, you can manage everything from the Azure Portal and use Azure tools such as Azure Backup.
Pricing
Services managed through Azure come with a monthly cost. In the case of Azure Stack HCI there are two costs. Since I am in the Euro zone, pricing is in euros:
Azure Stack HCI node per physical core per month: € 10,-
Azure Stack HCI hosted Virtual Machine Server per physical core per month: € 21,80
Optionally you can bring your own license
Prerequisites
Before deploying Azure Stack HCI we need to gather a few things.
- Global admin
- On-premises Active Directory
- Domain administrator
- Azure Stack HCI supported Hardware
- At least 4 network ports with multiple VLAN capabilities.
- Azure Stack HCI 23H2 (This manual does not work with earlier versions!)
Hardware requirements
In addition to Microsoft Azure Stack HCI updates, many OEMs also release regular updates for your Azure Stack HCI hardware, such as driver and firmware updates. To make sure you receive OEM package update notifications, check with your OEM about their specific notification process.
Before deploying Azure Stack HCI, version 23H2, make sure your hardware is up to date: download the Support Package from your hardware vendor and deploy the latest firmware to your system.
Before you begin, make sure that the physical server and storage hardware used to deploy an Azure Stack HCI cluster meets the following requirements:
Component | Minimum |
---|---|
Number of servers | 1 to 16 servers are supported. Each server must be the same model, manufacturer, have the same network adapters, and have the same number and type of storage drives. |
CPU | A 64-bit Intel Nehalem grade or AMD EPYC or later compatible processor with second-level address translation (SLAT). |
Memory | A minimum of 32-GB RAM per node. |
Host network adapters | At least two network adapters listed in the Windows Server Catalog, or dedicated network adapters per intent, which requires two separate adapters for the storage intent. For redundancy purposes I would recommend at least 4 network connections. |
BIOS | Intel VT or AMD-V must be turned on. |
Boot drive | A minimum size of 200 GB. |
Data drives | At least two disks with a minimum capacity of 500 GB (SSD or HDD). Single servers must use only a single drive type: Nonvolatile Memory Express (NVMe) or Solid-State (SSD) drives. |
Trusted Platform Module (TPM) | TPM version 2.0 hardware must be present and turned on. |
Secure boot | Secure Boot must be present and turned on. |
Pay special attention to the storage. You can use RAID 1 for your OS drives, but all other disks need to be presented as JBOD drives.
STEP 1: On premise domain preparation
Before we can start with the setup of the server, we need to prepare the local Active Directory to allow onboarding the HCI node. Log in to one of your on-prem domain controllers, start a PowerShell window as admin, and run the following command:
Install-Module AsHciADArtifactsPrecreationTool -Repository PSGallery -Force
The AsHciADArtifactsPreCreationTool.ps1 module is used to prepare Active Directory.
- The -AsHciOUName path doesn’t support the following special characters anywhere within the path: &, ", ', <, >.
- Moving the computer objects to a different OU after the deployment is complete is also not supported.
Run the following command on the domain controller.
New-HciAdObjectsPreCreation -AzureStackLCMUserCredential (Get-Credential) -AsHciOUName "OU=2azuredemo,DC=2azure,DC=local"
When running the command, the script will create a deployment user. Give the account a logical name and a safe password. We will need the credentials later during the deployment.
STEP 2: Hardware configuration
For this manual we used an HP DL380 GEN10 server with 128 GB of memory, 10 physical cores, 6x 960 GB SSD and 6 network interfaces.
Before startup, connect the first 1 or 2 network adapters that will be used for management traffic. Then start up the server. After startup we get the default HP start screen.
Since the configuration of physical disks differs per vendor, I left screenshots out of this manual. What we did was create a RAID 1 drive for the OS and 4 JBOD drives for Storage Spaces Direct.
STEP 3: Deploy Azure Stack HCI OS
Go to the Azure Portal. From the Azure Portal search for Azure Stack HCI.
From the left side of the screen go to Azure Stack HCI and click on Download Azure Stack HCI
From the new sidebar select the latest version, in this case 23H2. Select your language and agree with the terms and privacy notice.
Now let's return to your hardware. Some vendors allow mounting the ISO through iLO/iDRAC or another management card. If not, create a bootable USB stick with Rufus or any other tool. Boot your machine from the ISO/USB and start the installation.
Accept the user installation and click next.
On this screen we are going to install the server using the CUSTOM installation option.
Select your OS Disk, in our case the 200GB RAID 1 drive.
Now wait patiently for the installation to complete…
When ready reboot the server
After the reboot you will be prompted to change the default password of the Administrator account.
Set a new password
Press enter to finish.
After logging in you will see the SConfig menu.
Now is the time to install the latest drivers. Most vendors provide an ISO with the latest drivers. Ask your vendor for specific instructions on how to update your drivers.
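Once you are at the PowerShell prompt (option 15 in SConfig), a few of the hardware requirements from the table earlier can be spot-checked on the node itself. This is a minimal sketch using standard Windows cmdlets, not the validation that the Azure deployment runs later; the thresholds in the comments are the ones from the requirements table.

```powershell
# Spot-check a few Azure Stack HCI hardware requirements on the local node.
# Run from an elevated PowerShell session.

# TPM 2.0 must be present and enabled.
Get-Tpm | Select-Object TpmPresent, TpmEnabled, TpmReady

# Secure Boot must be on; returns True when enabled (errors on non-UEFI systems).
Confirm-SecureBootUEFI

# Installed memory in GB (minimum 32 GB per node).
$memoryGB = (Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory / 1GB
"Memory: {0:N0} GB" -f $memoryGB

# Drives and their media type; data drives should appear as individual (JBOD) disks.
Get-PhysicalDisk |
    Select-Object FriendlyName, MediaType, BusType,
        @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB) } }

# Physical network adapters and their link state.
Get-NetAdapter -Physical | Select-Object Name, InterfaceDescription, Status, LinkSpeed
```

If the data drives show up as a single RAID volume instead of separate disks, revisit the controller configuration from step 2 before continuing.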
STEP 4: Configure networking
After the latest drivers and firmware have been installed, we are going to configure networking. Type 8 and press Enter.
Since we have a DHCP server, we just need to configure DNS. Type the index number of the adapter, in our case 4.
In the following menu change the network adapter address and set the DNS servers
We only changed DNS, but the screen looks similar for the network configuration.
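The same DNS change can also be made from PowerShell instead of the SConfig menu. The sketch below assumes interface index 4 as in our example; the DNS server addresses are hypothetical placeholders, so replace both with your own values.

```powershell
# List adapters to find the interface index of the management adapter.
Get-NetAdapter | Select-Object Name, ifIndex, Status

# Hypothetical values: replace the index and DNS server addresses with your own.
$ifIndex    = 4
$dnsServers = '192.168.1.10', '192.168.1.11'

# Point the adapter at the on-prem DNS servers (usually your domain controllers).
Set-DnsClientServerAddress -InterfaceIndex $ifIndex -ServerAddresses $dnsServers

# Confirm the result.
Get-DnsClientServerAddress -InterfaceIndex $ifIndex -AddressFamily IPv4
```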
From the main menu, change the name of the server. This will be visible in the Azure portal and can’t be changed after finalizing the deployment.
Now reboot the server.
Restart-Computer
After the reboot, log in again and select option 15 from the menu. Type the following command on the server to install the Hyper-V features.
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All
Next is to enable remote management, run the following command:
winrm quickconfig
Now we need to enable the firewall rule to accept incoming ping requests on IPv4.
netsh advfirewall firewall add rule name="ICMP Allow incoming v4 echo request" protocol=icmpv4:8,any dir=in action=allow
Now restart the server again.
Restart-Computer
After the reboot, log in again and run the following command to install the Arc registration script from PSGallery:
Install-Module AzsHCI.ARCinstaller
Now install the required PowerShell modules on your node for HCI registration:
Install-Module Az.Accounts -Force
Install-Module Az.ConnectedMachine -Force
Install-Module Az.Resources -Force
Now go back to the Azure portal. Register your subscription with the required resource providers (RPs). You need to be an owner or contributor on your subscription to register the following RPs:
- Microsoft.HybridCompute
- Microsoft.GuestConfiguration
- Microsoft.HybridConnectivity
- Microsoft.AzureStackHCI
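If you prefer PowerShell over the portal, the four providers can be registered in a loop. This is a sketch that assumes the Az.Resources module is installed and that you have already signed in with Connect-AzAccount against the target subscription, for example from your management workstation.

```powershell
# Assumes Az.Resources is installed and Connect-AzAccount has already been run
# against the subscription that will hold the HCI resources.
$providers = @(
    'Microsoft.HybridCompute'
    'Microsoft.GuestConfiguration'
    'Microsoft.HybridConnectivity'
    'Microsoft.AzureStackHCI'
)

foreach ($provider in $providers) {
    Register-AzResourceProvider -ProviderNamespace $provider
}

# Registration is asynchronous; check the state afterwards.
$providers | ForEach-Object {
    Get-AzResourceProvider -ProviderNamespace $_ |
        Select-Object ProviderNamespace, RegistrationState -Unique
}
```

Registration can take a few minutes; rerun the last part until every provider reports Registered.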
Now return to your physical machine and run the following commands, changing the following values:
- Subscription ID
- Resource Group
- Region
- Tenant ID
#Define the subscription where you want to register your server as Arc device
$Subscription = "YourSubscriptionID"
#Define the resource group where you want to register your server as Arc device
$RG = "YourResourceGroupName"
#Define the region you will use to register your server as Arc device
$Region = "westeurope"
#Define the tenant you will use to register your server as Arc device
$Tenant = "YourTenantID"
After setting these variables we can connect to the Entra tenant. We need a device code because there is no browser installed on the server.
#Connect to your Entra ID account and Azure Subscription
Connect-AzAccount -SubscriptionId $Subscription -TenantId $Tenant -DeviceCode
On the device where you are logged on to the Azure portal, go to https://microsoft.com/devicelogin and type in the code from the previous step.
When authenticated you can continue on the server itself.
Now run the following commands on the HCI server.
#Get the Access Token for the registration
$ARMtoken = (Get-AzAccessToken).Token
#Get the Account ID for the registration
$id = (Get-AzContext).Account.Id
Now run the registration script on the server. This will register the server in Azure Arc.
Invoke-AzStackHciArcInitialization -SubscriptionID $Subscription -ResourceGroup $RG -TenantID $Tenant -Region $Region -Cloud "AzureCloud" -ArmAccessToken $ARMtoken -AccountID $id
Now go back to the Azure Portal and go to Azure Arc. Verify under Machines that your server is present. Click on your machine for the next steps.
On the new page go to Extensions. Important: wait until all extensions have changed to Succeeded. This can take up to 30 minutes to complete. Be patient 😉
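The extension status can also be polled from the node instead of refreshing the portal. This sketch assumes you are still in the PowerShell session used for the registration (so $RG is set and you are signed in), and that the Arc machine resource carries the server's computer name; both are assumptions to verify in your environment.

```powershell
# Assumes the session from the registration step ($RG is still set) and that the
# Arc machine resource is named after the server's computer name.
Get-AzConnectedMachineExtension -ResourceGroupName $RG -MachineName $env:COMPUTERNAME |
    Select-Object Name, ProvisioningState
```

Continue only when every extension in the list reports Succeeded.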
When ready, continue with the next step.
STEP 5: Configure Azure Stack HCI Cluster
First we will need to assign the required permissions to the Resource Group we are going to use for the HCI Cluster.
Go through the tabs and assign the following permissions to the user who deploys the cluster:
- Key Vault Data Access Administrator: This permission is required to manage data plane permissions to the key vault used for deployment.
- Key Vault Secrets Officer: This permission is required to read and write secrets in the key vault used for deployment.
- Key Vault Contributor: This permission is required to create the key vault used for deployment.
- Storage Account Contributor: This permission is required to create the storage account used for deployment.
Search for the permissions and assign them to your deploying user.
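As an alternative to clicking through the portal, the four roles can be assigned at resource group scope with Az PowerShell. This is a sketch: the sign-in name and resource group below are hypothetical placeholders, and the account running it needs permission to create role assignments (for example Owner or User Access Administrator).

```powershell
# Hypothetical values: replace with the deploying user and your resource group.
$deployUser    = 'deployer@contoso.com'
$resourceGroup = 'YourResourceGroupName'

# The built-in roles listed above.
$roles = @(
    'Key Vault Data Access Administrator'
    'Key Vault Secrets Officer'
    'Key Vault Contributor'
    'Storage Account Contributor'
)

foreach ($role in $roles) {
    New-AzRoleAssignment -SignInName $deployUser `
                         -RoleDefinitionName $role `
                         -ResourceGroupName $resourceGroup
}
```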
Now go back to Azure Arc, click on Azure Stack HCI, and click Deploy cluster on the right side.
Fill in the required fields, and make sure to create a new key vault. Its name needs to be unique. Make sure to select your server and click on Validate. If you run into issues during the deployment, you can find a few common errors at the end of this manual.
Wait until the validation succeeds.
For the first deployment select new configuration
On the next screen you can choose which option you want to use for the network cards. If you’re going to deploy a cluster with multiple nodes, it is recommended to use separate network cards for the storage.
Also fill in the network settings for your management network at the bottom of the page.
Now we are going to join the server to the local domain and create a new Azure custom location for the on-premises site.
- Custom location name: Your on-prem location name
- Domain: use your on-prem domain
- OU: fill in the full OU name
- Deployment account: This is the account that was created in step 1
- Local administrator: This is the local admin account of your HCI physical node.
For security, use the recommended security settings; use a custom configuration only when applicable.
On the next page choose which options fit your needs best. For this tu
Now the validation of the environment will take place. Please be patient. During our test we encountered several issues; we have noted them and you can find them at the bottom of the page.
When the resource creation validation in Azure is completed, start the hardware validation.
Please be patient, this can take up to one hour.
After a long hour of waiting the process should be finished.
In the last step create the cluster!
The cluster will now be created and the physical node will be joined. Again, this might take more than an hour!
The job overview should look like this. If you think the process is hanging (a task taking longer than 1 hour), reboot the server, check the scheduled tasks (Get-ScheduledTask) for a task whose name matches the step that is hanging in the portal, and run it. After that the process will continue.
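Finding and restarting such a task from PowerShell could look like the sketch below. The task name and path in the commented-out line are hypothetical placeholders; use whatever name in the list matches the step that is stuck in the portal.

```powershell
# List scheduled tasks with their state to spot the one matching the stalled step.
Get-ScheduledTask |
    Where-Object State -ne 'Disabled' |
    Sort-Object TaskPath, TaskName |
    Format-Table TaskPath, TaskName, State -AutoSize

# Hypothetical task name and path: substitute what you found above, then uncomment.
# Start-ScheduledTask -TaskPath '\' -TaskName 'NameOfTheHangingTask'
```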
When completed all tasks should say success!
When the cluster is created, make sure to install the latest updates!
Make sure to update the cluster nodes.
Under the hood in Active Directory we will see a normal computer object and a failover cluster object.
You have now successfully installed Azure Stack HCI. In the next manual we will continue with the configuration of the network, download the images and deploy a Virtual Machine.
Troubleshooting:
Validation error: Validate selected servers
Wait for the LCMController extension to complete.
Validation Error Azure Stack HCI cluster deployment
Type 'ValidateConnectivity' of Role 'EnvironmentValidator' raised an exception: Unable to create a valid session to Connecting to remote server failed with the following error message : Access is denied
If there are domain connectivity validation issues, make sure that the domain controller is reachable.
Validation error
Type 'ValidateConnectivity' of Role 'EnvironmentValidator' raised an exception: Unable to create a valid session to x.x.x.x: [x.x.x.x] Connecting to remote server x.x.x.x failed with the following error message : Access is denied
The username or password for the deployment user is wrong.
Error hardware requirements not met.
Verify all your hardware and make sure latest firmware and drivers are installed.
Exception
Type 'ValidateHardware' of Role 'EnvironmentValidator' raised an exception: Hardware requirements not met. Review output and remediate:
Rule:
HealthCheckSource : Deployment\Hardware\da0dd4ca
Name : AzStackHci_Hardware_Test_NetAdapter
DisplayName : Test NetAdapter API 2AZURE-PILOT-N1
Tags : {}
Title : Test NetAdapter API
Status : FAILURE
Severity : CRITICAL
Description : Checking NetAdapter has CIM data
Remediation : https://learn.microsoft.com/en-us/azure-stack/hci/deploy/deployment-tool-prerequisites
TargetResourceID : Machine: 2AZURE-PILOT-N1, Class: NetAdapter
TargetResourceName : Machine: 2AZURE-PILOT-N1, Class: NetAdapter
TargetResourceType : NetAdapter
Timestamp : 1-5-2024 12:17:49
AdditionalData:
Key : Detail Value : Unable to retrieve data for NetAdapter on 2AZURE-PILOT-N1
Key : Status Value : FAILURE
Key : TimeStamp Value : 05/01/2024 12:17:49
Key : Resource Value : Null
Key : Source Value : 2AZURE-PILOT-N1
Rule:
HealthCheckSource : Deployment\Hardware\da0dd4ca
Name : AzStackHci_Hardware_Test_SingleNode_AllFlash
DisplayName : Test Single Node All Flash
Tags : {}
Title : Test Single Node All Flash
Status : FAILURE
Severity : CRITICAL
Description : Checking single node is all flash
Remediation : https://learn.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-hardware-requirements#minimum-number-of-drives-excludes-boot-drive
TargetResourceID :
TargetResourceName :
TargetResourceType :
Timestamp : 1-5-2024 12:17:50
AdditionalData:
Key : Detail Value : Hostname '' drive types 'HDD: False, SSD:False, NVMe:False, SCM:False'. Expected all flash.
Key : Status Value : FAILURE
Key : TimeStamp Value : 05/01/2024 12:17:50
Key : Resource Value : HDD: False, SSD:False, NVMe:False, SCM:False
Key : Source Value : Drive Type
Rule:
HealthCheckSource : Deployment\Hardware\da0dd4ca
Name :
AdditionalData:
Key : Detail Value : Property ‘TpmEnabled’ value ‘False’ but expected ‘True’
Verify that the TPM is installed and enabled!
Additional documentation/information:
Microsoft Documentation: Azure Stack HCI documentation | Microsoft Learn
Change Firewall Settings Server Core: WIN 2019 Core activate SMB-in rule via PowerShell – IT koehler blog (it-koehler.com)
How to install Proliant Support Pack on HPE Server: Install HP SPP on Windows core (lbdg.me)
@Credits Jos Muller for the partnership during this deployment