How to deploy Azure Stack HCI part 1 (Manual)

Before we start, we need to know a few things about Azure Stack HCI (hyperconverged infrastructure). In short, it lets us deploy physical servers on-premises that use their internal hardware to create a redundant setup when multiple nodes are deployed. When you deploy Azure Stack HCI on hardware, you get a special Hyper-V installation with preconfigured Storage Spaces Direct. On top of Hyper-V you can deploy virtual machines and Kubernetes-based clusters. In addition, you can manage everything from the Azure portal and use Azure tools such as Azure Backup.

image 70

Pricing

Deploying services in Azure means that management costs money. For Azure Stack HCI there are two costs. Since I am in the eurozone, the prices are in euros:

Azure Stack HCI node per physical core per month: € 10,-
Azure Stack HCI hosted Virtual Machine Server per physical core per month: € 21,80

Optionally you can bring your own license
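To make the per-core pricing concrete, here is a minimal sketch that estimates the monthly bill from the two prices listed above. The function name Get-HciMonthlyCost and its parameters are made up for illustration, and the default prices are simply the ones quoted in this post, so check the current Azure price list before relying on the outcome.

```powershell
# Hypothetical helper (not part of any Azure module): estimates the monthly
# Azure Stack HCI bill from the per-core prices quoted in this post.
function Get-HciMonthlyCost {
    param(
        [Parameter(Mandatory)] [int] $PhysicalCores,
        # Set this switch to add the second per-core price (hosted VM licensing).
        [switch] $IncludeGuestLicense,
        [decimal] $HostFeePerCore = 10.00d,    # EUR per physical core per month
        [decimal] $GuestFeePerCore = 21.80d    # EUR per physical core per month
    )

    $hostFee  = $PhysicalCores * $HostFeePerCore
    $guestFee = if ($IncludeGuestLicense) { $PhysicalCores * $GuestFeePerCore } else { [decimal]0 }

    [pscustomobject]@{
        PhysicalCores = $PhysicalCores
        HostFee       = $hostFee
        GuestFee      = $guestFee
        Total         = $hostFee + $guestFee
    }
}

# Example: a server with 10 physical cores, both fees included.
Get-HciMonthlyCost -PhysicalCores 10 -IncludeGuestLicense
```

For a 10-core server this comes to € 100 for the node plus € 218 for the hosted VM licensing, so € 318 per month. If you bring your own license, leave the switch off and only the node fee remains.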

Prerequisites

Before deploying Azure Stack HCI we need to gather a few things.

  • Global admin
  • On-premises Active Directory
  • Domain administrator
  • Azure Stack HCI supported Hardware
  • At least 4 network ports with multiple VLAN capabilities.
  • Azure Stack HCI 23H2 (This manual does not work with earlier versions!)

Hardware requirements

In addition to Microsoft Azure Stack HCI updates, many OEMs also release regular updates for your Azure Stack HCI hardware, such as driver and firmware updates. To make sure that OEM package update notifications reach you, check with your OEM about their specific notification process.

Before deploying Azure Stack HCI, version 23H2, download the Support Package from your hardware vendor and deploy the latest firmware to your system.

Before you begin, make sure that the physical server and storage hardware used to deploy an Azure Stack HCI cluster meet the following requirements:

Component | Minimum
Number of servers | 1 to 16 servers are supported. Each server must be the same model and manufacturer, have the same network adapters, and have the same number and type of storage drives.
CPU | A 64-bit Intel Nehalem grade or AMD EPYC or later compatible processor with second-level address translation (SLAT).
Memory | A minimum of 32 GB RAM per node.
Host network adapters | At least two network adapters listed in the Windows Server Catalog, or dedicated network adapters per intent, which requires two separate adapters for the storage intent. For redundancy I would recommend at least 4 network connections.
BIOS | Intel VT or AMD-V must be turned on.
Boot drive | A minimum size of 200 GB.
Data drives | At least two disks with a minimum capacity of 500 GB (SSD or HDD). Single servers must use only a single drive type: Nonvolatile Memory Express (NVMe) or solid-state (SSD) drives.
Trusted Platform Module (TPM) | TPM version 2.0 hardware must be present and turned on.
Secure Boot | Secure Boot must be present and turned on.

Pay special attention to the storage. You can use RAID 1 for your OS drives, but all other disks need to be presented as JBOD drives.
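A few of the minimums above are easy to check before you start. The sketch below is a hypothetical helper (Test-HciMinimumHardware is not a Microsoft cmdlet) that compares values you have collected yourself against the numbers from the requirements list. It covers only memory, boot drive, data drives, TPM and Secure Boot, and it is no replacement for the validation that runs during deployment.

```powershell
# Hypothetical helper: compares a few inventory values against the minimums
# listed above and returns a list of problems (empty when everything passes).
function Test-HciMinimumHardware {
    param(
        [Parameter(Mandatory)] [double] $MemoryGB,
        [Parameter(Mandatory)] [double] $BootDriveGB,
        [Parameter(Mandatory)] [AllowEmptyCollection()] [double[]] $DataDriveGB,  # one entry per data drive
        [Parameter(Mandatory)] [bool] $TpmEnabled,
        [Parameter(Mandatory)] [bool] $SecureBootEnabled
    )

    $problems = @()
    if ($MemoryGB -lt 32)         { $problems += 'Less than 32 GB RAM' }
    if ($BootDriveGB -lt 200)     { $problems += 'Boot drive smaller than 200 GB' }
    if ($DataDriveGB.Count -lt 2) { $problems += 'Fewer than two data drives' }

    # Collect every data drive below the 500 GB minimum.
    $tooSmall = @($DataDriveGB | Where-Object { $_ -lt 500 })
    if ($tooSmall.Count -gt 0)    { $problems += 'A data drive is smaller than 500 GB' }

    if (-not $TpmEnabled)         { $problems += 'TPM is not enabled' }
    if (-not $SecureBootEnabled)  { $problems += 'Secure Boot is not enabled' }

    # The leading comma keeps the result an array, even when it is empty.
    return ,$problems
}

# Example values: 128 GB RAM, 200 GB boot drive, four 960 GB data drives.
Test-HciMinimumHardware -MemoryGB 128 -BootDriveGB 200 -DataDriveGB 960, 960, 960, 960 -TpmEnabled $true -SecureBootEnabled $true
```

On a Windows machine the input values can be gathered with Get-CimInstance Win32_ComputerSystem (memory), Get-PhysicalDisk (drive sizes), Get-Tpm and Confirm-SecureBootUEFI. The TPM and Secure Boot values in the example are assumed to be true.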

STEP 1: On premise domain preparation

Before we can start with the setup of the server, we need to prepare the local Active Directory to allow onboarding the HCI node. Log in to one of your on-premises domain controllers, start a PowerShell window as administrator and run the following command:

Install-Module AsHciADArtifactsPrecreationTool -Repository PSGallery -Force
image 20

The AsHciADArtifactsPreCreationTool.ps1 module is used to prepare Active Directory.

  • The -AsHciOUName path doesn’t support the following special characters anywhere within the path: & " ' < >
  • Moving the computer objects to a different OU after the deployment is complete is also not supported.
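Because an unsupported character in the OU path only surfaces as an error later on, a quick pre-check can save time. The following is a minimal sketch; Test-AsHciOUName is a hypothetical helper and not part of the AsHciADArtifactsPreCreationTool module. It only tests for the five characters listed above, assuming the straight-quote forms of the quotation marks.

```powershell
# Hypothetical pre-check (not part of AsHciADArtifactsPreCreationTool):
# returns $true when the OU path contains none of the characters & " ' < >
function Test-AsHciOUName {
    param([Parameter(Mandatory)] [string] $OUPath)

    # Character class with the five unsupported characters;
    # inside a single-quoted string, '' is an escaped single quote.
    $unsupported = '[&"''<>]'
    return ($OUPath -notmatch $unsupported)
}

Test-AsHciOUName -OUPath 'OU=2azuredemo,DC=2azure,DC=local'   # no unsupported characters
Test-AsHciOUName -OUPath 'OU=R&D,DC=2azure,DC=local'          # contains an ampersand
```

The first call returns True and the second returns False.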

Run the following command on the domain controller.

New-HciAdObjectsPreCreation -AzureStackLCMUserCredential (Get-Credential) -AsHciOUName "OU=2azuredemo,DC=2azure,DC=local"

When you run the command, it prompts for credentials and creates a deployment user with them. Give the account a logical name and a strong password. We will need these credentials later during the deployment.

image 22

STEP 2: Hardware configuration

For this manual we used an HP DL380 GEN10 server with 128 GB of memory, 10 physical cores, 6x 960 GB SSD and 6 network interfaces.

Before startup, connect the first 1 or 2 network adapters that will be used for management traffic. Then start the server. After startup we get the default HP start screen.

image 3

Since the configuration of physical disks differs per vendor, I left the screenshots out of this manual. We created a RAID 1 drive for the OS and 4 JBOD drives for Storage Spaces Direct.

STEP 3: Deploy Azure Stack HCI OS

Go to the Azure Portal. From the Azure Portal search for Azure Stack HCI.

image

From the left side of the screen go to Azure Stack HCI and click on Download Azure Stack HCI

image 1

From the new sidebar select the latest version, in this case 23H2. Select your language and agree with the terms and privacy notice.

image 2

Now let's return to your hardware. Some vendors allow mounting the ISO through iLO/iDRAC or another management card. If not, create a bootable USB stick with Rufus or any other tool. Boot your machine from the ISO/USB and start the installation.

mstsc 5vn93dzpfu

Accept the user installation and click next.

image 5

On this screen we are going to install the server using the CUSTOM installation option.

image 6

Select your OS Disk, in our case the 200GB RAID 1 drive.

Now wait patiently for the installation to complete…

image 8

When ready, reboot the server.

image 9

After the reboot you will be asked to change the default password of the Administrator account.

image 10

Set a new password

image 11

Press enter to finish.

image 12

After logging in you will see the SConfig menu.

Now is the time to install the latest drivers. Most vendors provide an ISO with the latest drivers. Ask your vendor for specific instructions how to update your drivers.

STEP 4: Configure networking

After the latest drivers and firmware have been installed, we are going to configure networking. In SConfig, type 8 and press Enter.

image 13

Since we have a DHCP server, we only need to configure DNS. Type the index number of the adapter, in our case 4.

image 14

In the following menu change the network adapter address and set the DNS servers

image 15

We only changed DNS, but the screen looks similar for the network configuration.

image 17

From the main menu, change the name of the server. This will be visible in the Azure portal and can’t be changed after finalizing the deployment.

image 19

Now reboot the server.

restart-computer
image 26

After the reboot, log in again and select option 15 from the menu to exit to the command line. Then run the following command on the server to install the Hyper-V feature.

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All
image 23

Next, enable remote management by running the following command:

winrm quickconfig
image 24

Now we need to enable the firewall rule to accept incoming ping requests on IPv4.

netsh advfirewall firewall add rule name="ICMP Allow incoming v4 echo request" protocol=icmpv4:8,any dir=in action=allow
image 25

Now restart the server again.

restart-computer
image 26

After the reboot, log in again and run the following command to install the Arc registration script from PSGallery:

Install-Module AzsHCI.ARCinstaller
image 27

Now install the PowerShell modules that are required on your node for the HCI registration:

Install-Module Az.Accounts -Force
Install-Module Az.ConnectedMachine -Force
Install-Module Az.Resources -Force
image 28

Now go back to the Azure portal. Register your subscription with the required resource providers (RPs). You need to be an owner or contributor on your subscription to register the following RPs:

  • Microsoft.HybridCompute
  • Microsoft.GuestConfiguration
  • Microsoft.HybridConnectivity
  • Microsoft.AzureStackHCI
image 29
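If you prefer PowerShell over the portal for this step, the registration can also be scripted. The sketch below wraps Register-AzResourceProvider from the Az.Resources module in a small function. The function name Register-HciResourceProvider and the variable $HciResourceProviders are made up for this example, and the function assumes you are already signed in with Connect-AzAccount on the right subscription.

```powershell
# The four resource providers listed above.
$HciResourceProviders = @(
    'Microsoft.HybridCompute',
    'Microsoft.GuestConfiguration',
    'Microsoft.HybridConnectivity',
    'Microsoft.AzureStackHCI'
)

# Hypothetical wrapper around Register-AzResourceProvider (Az.Resources).
# Assumes an authenticated Az session on the target subscription.
function Register-HciResourceProvider {
    param([string[]] $ProviderNamespace = $HciResourceProviders)

    foreach ($namespace in $ProviderNamespace) {
        # Start the registration and show the state that Azure reports back.
        Register-AzResourceProvider -ProviderNamespace $namespace |
            Select-Object ProviderNamespace, RegistrationState
    }
}
```

Run Register-HciResourceProvider without arguments to register all four. Registration can take a few minutes; Get-AzResourceProvider -ProviderNamespace Microsoft.AzureStackHCI shows the current state.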

Now return to your physical machine and run the following commands. Replace the following values with your own:

  • Subscription ID
  • Resource Group
  • Region
  • Tenant ID
#Define the subscription where you want to register your server as Arc device
$Subscription = "YourSubscriptionID"
 
#Define the resource group where you want to register your server as Arc device
$RG = "YourResourceGroupName"
 
#Define the region you will use to register your server as Arc device
$Region = "westeurope"
 
#Define the tenant you will use to register your server as Arc device
$Tenant = "YourTenantID"
image 31
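A forgotten placeholder in these variables only shows up as a failed sign-in or registration later on. As a small safeguard, the sketch below checks whether a value parses as a GUID. Test-IsGuid is a hypothetical helper, and it assumes that both the subscription ID and the tenant ID are supplied in GUID form.

```powershell
# Hypothetical sanity check: returns $true when the value parses as a GUID.
function Test-IsGuid {
    param([string] $Value)

    $parsed = [guid]::Empty
    return [guid]::TryParse($Value, [ref] $parsed)
}

Test-IsGuid 'YourSubscriptionID'                      # placeholder text, not a GUID
Test-IsGuid '00000000-0000-0000-0000-000000000000'    # well-formed GUID
```

The first call returns False and the second returns True. Running Test-IsGuid $Subscription and Test-IsGuid $Tenant before connecting catches a value that was not replaced.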

After running the command we can connect with the Entra tenant. We will need a device code because there is no browser installed.

#Connect to your Entra ID account and Azure Subscription
Connect-AzAccount -SubscriptionId $Subscription -TenantId $Tenant -DeviceCode
image 33

On the device where you are logged on to the Azure portal, go to https://microsoft.com/devicelogin and type in the code from the previous step.

image 32

When authenticated you can continue on the server itself.

Now run the following commands on the HCI server.

#Get the Access Token for the registration
$ARMtoken = (Get-AzAccessToken).Token
 
#Get the Account ID for the registration
$id = (Get-AzContext).Account.Id
image 34

Now run the registration script on the server. This registers the server in Azure Arc.

Invoke-AzStackHciArcInitialization -SubscriptionID $Subscription -ResourceGroup $RG -TenantID $Tenant -Region $Region -Cloud "AzureCloud" -ArmAccessToken $ARMtoken -AccountID $id

image 36

Now go back to the Azure portal and go to Azure Arc. Verify under Machines that your server is present. Click on your machine for the next steps.

image 37

On the new page go to Extensions. Important: wait until the status of all extensions has changed to Succeeded. This can take up to 30 minutes. Be patient 😉

image 39
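Instead of refreshing the portal, the same check can be scripted. The sketch below is a hypothetical helper (Test-AllExtensionsSucceeded is not an Azure cmdlet) that takes a list of extension objects and reports whether all of them have reached Succeeded. It assumes each object has a ProvisioningState property; the sample data and extension names are purely illustrative.

```powershell
# Hypothetical helper: $true only when every extension object reports 'Succeeded'.
# Assumes each object exposes a ProvisioningState property.
function Test-AllExtensionsSucceeded {
    param([Parameter(Mandatory)] [AllowEmptyCollection()] [object[]] $Extension)

    # An empty list means nothing has been installed yet, so not ready.
    if ($Extension.Count -eq 0) { return $false }

    $pending = @($Extension | Where-Object { $_.ProvisioningState -ne 'Succeeded' })
    return ($pending.Count -eq 0)
}

# Illustrative sample data: one extension is still being installed.
$sample = @(
    [pscustomobject]@{ Name = 'ExtensionA'; ProvisioningState = 'Succeeded' },
    [pscustomobject]@{ Name = 'ExtensionB'; ProvisioningState = 'Creating' }
)
Test-AllExtensionsSucceeded -Extension $sample
```

With the sample data the result is False. With the Az.ConnectedMachine module installed, the input could come from Get-AzConnectedMachineExtension -ResourceGroupName $RG -MachineName $env:COMPUTERNAME; verify the property name against the objects that cmdlet returns in your module version.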

When ready, continue with the next step.

image 40

STEP 5: Configure Azure Stack HCI Cluster

First we will need to assign the required permissions to the Resource Group we are going to use for the HCI Cluster.

Go through the tabs and assign the following permissions to the user who deploys the cluster:

  • Key Vault Data Access Administrator: This permission is required to manage data plane permissions to the key vault used for deployment.
  • Key Vault Secrets Officer: This permission is required to read and write secrets in the key vault used for deployment.
  • Key Vault Contributor: This permission is required to create the key vault used for deployment.
  • Storage Account Contributor: This permission is required to create the storage account used for deployment.
image 41
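The four role assignments can also be made from PowerShell instead of the portal. The sketch below loops over the role names listed above and calls New-AzRoleAssignment (Az.Resources) for each one. The function name Add-HciDeploymentRole and the variable $HciDeploymentRoles are made up for this example, and the function assumes an authenticated Az session with enough rights to assign roles on the resource group.

```powershell
# The four roles listed above, by their built-in role names.
$HciDeploymentRoles = @(
    'Key Vault Data Access Administrator',
    'Key Vault Secrets Officer',
    'Key Vault Contributor',
    'Storage Account Contributor'
)

# Hypothetical wrapper around New-AzRoleAssignment (Az.Resources).
# Assigns every role in $HciDeploymentRoles to one user on one resource group.
function Add-HciDeploymentRole {
    param(
        [Parameter(Mandatory)] [string] $SignInName,         # UPN of the deploying user
        [Parameter(Mandatory)] [string] $ResourceGroupName
    )

    foreach ($role in $HciDeploymentRoles) {
        New-AzRoleAssignment -SignInName $SignInName `
                             -RoleDefinitionName $role `
                             -ResourceGroupName $ResourceGroupName
    }
}
```

A call would look like Add-HciDeploymentRole -SignInName 'deployer@example.com' -ResourceGroupName 'YourResourceGroupName', where both values are placeholders.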

Search for the permissions and assign them to your deploying user.

image 42

Now go back to Azure Arc, click on Azure Stack HCI and click on Deploy cluster on the right side.

image 43

Fill in the required fields and make sure to create a new key vault. Its name needs to be unique. Make sure to select your server and click on Validate. If you run into issues during the deployment, you can find a few common errors at the end of this manual.

image 44

Wait until the validation succeeds.

image 46

For the first deployment select new configuration

image 47

On the next screen you can choose which option you want to use for the network cards. If you’re going to deploy a cluster with multiple nodes, it is recommended to use separate network cards for storage.

Also fill in the network settings for your management network at the bottom of the page.

image 48

Now we are going to join the server to the local domain and create a new Azure custom location for the on-premises environment.

  • Custom location name: Your on-prem location name
  • Domain: use your on-prem domain
  • OU: fill in the full OU name
  • Deployment account: This is the account that was created in step 1
  • Local administrator: This is the local admin account of your HCI physical node.
image 56

For security, use the recommended security settings; choose the custom configuration only when applicable.

image 50

On the next page, choose the options that fit your needs best. For this tu

image 51

Now the validation of the environment takes place. Please be patient. During our test we encountered several issues; we’ve listed them at the bottom of the page.

image 52

When the resource creation validation in Azure is completed, start the hardware validation.

image 53

Please be patient, this can take up to one hour.

image 54

After a long hour of waiting the process should be finished.

image 59

In the last step create the cluster!

image 60

The cluster will now be created and the physical node will be joined. Again, this might take more than an hour!

image 61

The job overview should look like this. If you think the process is hanging (a task taking longer than 1 hour), reboot the server, look in the scheduled tasks (Get-ScheduledTask) for a task whose name matches the step that is hanging in the portal, and run it. After that the process will continue.

image 62
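Looking up a matching scheduled task is easier with a filter. The sketch below wraps Get-ScheduledTask in a small function that lists tasks by name pattern. Get-HciCandidateTask is a hypothetical helper, and the pattern you pass depends on the step name shown in your portal, so no real task name is assumed here.

```powershell
# Hypothetical helper: lists scheduled tasks whose name matches a wildcard pattern,
# together with their path and current state.
function Get-HciCandidateTask {
    param([Parameter(Mandatory)] [string] $NamePattern)

    Get-ScheduledTask |
        Where-Object { $_.TaskName -like $NamePattern } |
        Select-Object TaskName, TaskPath, State
}
```

For example, Get-HciCandidateTask -NamePattern '*Deploy*' lists every task with "Deploy" in its name; the pattern is only an illustration. Once you have found the right task, Start-ScheduledTask -TaskPath <path> -TaskName <name> runs it.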

When completed, all tasks should say Success!

image 72

When the cluster is created, make sure to install the latest updates!

image 73

Make sure to update the cluster nodes.

image 74

Under the hood in Active Directory we will see a normal computer object and a failover cluster object.

image 71

You have now successfully installed Azure Stack HCI. In the next manual we will continue with the configuration of the network, download the images and deploy a Virtual Machine.

Troubleshooting:

Validation error: Validate selected servers

Wait for the LCMController extension to complete.

image 64

Validation Error Azure Stack HCI cluster deployment

Type 'ValidateConnectivity' of Role 'EnvironmentValidator' raised an exception: Unable to create a valid session to Connecting to remote server failed with the following error message : Access is denied

If there are domain connectivity validation issues, make sure that the domain controller is reachable.

image 55

Validation error

Type 'ValidateConnectivity' of Role 'EnvironmentValidator' raised an exception: Unable to create a valid session to x.x.x.x: [x.x.x.x] Connecting to remote server x.x.x.x failed with the following error message : Access is denied

The username or password of the deployment user is wrong.

image 57

Error hardware requirements not met.

Verify all your hardware and make sure latest firmware and drivers are installed.

Exception
Type 'ValidateHardware' of Role 'EnvironmentValidator' raised an exception: Hardware requirements not met. Review output and remediate:
Rule:
HealthCheckSource : Deployment\Hardware\da0dd4ca
Name : AzStackHci_Hardware_Test_NetAdapter
DisplayName : Test NetAdapter API 2AZURE-PILOT-N1
Tags : {}
Title : Test NetAdapter API
Status : FAILURE
Severity : CRITICAL
Description : Checking NetAdapter has CIM data
Remediation : https://learn.microsoft.com/en-us/azure-stack/hci/deploy/deployment-tool-prerequisites
TargetResourceID : Machine: 2AZURE-PILOT-N1, Class: NetAdapter
TargetResourceName : Machine: 2AZURE-PILOT-N1, Class: NetAdapter
TargetResourceType : NetAdapter
Timestamp : 1-5-2024 12:17:49
AdditionalData:
Key : Detail
Value : Unable to retrieve data for NetAdapter on 2AZURE-PILOT-N1
Key : Status
Value : FAILURE
Key : TimeStamp
Value : 05/01/2024 12:17:49
Key : Resource
Value : Null
Key : Source
Value : 2AZURE-PILOT-N1
Rule:
HealthCheckSource : Deployment\Hardware\da0dd4ca
Name : AzStackHci_Hardware_Test_SingleNode_AllFlash
DisplayName : Test Single Node All Flash
Tags : {}
Title : Test Single Node All Flash
Status : FAILURE
Severity : CRITICAL
Description : Checking single node is all flash
Remediation : https://learn.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-hardware-requirements#minimum-number-of-drives-excludes-boot-drive
TargetResourceID :
TargetResourceName :
TargetResourceType :
Timestamp : 1-5-2024 12:17:50
AdditionalData:
Key : Detail
Value : Hostname '' drive types 'HDD: False, SSD:False, NVMe:False, SCM:False'. Expected all flash.
Key : Status
Value : FAILURE
Key : TimeStamp
Value : 05/01/2024 12:17:50
Key : Resource
Value : HDD: False, SSD:False, NVMe:False, SCM:False
Key : Source
Value : Drive Type
Rule:
HealthCheckSource : Deployment\Hardware\da0dd4ca
Name :
image 58

AdditionalData:

Key : Detail Value : Property ‘TpmEnabled’ value ‘False’ but expected ‘True’

Verify if the TPM is installed and enabled!

Additional documentation/information:

Microsoft Documentation: Azure Stack HCI documentation | Microsoft Learn

Change Firewall Settings Server Core: WIN 2019 Core activate SMB-in rule via PowerShell – IT koehler blog (it-koehler.com)

How to install Proliant Support Pack on HPE Server: Install HP SPP on Windows core (lbdg.me)

@Credits Jos Muller for the partnership during this deployment
