Internal Azure Load Balancer: How to load balance dependent services on one backend pool

Today I would like to present a common issue. Let’s assume the following scenario:

  1. The customer has an on-premises, load-balanced application.
  2. The application consists of several modules.
  3. Every node of the load-balancing pool hosts a set of modules.
  4. Modules operate independently: they do not rely on any single host for the other modules they require, but reach them through the load balancer.
  5. Some modules use services provided by other modules.
  6. Communication is TCP based, and the modules use more than just HTTPS.
  7. The customer wants to lift and shift the application’s infrastructure to Azure.

A good example of this scenario is outlined in the following KB by CyberArk: CyberArk KB article.
Microsoft describes this case in the following Azure Troubleshooting guide.

Microsoft refers to this issue as follows:

If an internal load balancer is configured inside a virtual network, and one of the participant backend VMs is trying to access the internal load balancer frontend, failures can occur when the flow is mapped to the originating VM. This scenario isn’t supported.

And of course, this is true. Let’s look at the root cause of this issue.
As stated before, our customer wants to take a lift-and-shift approach to their CyberArk infrastructure. In other words, they want to use as many built-in cloud services as possible. An Azure Load Balancer is the proposed solution.
An example of the architectural design by CyberArk is shown below:
Failing Azure Load Balancer architecture

First, let me present common Azure Load Balancer features and limitations:

  1. An Azure Load Balancer works only on Layer 4 of the ISO/OSI model.
  2. An Azure Load Balancer distributes the workload using the standard criteria: source and destination IP addresses and TCP ports.
  3. An Internal Azure Load Balancer uses a virtual IP (VIP) from the customer’s vNET, which keeps the traffic internal.
  4. An Azure VM can be part of only one internal and one external load balancer.
  5. Only the primary network interface (vNIC) of a given Azure VM can be part of the load balancer backend pool. Only this interface can have the default gateway configured.
  6. An Azure Load Balancer does not NAT outbound traffic. Simply put, it has no outbound IP address, only an inbound one. This means that the Azure Load Balancer changes only the destination IP, from its own VIP to the primary IP of the virtual machine, while applying inbound NAT or load balancing rules. From the Azure VM’s perspective, the load balancer is completely transparent.

The last limitation is the root cause of our problem. Microsoft states that (let’s repeat it, as it is crucial):

failures can occur when the flow is mapped to the originating VM.

This is because, when the flow is mapped back to the originating VM, the IP packet leaving the Azure Load Balancer has the same source and destination IP address. According to RFC 791, section 3.2, this situation should not happen, and the Azure backbone drops such IP packets.
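To make the failure mode concrete, here is a minimal Python sketch of the destination-only rewrite described above. The addresses, the `Packet` type and the `dnat` and `is_hairpin` helpers are hypothetical; they only model the behavior described in this post and say nothing about how Azure implements it internally.

```python
from dataclasses import dataclass, replace

# Hypothetical addressing, chosen only for illustration.
VIP = "10.0.0.100"                   # frontend IP of the internal load balancer
BACKENDS = ["10.0.0.4", "10.0.0.5"]  # primary IPs of the backend pool members


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def dnat(packet: Packet, backend: str) -> Packet:
    """Rewrite only the destination address; the source stays untouched."""
    if packet.dst != VIP:
        raise ValueError("packet is not addressed to the load balancer VIP")
    return replace(packet, dst=backend)


def is_hairpin(packet: Packet) -> bool:
    """True when the packet would arrive with identical source and destination."""
    return packet.src == packet.dst


# A backend VM (10.0.0.4) calls a module through the VIP.
outgoing = Packet(src="10.0.0.4", dst=VIP)
for backend in BACKENDS:
    delivered = dnat(outgoing, backend)
    status = "dropped (src == dst)" if is_hairpin(delivered) else "delivered"
    print(f"{delivered.src} -> {delivered.dst}: {status}")
```

Only the flow that is mapped back to the originating VM ends up with identical addresses, which matches the "failures can occur" wording: the connection works whenever another pool member is selected.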

So, knowing that, we can try to build a workaround. From an architectural point of view, we need to create conditions in which the source IP is different from the destination IP. This means that our VM must have an additional IP address.
Additional IPs on an Azure vNIC are supported, but only the primary IP of a given interface can be used as the source of an IP packet. This means that we cannot leverage this feature in our scenario.
Another possibility is to add a vNIC to the Azure VM. This is also supported: an Azure VM can have one primary vNIC and up to eight secondary vNICs. Secondary vNICs can be assigned to any subnet within the primary vNIC’s vNet, so an Azure VM cannot belong to more than one vNet. This is enough for us. We need to create an additional subnet and attach a secondary vNIC to it.
A secondary vNIC has one more constraint: it cannot have a default route. In essence, it has no routes at all by default. A secondary vNIC can receive traffic, but it is hard to send anything through it. So this naive approach will not work: the source IP will still be the primary vNIC’s IP address.

The trick is to change the VIP of the Azure Internal Load Balancer.
This is the only way to force the IP packet through the secondary network adapter. When the Azure Internal Load Balancer’s VIP is in the same subnet as the secondary vNIC, we can leverage the default IP behavior: addresses in a local subnet are reached directly through the local segment (Ethernet or another medium). This behavior exists as an optimization, but in this scenario it is our only choice.
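The effect of moving the VIP can be sketched in a few lines of Python. This is a simplified model of source address selection under stated assumptions: the subnets, the addresses and the `source_for` helper are hypothetical, and a real IP stack applies more rules than "on-link subnet first, default route otherwise".

```python
import ipaddress

# Hypothetical addressing of one backend VM, chosen only for illustration.
NICS = {
    "primary": ipaddress.ip_interface("10.0.0.4/24"),    # holds the default route
    "secondary": ipaddress.ip_interface("10.0.1.4/24"),  # on-link traffic only
}
DEFAULT_ROUTE_NIC = "primary"


def source_for(destination: str) -> str:
    """Pick the source IP: an on-link subnet wins, otherwise the default route."""
    dest = ipaddress.ip_address(destination)
    for nic in NICS.values():
        if dest in nic.network:
            return str(nic.ip)
    return str(NICS[DEFAULT_ROUTE_NIC].ip)


# The backend pool contains the primary IP, so that is where the flow can land.
backend_ip = str(NICS["primary"].ip)
for vip in ("10.0.0.100", "10.0.1.100"):
    src = source_for(vip)
    verdict = "fails (src == dst)" if src == backend_ip else "works"
    print(f"VIP {vip}: source {src}, delivered to {backend_ip}: {verdict}")
```

With the VIP in the secondary subnet, the packet leaves with the secondary vNIC's address as its source, so it still differs from the destination even when the load balancer maps the flow back to the same VM.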

The final architecture is shown in the following picture:

Happy Load Balancing!

Windows Server 2016 RDP certificate configuration

On Windows Server 2016, as in previous versions, there is no utility for configuring the RDP certificate. We can do this by leveraging the WMI interface. The simplest way to accomplish that is a PowerShell script. The configuration script can be found below.

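# Subject of the certificate to bind to RDP (adjust to your certificate)
$CN = "CN=localhost"
# WMI object holding the settings of the RDP listener
$RdpSetting = Get-WmiObject -Class "Win32_TSGeneralSetting" -Namespace root\cimv2\terminalservices -Filter "TerminalName='RDP-tcp'"
# Thumbprint of the matching certificate in the local machine store
$thumbprint = (Get-ChildItem -Path Cert:\LocalMachine\My | Where-Object Subject -eq $CN).Thumbprint
# Assign the certificate to the RDP listener
Set-WmiInstance -Path $RdpSetting.__path -Arguments @{SSLCertificateSHA1Hash="$thumbprint"}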

The presented method is universal, so it can be used on the client (Windows 10) as well as on the server (Windows Server 2016).

“Root element is missing” error during SMA Runbook startup

This error is most often caused by a nested runbook invocation from an inline or function context. Such an action is not permitted. We can call an Inline block or a Function from a Workflow, but in the other direction we have to use one of the solutions described here.
I have just come across another cause of this error.
In PowerShell, the following syntax is normally available:

$var = (get-service)[0]

It treats the result of the expression in parentheses as an array, which is indexed without creating an intermediate variable.
Based on my experience and some tests, I can state that this syntax causes the “Root element is missing” error.

Windows Terminal Services Logon “Access Denied”

I would like to describe the resolution of a problem with Terminal Services. When you use Terminal Services together with a License Server on a separate machine, you may experience the following symptoms:

  • During the logon process, the user receives the message “Access denied.” It is shown instead of the logon screen, just after the “Welcome” message.
  • There are no related error messages in the Application and System event logs.
  • The TerminalServices-LocalSessionManager event log contains the following message correlated with the user’s logon attempt: “Session X has been disconnected, reason code 12”, where X is the number of the session assigned to the logon attempt by the Session Manager.
  • You may experience this problem on Windows 2008 R2 as well as on 2012 (R2).
  • A Group Policy update failure often occurs at the same time.

A temporary solution to this problem may be to modify the following registry entry:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server\
IgnoreRegUserConfigErrors (DWORD) = 1

After adding this registry value, you need to reboot the affected server.

Having mitigated the poor user experience, you can calmly start the real diagnosis of what is wrong in your environment. In one case, the real issue was a mistake in the Windows Firewall configuration of the domain controllers, which was applied by a GPO. The offending GPO contained a rule denying “SMB over TCP” traffic.
It may be something different in your case, but it will always be something connected with the domain controllers.

Failover Cluster Generic Script Resource Failed 0x80070009

Hi All,

I have found an interesting behavior of the Failover Clustering feature in Windows 2012 R2. The message from the console and logs in this situation is quite misleading.
When you have configured a Generic Script Resource as described in this Microsoft Product Team Blog, you may find your resource down with the following status message:

The storage control block is invalid.

I have shown that on this screenshot:
Cluster resource status
When we display the extended error message, we find that the error code is

0x80070009

Extended error information
A similar entry can be found in the Cluster Event Log:
Cluster system log

In this situation, we should first verify that our script does not return the value 0x9 from any entry point function. Once we have ruled that out, this error tells us that we have made a syntax error in the Visual Basic script and, as a result, the cluster resource manager was unable to compile and execute it.
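The link between 0x80070009 and the value 0x9 becomes visible once the HRESULT is split into its fields. The sketch below assumes the standard HRESULT layout (failure bit, 11-bit facility, 16-bit code); `decode_hresult` is a hypothetical helper written for this post, not part of any cluster tooling.

```python
FACILITY_WIN32 = 7  # facility used for HRESULTs that wrap a Win32 error code


def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility and error code."""
    return {
        "failure": bool((value >> 31) & 0x1),  # top bit set means failure
        "facility": (value >> 16) & 0x7FF,     # 11-bit facility field
        "code": value & 0xFFFF,                # low 16 bits: the error code
    }


decoded = decode_hresult(0x80070009)
print(decoded)
if decoded["facility"] == FACILITY_WIN32:
    print(f"Win32 error code: 0x{decoded['code']:X}")
```

The facility is 7 (Win32) and the code is 9, which is the Win32 error ERROR_INVALID_BLOCK. Its standard message text is the "storage control block" message shown in the console, so a script that returns 0x9 from an entry point would produce the same status.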

That is why checking every script (not only Visual Basic!) is a good general practice. We can verify a Visual Basic script by running it from the command line. This can be done with the following command:

cscript.exe C:\full\path\to\file.vbs

A correctly written script should write nothing at all to the console except the script host banner, because a Generic Script Resource should contain only function definitions and no calls to them (this is a property of all callbacks).
In case of a syntax error, the command should return an error similar to this one:
Script validation error

Have a nice Scripting Time!

Resource migration between subscriptions within MS Azure

In Microsoft Azure, a user can own more than one subscription. A subscription is the smallest billing unit: it is where you configure the payment method and the resource ownership boundaries.
Every subscription has its own billing cycle, so you receive as many bills as you have subscriptions. There is an exception for Enterprise Agreements, but that is out of the scope of this text.
The important thing is that you can receive, for example, three bills, each for two euros. It can be really annoying.
So this is the reason for merging subscriptions.
The only way to do that is to migrate the resources from the other subscriptions to one chosen subscription.
There are plenty of questions about this on the Internet, but the responses are unclear or outdated.
Fortunately, Microsoft has described it in the documentation. It can be found here.
One thing is worth noting: every resource type has its own migration policy, so the options and limitations for a given resource type depend on it.
The full list of resource types that support migration can be found here.

Installing Service Pack 1 for Windows 2008 R2 with the MDT 2010 U1 mechanism

Today’s post is about why following the manual can sometimes hurt you.
If you do not know how to invoke an application in any way other than double-clicking the exe file, the normal thing to do is to call it from the command line:
application.exe /? or application.exe /help
In response we get, in some form, information about the command line switches handled by the program. In particular, this is how we can learn about advanced ways of invoking standalone Windows operating system updates.
The Windows 2008 R2 Service Pack 1 installer responds with the following:
Dialog window with help for Windows 2008 R2 Service Pack 1
To import the service pack into the Microsoft Deployment Toolkit, you have to unpack the exe file into Windows standalone update files (msu or cab). The help screen shows no such option.
But after several unsuccessful attempts I figured out that this command line:

 windows6.1-KB976932-X64.exe /extract

starts exactly this extraction process without any error message. The first symptom of correct execution was a dialog window with a directory tree for choosing the location of the extracted files. Among them is the cab file, which is necessary for using the update with MDT 2010 U1.

I am only curious why Microsoft did not add it to the help screen.