Recently, one of my customers asked for a detective and preventive Azure policy for Storage Accounts. The requirement was to limit Storage Accounts to Private Endpoints only. I was surprised by the request, as I assumed that a policy definition with these requirements existed in the built-in set. In this blog post, I review the built-in policies, the process I used to create a custom one that meets the requirements, and the quirks I discovered along the way.

For reference, the Azure Policy service includes a number of built-in policy definitions that are organized into bundles called initiatives. Initiatives are labeled for the benchmark or framework that they align to, such as NIST, ISO, or CIS. This makes it easy for Azure customers to enable a set of policies and see how their environment performs against them. The service lets users duplicate built-in initiatives and policies as custom definitions so that they can be modified. In addition, all of the definitions can be exported. To make life easy, the team regularly updates their GitHub repository with the content available within the service.

After receiving my customer’s request, I visited the repository and looked through the policy definitions for Storage. I found one labeled StorageAccountPrivateEndpointEnabled_Audit and checked whether it was available in my subscription. The policy definition existed under the display name “Storage account should use a private link connection”, so I assigned it to my subscription. While I was waiting for the evaluation, I took a look at the contents of the policy definition.

"policyRule": {
  "if": {
    "field": "type",
    "equals": "Microsoft.Storage/storageAccounts"
  },
  "then": {
    "effect": "[parameters('effect')]",
    "details": {
      "type": "Microsoft.Storage/storageAccounts/privateEndpointConnections",
      "existenceCondition": {
        "allOf": [
          {
            "field": "Microsoft.Storage/storageAccounts/privateEndpointConnections/privateEndpoint",
            "exists": "true"
          },
          {
            "field": "Microsoft.Storage/storageAccounts/privateEndpointConnections/provisioningState",
            "equals": "Succeeded"
          },
          {
            "field": "Microsoft.Storage/storageAccounts/privateEndpointConnections/privateLinkServiceConnectionState",
            "exists": "true"
          },
          {
            "field": "Microsoft.Storage/storageAccounts/privateEndpointConnections/privateLinkServiceConnectionState.status",
            "equals": "Approved"

Policy definitions are relatively straightforward. Each definition has a single if/then rule made up of a set of conditions and the desired effect. In this case, the definition uses the AuditIfNotExists effect, which reports non-compliance when no related resource matching the conditions in the details block exists after the resource has been created. The problem with this effect is that the actual compliance audit happens in the then block, against fields that are ephemeral. A storage account with a private endpoint will not have the privateEndpointConnections fields checked in the existenceCondition above (privateEndpoint, provisioningState, privateLinkServiceConnectionState, and privateLinkServiceConnectionState.status) stored within the object. This makes it impossible to list these fields as conditions in the if block of a policy that is preventive or corrective. This is a significant shortcoming in a policy solution: there isn’t a consistent pattern of evaluation and associated action. Writing compliance policy isn’t easy, regardless of the solution that is used.
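To make the order of evaluation concrete, here is a minimal Python sketch of how an AuditIfNotExists-style rule could be modeled locally. It is not the Azure Policy engine: the function names, the flattened resource dictionaries, the "NotApplicable" label, and the endpoint id are all assumptions made for illustration. The point it tries to show is that the if block only selects resources, while the compliance decision is made afterwards against related resources.

```python
from typing import Callable

Resource = dict  # hypothetical, simplified stand-in for an exported resource


def audit_if_not_exists(
    resource: Resource,
    if_matches: Callable[[Resource], bool],
    related: list,
    existence_condition: Callable[[Resource], bool],
) -> str:
    """Model one resource passing through an auditIfNotExists-style rule."""
    # Step 1: the "if" block only selects which resources the rule applies to.
    if not if_matches(resource):
        return "NotApplicable"  # label chosen for this sketch
    # Step 2: the actual check runs in "then.details", against related
    # resources that can only exist after the storage account was created.
    if any(existence_condition(item) for item in related):
        return "Compliant"
    return "NonCompliant"


def is_storage_account(resource: Resource) -> bool:
    return resource.get("type") == "Microsoft.Storage/storageAccounts"


def connection_is_approved(connection: Resource) -> bool:
    # Mirrors the four existenceCondition checks shown in the snippet above.
    state = connection.get("privateLinkServiceConnectionState")
    return (
        "privateEndpoint" in connection
        and connection.get("provisioningState") == "Succeeded"
        and state is not None
        and state.get("status") == "Approved"
    )


account = {"type": "Microsoft.Storage/storageAccounts", "name": "945fee95"}
approved = {
    "privateEndpoint": {"id": "hypothetical-endpoint-id"},  # made-up value
    "provisioningState": "Succeeded",
    "privateLinkServiceConnectionState": {"status": "Approved"},
}

with_endpoint = audit_if_not_exists(
    account, is_storage_account, [approved], connection_is_approved
)
without_endpoint = audit_if_not_exists(
    account, is_storage_account, [], connection_is_approved
)
print(with_endpoint, without_endpoint)
```

In this model the storage account itself never carries the connection fields, which is one way to see why they cannot be used in the if block of a deny policy.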

After making the above observations, I realized that I needed to dig deeper into how to write policies that would meet the customer’s requirements. I started with the service itself and how a storage account can be created through ARM automation.

A storage account can be created with three types of connectivity:

  • Public endpoint (all networks)
  • Public endpoint (selected networks)
  • Private endpoint

To meet my customer’s requirement, I needed to find a way to uniquely identify either the non-private endpoint connectivity types or the private endpoint. I created three storage accounts in my subscription, one for each connectivity type. I exported a JSON blob for each resource and effectively played a game of “What doesn’t belong?”.

In order from above, the shortened exports were the following:

Public endpoint (all networks)
"name": "64a8ff0d",
"type": "Microsoft.Storage/storageAccounts",
"location": "eastus2",
"tags": {},
"properties": {
    "minimumTlsVersion": "TLS1_2",
    "allowBlobPublicAccess": true,
    "allowSharedKeyAccess": true,
    "networkAcls": {
        "bypass": "AzureServices",
        "virtualNetworkRules": [],
        "ipRules": [],
        "defaultAction": "Allow"
    },

Public endpoint (selected networks)
"name": "0fe8bcfb",
"type": "Microsoft.Storage/storageAccounts",
"location": "eastus2",
"tags": {},
"properties": {
    "minimumTlsVersion": "TLS1_2",
    "allowBlobPublicAccess": true,
    "allowSharedKeyAccess": true,
    "networkAcls": {
        "bypass": "AzureServices",
        "virtualNetworkRules": [
            {
                "id": "/subscriptions/redacted/resourceGroups/rg-eastus2/providers/Microsoft.Network/virtualNetworks/rg-eastus2-vnet/subnets/default",
                "action": "Allow",
                "state": "Succeeded"
            }
        ],
        "ipRules": [],
        "defaultAction": "Deny"
    },

Private endpoint
"name": "945fee95",
"type": "Microsoft.Storage/storageAccounts",
"location": "eastus2",
"tags": {},
"properties": {
    "minimumTlsVersion": "TLS1_2",
    "allowBlobPublicAccess": true,
    "allowSharedKeyAccess": true,
    "networkAcls": {
        "bypass": "AzureServices",
        "virtualNetworkRules": [],
        "ipRules": [],
        "defaultAction": "Deny"
    },

If I compare the blob from the storage account with a private endpoint to the other two, I can see subtle differences. The public endpoint (all networks) can be identified by the defaultAction field being set to Allow. The difference between the public endpoint with selected networks and the private endpoint is clear as well: the virtualNetworkRules array is populated with at least one record for the former and is empty for the latter. The defaultAction field is set to Deny for both, so it cannot be used to tell these two apart. Given the comparisons, I felt I had enough information to compile a policy and test.
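The comparison above can be written down as a small Python sketch. The function name classify_connectivity and the returned labels are my own, and the logic only encodes the two fields discussed here (defaultAction and virtualNetworkRules); it is a heuristic drawn from these three exports, not an authoritative test.

```python
def classify_connectivity(network_acls: dict) -> str:
    """Guess the connectivity type from an exported networkAcls object."""
    # Anything other than Deny is treated as open to all networks.
    if network_acls.get("defaultAction") != "Deny":
        return "public endpoint (all networks)"
    # Deny plus at least one virtual network rule means selected networks.
    # ipRules is not inspected, matching the comparison in the text.
    if network_acls.get("virtualNetworkRules"):
        return "public endpoint (selected networks)"
    return "private endpoint"


# networkAcls values copied from the three shortened exports.
exports = {
    "64a8ff0d": {"bypass": "AzureServices", "virtualNetworkRules": [],
                 "ipRules": [], "defaultAction": "Allow"},
    "0fe8bcfb": {"bypass": "AzureServices",
                 "virtualNetworkRules": [{"action": "Allow", "state": "Succeeded"}],
                 "ipRules": [], "defaultAction": "Deny"},
    "945fee95": {"bypass": "AzureServices", "virtualNetworkRules": [],
                 "ipRules": [], "defaultAction": "Deny"},
}

labels = {name: classify_connectivity(acls) for name, acls in exports.items()}
for name, label in labels.items():
    print(name, "->", label)
```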

I created two different policy definitions because I wasn’t sure how to handle the two Public Endpoint cases in the same definition. The first policy checked that the resource type was Microsoft.Storage/storageAccounts and that Microsoft.Storage/storageAccounts/networkAcls.defaultAction was set to Allow. The second policy checked for the same resource type, that defaultAction was set to Deny, and that the count of virtualNetworkRules[*] was not equal to zero (indicating that it had at least one rule). I combined both of these policy definitions into an initiative and waited for the compliance run. Once it was complete, I noticed that the results from the initiative were correct (only one compliant resource was reported), but the individual policies reported false positives because of the nuance between Public Endpoint (selected networks) and Private Endpoint.
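One way to read that result is sketched below in Python. It assumes, as a simplification, that a resource is compliant with an initiative only when it is compliant with every policy in it; the function names and the account labels are hypothetical, and the networkAcls values are reduced to the two fields the policies inspect.

```python
def flags_public_all(acls: dict) -> bool:
    """First policy: flag accounts whose defaultAction is Allow."""
    return acls["defaultAction"] == "Allow"


def flags_selected_networks(acls: dict) -> bool:
    """Second policy: flag Deny accounts with at least one virtual network rule."""
    return acls["defaultAction"] == "Deny" and len(acls["virtualNetworkRules"]) != 0


accounts = {
    "public-all": {"defaultAction": "Allow", "virtualNetworkRules": []},
    "public-selected": {"defaultAction": "Deny", "virtualNetworkRules": [{"action": "Allow"}]},
    "private": {"defaultAction": "Deny", "virtualNetworkRules": []},
}
policies = {"policy-1": flags_public_all, "policy-2": flags_selected_networks}

# Accounts each policy reports as compliant when viewed on its own.
compliant_per_policy = {
    policy_name: sorted(a for a, acls in accounts.items() if not flags(acls))
    for policy_name, flags in policies.items()
}

# Accounts compliant with every policy, which is how the initiative is modeled here.
compliant_in_initiative = sorted(
    a for a, acls in accounts.items()
    if not any(flags(acls) for flags in policies.values())
)

print(compliant_per_policy)
print(compliant_in_initiative)
```

Viewed individually, each policy lists a non-private account as compliant; only the combined view narrows the compliant set down to the private endpoint account.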

I wanted to do this right.

I reviewed the Policy Rule documentation to figure out how to combine the two definitions. At the top of the documentation, I noticed the following representation of the syntax:

{
    "if": {
        <condition> | <logical operator>
    },
    "then": {
        "effect": "deny | audit | modify | append | auditIfNotExists | deployIfNotExists | disabled"
    }
}

The <condition> | <logical operator> line inside the if block was what intrigued me. The alternation between condition and logical operator indicated that I could effectively chain as many rules as I wanted into one if clause. I made this assumption because of how logical operators are defined. In the documentation, the logical operators are not, allOf, and anyOf, each of which can be passed either a condition or another operator. With this flexibility, I could achieve a chaining effect.

The following was the result once I evaluated the conditions I wanted to report as non-compliant:

"policyRule": {
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Storage/storageAccounts"
      },
      {
        "anyOf": [
          {
            "field": "Microsoft.Storage/storageAccounts/networkAcls.defaultAction",
            "notEquals": "Deny"
          },
          {
            "count": {
              "field": "Microsoft.Storage/storageAccounts/networkAcls.virtualNetworkRules[*]"
            },
            "notEquals": 0
          }
        ]
      }
    ]
  },
  "then": {
    "effect": "deny"
  }
}

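The elegance is in nesting the anyOf operator inside the outer allOf operator. The allOf operator is true only when every one of its conditions is true, while the anyOf operator (think of your Boolean algebra class) is true if any one of its conditions is true. The rule therefore matches a storage account whose defaultAction is not Deny, or one that has at least one virtual network rule.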
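To check the nesting against the three exported accounts, the if block can be walked with a tiny recursive evaluator. This is a local Python sketch under stated assumptions, not the Azure Policy engine: evaluate and resolve are hypothetical helpers, alias resolution is reduced to a path lookup under properties, only equals and notEquals are handled, and values are compared with plain Python equality.

```python
PREFIX = "Microsoft.Storage/storageAccounts/"


def resolve(resource: dict, alias: str):
    """Very small alias lookup: 'type', or a dotted path under 'properties'."""
    if alias == "type":
        return resource.get("type")
    path = alias[len(PREFIX):] if alias.startswith(PREFIX) else alias
    if path.endswith("[*]"):
        path = path[:-3]  # the array itself; members are counted by the caller
    value = resource.get("properties", {})
    for key in path.split("."):
        value = value.get(key) if isinstance(value, dict) else None
    return value


def evaluate(cond: dict, resource: dict) -> bool:
    """Recursively evaluate logical operators, then a single condition."""
    if "allOf" in cond:
        return all(evaluate(c, resource) for c in cond["allOf"])
    if "anyOf" in cond:
        return any(evaluate(c, resource) for c in cond["anyOf"])
    if "not" in cond:
        return not evaluate(cond["not"], resource)
    if "count" in cond:
        actual = len(resolve(resource, cond["count"]["field"]) or [])
    else:
        actual = resolve(resource, cond["field"])
    if "equals" in cond:
        return actual == cond["equals"]
    if "notEquals" in cond:
        return actual != cond["notEquals"]
    raise ValueError("unsupported condition in this sketch")


IF_BLOCK = {"allOf": [
    {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
    {"anyOf": [
        {"field": PREFIX + "networkAcls.defaultAction", "notEquals": "Deny"},
        {"count": {"field": PREFIX + "networkAcls.virtualNetworkRules[*]"},
         "notEquals": 0},
    ]},
]}


def account(default_action: str, rules: list) -> dict:
    return {"type": "Microsoft.Storage/storageAccounts",
            "properties": {"networkAcls": {"defaultAction": default_action,
                                           "virtualNetworkRules": rules}}}


denied = {
    "public-all": evaluate(IF_BLOCK, account("Allow", [])),
    "public-selected": evaluate(IF_BLOCK, account("Deny", [{"action": "Allow"}])),
    "private": evaluate(IF_BLOCK, account("Deny", [])),
}
print(denied)
```

In this model only the private endpoint account falls outside the if block, so it is the only one that the deny effect would leave alone.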

The other piece of this definition that was new to me was the count expression wrapped around the virtualNetworkRules[*] alias. Being able to evaluate the number of members in an array is fantastic, especially in this case. Unfortunately, count isn’t well reflected in the documentation and is a bit buried. It also offers a significant amount of functionality beyond what I used here, which I may cover in another post. My use of it in this policy rule was basic.

Once the policy definition is either assigned directly or added to a policy initiative, compliance is reported accurately. For those interested in testing this policy, I put together an ARM template, which is available in my GitHub repository.

As always, reading through the documentation was helpful in learning the policy language. I’m still a bit baffled by the AuditIfNotExists effect, as it seems a bit backward to me. I may tinker with it a bit more in the future. For now, learning that I could chain multiple rules, and discovering the power of the count expression, was beneficial.