azure jmespath azure-vm-scale-set

JMESPath Query to get the deallocated InstanceId of the oldest provisioned Azure VMSS


I'm new to JMESPath, so I'm not familiar with complex queries. I have tried quite a lot of queries, and none of them come close to the result I want.

Target: Get the instanceId of the earliest provisioned instance in the Azure VMSS that is deallocated.

Detail

Tester: https://jmespath.org/

Sample JSON: (stripped output of az vmss list-instances command)

[
  {
    "instanceId": "5",
    "instanceView": {      
      "statuses": [
        {
          "code": "ProvisioningState/succeeded",
          "time": "2023-10-24T14:18:08.8438814+00:00"
        },
        {
          "code": "PowerState/deallocated"
        }
      ]
    }
  },
  {
    "instanceId": "13",
    "instanceView": {
      "statuses": [
        {
          "code": "ProvisioningState/succeeded",
          "time": "2023-10-24T15:53:59.6296842+00:00"
        },
        {
          "code": "PowerState/running"
        }
      ]
    }
  }
]

Expected Output: 5 (because it is the earliest provisioned instance whose state is deallocated)

Failed Attempts:


Solution

  • Preamble: to demonstrate that the query behaves correctly, I added a third element to your array, so that two elements have the PowerState/deallocated status and the effect of the sorting is visible.

    So here is the sample JSON I am using, which includes an instance with identifier 3:

    [
      {
        "instanceId": "3",
        "instanceView": {
          "statuses": [
            {
              "code": "ProvisioningState/succeeded",
              "time": "2023-10-24T15:53:59.6296842+00:00"
            },
            {
              "code": "PowerState/deallocated"
            }
          ]
        }
      },
      {
        "instanceId": "5",
        "instanceView": {      
          "statuses": [
            {
              "code": "ProvisioningState/succeeded",
              "time": "2023-10-24T14:18:08.8438814+00:00"
            },
            {
              "code": "PowerState/deallocated"
            }
          ]
        }
      },
      {
        "instanceId": "13",
        "instanceView": {
          "statuses": [
            {
              "code": "ProvisioningState/succeeded",
              "time": "2023-10-24T15:53:59.6296842+00:00"
            },
            {
              "code": "PowerState/running"
            }
          ]
        }
      }
    ]
    

    I think the major concept missing from your attempts is that you have to stop an existing projection, in order to get back an array, if you want to pick the first element of that array.

    This is further explained in the chapter called "pipe expressions" of the tutorial:

    Projections are an important concept in JMESPath. However, there are times when projection semantics are not what you want. A common scenario is when you want to operate of the result of a projection rather than projecting an expression onto each element in the array. For example, the expression people[*].first will give you an array containing the first names of everyone in the people array. What if you wanted the first element in that list? If you tried people[*].first[0] that you just evaluate first[0] for each element in the people array, and because indexing is not defined for strings, the final result would be an empty array, []. To accomplish the desired result, you can use a pipe expression, <expression> | <expression>, to indicate that a projection must stop.

    This is most likely what is causing you to have null returned in your third attempt.
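
To make the difference concrete, here is a minimal sketch in plain Python (standard library only) that imitates the two behaviours described in the quote. The helpers jp_index and project are hypothetical names written for this illustration, not part of any JMESPath library, and the people data is made up. They only approximate JMESPath semantics: indexing anything that is not an array yields null, and a projection drops null results.

```python
def jp_index(value, i):
    """Imitate JMESPath indexing: only lists can be indexed, anything else gives None."""
    if isinstance(value, list) and -len(value) <= i < len(value):
        return value[i]
    return None


def project(items, fn):
    """Imitate a JMESPath projection: apply fn to each element, drop None results."""
    return [r for r in (fn(x) for x in items) if r is not None]


# Illustrative data, in the spirit of the tutorial's "people" example.
people = [{"first": "James"}, {"first": "Jacob"}, {"first": "Jayden"}]

# people[*].first[0]: first[0] is evaluated for every element; strings cannot be
# indexed, so every result is null and the projection ends up empty.
inside_projection = project(people, lambda p: jp_index(p.get("first"), 0))

# people[*].first | [0]: the pipe stops the projection, so [0] applies to the
# whole array of first names.
after_pipe = jp_index(project(people, lambda p: p.get("first")), 0)

print(inside_projection)
print(after_pipe)
```

The first expression indexes inside each element, the second indexes the collected array, which is the behaviour the pipe gives you in JMESPath.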


    Now, for your specific use case, I would split it into three steps:

    1. Filter the array to exclude any instance that does not have a status with the code PowerState/deallocated. This can be achieved with a filter projection on the top-level array:
      [?instanceView.statuses[?code=='PowerState/deallocated']]
      
      Which gives:
      [
        {
          "instanceId": "3",
          "instanceView": {
            "statuses": [
              {
                "code": "ProvisioningState/succeeded",
                "time": "2023-10-24T15:53:59.6296842+00:00"
              },
              {
                "code": "PowerState/deallocated"
              }
            ]
          }
        },
        {
          "instanceId": "5",
          "instanceView": {
            "statuses": [
              {
                "code": "ProvisioningState/succeeded",
                "time": "2023-10-24T14:18:08.8438814+00:00"
              },
              {
                "code": "PowerState/deallocated"
              }
            ]
          }
        }
      ]
      
    2. Now we can sort the result. Since the field we are going to sort on comes from an array, this is the first case where we want to stop a projection in order to get the first element of the resulting array.
      So the field to sort on would be
      instanceView.statuses[?code=='ProvisioningState/succeeded'] | [0].time
      
      And the whole sort function would then be
      sort_by(
        [?instanceView.statuses[?code=='PowerState/deallocated']],
        &instanceView.statuses[?code=='ProvisioningState/succeeded'] | [0].time
      )
      
      Which gives:
      [
        {
          "instanceId": "5",
          "instanceView": {
            "statuses": [
              {
                "code": "ProvisioningState/succeeded",
                "time": "2023-10-24T14:18:08.8438814+00:00"
              },
              {
                "code": "PowerState/deallocated"
              }
            ]
          }
        },
        {
          "instanceId": "3",
          "instanceView": {
            "statuses": [
              {
                "code": "ProvisioningState/succeeded",
                "time": "2023-10-24T15:53:59.6296842+00:00"
              },
              {
                "code": "PowerState/deallocated"
              }
            ]
          }
        }
      ]
      
    3. sort_by returns the whole sorted array, so use a pipe once more to take its first element before getting the instanceId:
      sort_by(...) | [0].instanceId
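
As a cross-check, the three steps above can be mirrored in plain Python (standard library only). This is a sketch, not a replacement for the query: the function names are hypothetical, and make_instance only rebuilds the three-instance sample used in this answer. It assumes every deallocated instance has a ProvisioningState/succeeded status, and that all timestamps share the same format and UTC offset, so comparing them as strings gives chronological order.

```python
def make_instance(instance_id, time, power_state):
    """Build one entry shaped like the stripped az vmss list-instances output."""
    return {
        "instanceId": instance_id,
        "instanceView": {
            "statuses": [
                {"code": "ProvisioningState/succeeded", "time": time},
                {"code": "PowerState/" + power_state},
            ]
        },
    }


def has_status(instance, code):
    """True if any entry of instanceView.statuses carries the given code."""
    return any(s.get("code") == code for s in instance["instanceView"]["statuses"])


def provisioning_time(instance):
    """Sort key: time of the first ProvisioningState/succeeded status."""
    matches = [
        s for s in instance["instanceView"]["statuses"]
        if s.get("code") == "ProvisioningState/succeeded"
    ]
    return matches[0]["time"]


def oldest_deallocated_id(instances):
    # Step 1: keep only the instances that have a PowerState/deallocated status.
    deallocated = [i for i in instances if has_status(i, "PowerState/deallocated")]
    # Step 2: sort them by provisioning time (plain string comparison, see assumptions).
    ordered = sorted(deallocated, key=provisioning_time)
    # Step 3: take the first element and read its instanceId (None if nothing matched).
    return ordered[0]["instanceId"] if ordered else None


instances = [
    make_instance("3", "2023-10-24T15:53:59.6296842+00:00", "deallocated"),
    make_instance("5", "2023-10-24T14:18:08.8438814+00:00", "deallocated"),
    make_instance("13", "2023-10-24T15:53:59.6296842+00:00", "running"),
]

result = oldest_deallocated_id(instances)
print(result)
```

Each commented step corresponds to one numbered step of the JMESPath walkthrough, which makes it easier to see which part of the query a surprising result comes from.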
      

    So, your final query ends up being

    sort_by(
      [?instanceView.statuses[?code=='PowerState/deallocated']],
      &instanceView.statuses[?code=='ProvisioningState/succeeded'] | [0].time
    ) | [0].instanceId
    

    And your result, as expected, would be

    "5"