pythonpandasazure-table-storageazure-python-sdkazure-tableclient

Remove Azure Table Storage EntityProperty from TableClient.query_entities() Results Before Writing to Pandas DataFrame in Python


I'm fairly new to using Azure Table Storage and am trying to pull data from it with Python into a Pandas DataFrame. I am using TableClient .query_entities() from the azure-data-tables package, and it is returning data, but integer values are being returned, along with the Entity Property, looking like a tuple. All non-integer fields just return the value.

Is it possible to remove the Entity Property either before passing the results to Pandas to put in a DataFrame, or from the DataFrame column afterward? In the examples below, proprietary information has been modified.

import pandas as pd
from azure.core.credentials import AzureNamedKeyCredential
from azure.data.tables import TableServiceClient

def getATSdata(table_client, query_filter):
    for entity in table_client.query_entities(query_filter):
        yield entity

if __name__ == "__main__":
    AccountName = 'AzureAccountName'
    Key = 'AzureAccountKey'
    Url = 'https://AzureAccountName.table.core.windows.net/'
    credentials = AzureNamedKeyCredential(AccountName, Key)
    table = 'AzureTable'
    filtr = "NonIntegerField eq 'criteria'"
    ts = TableServiceClient(endpoint=Url, credential=credentials)
    tc = ts.get_table_client(table)
    df = pd.DataFrame(getATSdata(tc, filtr))

When looking at the DataFrame column with integer values, it shows both the value and the Entity Property (e.g., EdmType.INT64):

>>> df['IntegerField']
0        (20020263, EdmType.INT64)
1        (20021278, EdmType.INT64)
2        (20021278, EdmType.INT64)
3        (20021325, EdmType.INT64)
4        (20021366, EdmType.INT64)
                . . .
16202    (13255625, EdmType.INT64)
16203    (13255721, EdmType.INT64)
16204    (13255817, EdmType.INT64)
16205    (13255835, EdmType.INT64)
16206    (13255984, EdmType.INT64)
Name: IntegerField, Length: 16207, dtype: object

Non-integer columns don't show the Entity Property:

>>> df['NonIntegerField']
0        G 
1        G 
2        G1
3        G0
4        G0
         ..
16202    G 
16203    G 
16204    G 
16205    G 
16206    G2
Name: NonIntegerField, Length: 16207, dtype: object

Any help would be greatly appreciated! Thanks in advance!


Solution

  • Please try by adding the following header to your query:

    headers = {"Accept" : "application/json;odata=nometadata"}
    

    So your code would be something like:

    headers = {"Accept" : "application/json;odata=nometadata"}
    for entity in table_client.query_entities(query_filter, headers=headers):
    

    You can learn more about this here: https://learn.microsoft.com/en-us/rest/api/storageservices/payload-format-for-table-service-operations#json-format-applicationjson-versions-2013-08-15-and-later.