azure-data-factory azure-cosmosdb azure-cosmosdb-sqlapi

"Detect Datetime" setting ignored when using Managed Identity for connecting to CosmosDB Source in ADF Copy activity


In several Azure Data Factory pipelines, one of the activities is a Copy activity that uses Azure Cosmos DB for NoSQL as the source and Azure Data Lake Storage Gen2 with JSON format as the sink. The dates stored in CosmosDB use various datetime offsets, e.g. 2023-11-02T00:00:00-04:00, and to preserve the offset the copy activity source has Detect Datetime unchecked (as described here and here). This configuration works fine, and the JSON files produced by these activities do preserve the datetime offset. If this checkbox/property is left checked, all of the datetimes are converted to UTC. For example, 2023-11-02T00:00:00-04:00 would become 2023-11-02T04:00:00+00:00, which is the same instant, except that the original offset is lost. This is why every copy activity with CosmosDB as the source has this property unchecked. This shows up in code as "detectDatetime": false

"source": {
    "type": "CosmosDbSqlApiSource",
    "preferredRegions": [],
    "detectDatetime": false
},
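
To make the offset loss concrete, here is a minimal Python sketch of the same conversion. It does not call ADF or the Cosmos DB SDK; it only uses the standard library to show that the UTC value is the same instant as the original while the stored offset differs. The variable names are illustrative only.

```python
from datetime import datetime, timezone

# Value as stored in CosmosDB (example taken from the question).
original = "2023-11-02T00:00:00-04:00"

parsed = datetime.fromisoformat(original)
# Normalizing to UTC mirrors what happens when Detect Datetime is applied.
as_utc = parsed.astimezone(timezone.utc)

print(parsed.isoformat())   # 2023-11-02T00:00:00-04:00
print(as_utc.isoformat())   # 2023-11-02T04:00:00+00:00

# Same instant in time ...
print(parsed == as_utc)                          # True
# ... but the original -04:00 offset can no longer be recovered from the UTC value.
print(parsed.utcoffset() == as_utc.utcoffset())  # False
```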

The linked service for the CosmosDB connection uses a key that is fetched from a key vault. To reduce configuration and the hassle of key rotations, we are switching to system-assigned managed identity authentication. Everything works, except that the Detect Datetime setting is completely ignored and all of the dates in the JSON sink are "transformed" to UTC. Obviously the pipelines are much more complex than just one copy activity, but this behavior has been confirmed in isolation with two simple pipelines, each using just one copy activity with CosmosDB as the source and Data Lake Gen2 with JSON format as the sink. The only difference is that one uses key-based authentication for CosmosDB while the other uses managed identity authentication. Is there a property that isn't exposed in the UI that needs to be passed?
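
For anyone repeating the two-pipeline comparison, the following is a rough Python sketch of how the sink output could be checked for surviving offsets. It is an assumption-laden helper, not part of ADF: the function names, the regular expression, and the sample documents (including the `orderDate` field) are hypothetical, and it assumes timestamps appear in the JSON as ISO 8601 strings with an explicit offset or `Z`.

```python
import json
import re

# Matches ISO 8601 timestamps that end in "Z" or an explicit +hh:mm / -hh:mm offset.
ISO_WITH_OFFSET = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(Z|[+-]\d{2}:\d{2})$"
)

def collect_offsets(value, found=None):
    """Recursively collect the offset suffix of every timestamp string in a JSON value."""
    if found is None:
        found = []
    if isinstance(value, dict):
        for item in value.values():
            collect_offsets(item, found)
    elif isinstance(value, list):
        for item in value:
            collect_offsets(item, found)
    elif isinstance(value, str):
        match = ISO_WITH_OFFSET.match(value)
        if match:
            found.append(match.group(1))
    return found

def offsets_preserved(document):
    """Return True if at least one timestamp still carries a non-UTC offset."""
    utc_suffixes = ("Z", "+00:00", "-00:00")
    return any(offset not in utc_suffixes for offset in collect_offsets(document))

# Hypothetical sink documents: one from the key-based pipeline, one from the managed identity pipeline.
key_auth_output = json.loads('{"id": "1", "orderDate": "2023-11-02T00:00:00-04:00"}')
msi_output = json.loads('{"id": "1", "orderDate": "2023-11-02T04:00:00+00:00"}')

print(offsets_preserved(key_auth_output))  # True
print(offsets_preserved(msi_output))       # False
```

A result of False only shows that no non-UTC offset was found; source data that is genuinely all UTC would look the same, so the check is only meaningful when the source is known to contain other offsets.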


Solution

  • Azure support has confirmed this as an issue/limitation with the managed identity based connector. As of now (Aug 12th 2024) there is no workaround other than to avoid this connector and use the account key authenticated connector instead :( Their response:

    We have consulted with the product team regarding this behavior, and after reproducing it in our environment, they confirmed that this is a known issue when using System Managed Identity authentication. In this scenario, Azure Data Factory (ADF) defaults to SDK V3 for performing the copy activity, and this version of the SDK does not support Detect DateTime at the SDK level. The Account Key does not use SDK V3, which is why Detect DateTime is being honored in that case.

    We have shared your comments to the product team, and they will conduct further testing to identify and implement a fix. Currently, there is no estimated time of arrival (ETA) for this, and the product team has indicated that, for now, the only method to ensure the Detect DateTime is honored is by using the Account key selection method.