containers azure-data-lake-gen2

Azure Data Lake: avoid automatic creation of containers


I have an Azure Data Lake (Gen2 storage account) that I use for uploading CSV files, which I then load into a SQL database.

For the past few days I have noticed many new containers with GUID names, like this: [screenshot of the automatically created containers]

I want to understand which process is creating them and how to stop it from doing so.

I have looked into some of these containers and always find a single CSV file with a GUID name. Each file has four columns (Name, Category, Status, Error) and one row: Name is the name of an uploaded CSV file, Category is file, and Status is Deleted.

As far as I can tell, this is some kind of log file that records each file I delete from the lake, and a new container is created for every such log.

This seems like a very strange feature to me, and I would like to turn it off. Does anyone know why this is happening and how to prevent it?


Solution

  • First, check which services are using your storage account. Use the PowerShell script from this blog to get the list of resources that use the storage account. Then go through those resources, check whether any of them writes files to the storage account, and turn that behavior off where appropriate.

    You can also check the StorageFileLogs table for the storage account. It contains details such as CallerIpAddress, _ResourceId (the resource ID the data came from), ServiceType (the type of that resource), and _SubscriptionId (the subscription ID of that resource).

    To query it, go to the storage account -> Monitor -> Logs -> StorageFileLogs.


    In my case the query returns no results, because of an immutability restriction on my storage account.

    You can check this SO answer to learn more about this log table.
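
While you track down the responsible service, it can help to list exactly which containers are affected and what their log files say. The following is only a minimal sketch in Python, assuming the container names and the CSV text have already been fetched by some other means. The helper names `is_guid_name` and `parse_deletion_log` and the sample values are hypothetical; only the four column names come from the question.

```python
import csv
import io
import uuid


def is_guid_name(name):
    """Return True if `name` is a GUID in hyphenated 8-4-4-4-12 form."""
    try:
        # uuid.UUID also accepts other spellings (no hyphens, braces),
        # so compare against the canonical string to be strict.
        return str(uuid.UUID(name)) == name.lower()
    except ValueError:
        return False


def parse_deletion_log(csv_text):
    """Return the rows of a log file whose Status column is 'Deleted'.

    Assumes the layout described in the question: a header row with
    Name, Category, Status, Error followed by data rows.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row.get("Status") == "Deleted"]


# Hypothetical sample data, for illustration only.
container_names = [
    "uploads",
    "3f2504e0-4f89-11d3-9a0c-0305e82c3301",
]
sample_log = "Name,Category,Status,Error\nupload_2024.csv,File,Deleted,\n"

suspect_containers = [n for n in container_names if is_guid_name(n)]
deleted_rows = parse_deletion_log(sample_log)

print(suspect_containers)
print([row["Name"] for row in deleted_rows])
```

In a real account, the container names could come from `BlobServiceClient.list_containers()` in the `azure-storage-blob` package. Comparing the creation times of the GUID-named containers with the times your files were deleted may help narrow down which pipeline or service is writing them.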