Tags: node.js, azure, azure-functions, azure-blob-storage, azure-eventgrid

Prevent Azure function from recursive execution


I have created an Azure Function that is triggered by an Event Grid subscription event, which is fired when a new blob is uploaded to Blob Storage.

I'm using this function to resize the uploaded image, and it uploads the resized image into the same container.

The problem is that the upload performed inside the function will itself trigger a new event, so the function executes again. The original image and the resized image have to be stored in the same container, so I can't prevent this just by filtering the Event Subscription on the container name.

The Event Grid Subscription has a section for advanced filtering, where events can be filtered based on the data object in the payload. That would be a good approach, but I can't find a way to put any custom field into the data object when uploading the resized image from the Azure Function, so I can't filter events on that.


Function implementation:

const stream = require('stream');
const Jimp = require('jimp');

const {
  BlobServiceClient,
  StorageSharedKeyCredential,
} = require("@azure/storage-blob");

const ONE_MEGABYTE = 1024 * 1024;
const uploadOptions = { bufferSize: 4 * ONE_MEGABYTE, maxBuffers: 20 };

// const containerName = process.env.BLOB_CONTAINER_NAME;
const accountName = process.env.AZURE_STORAGE_ACCOUNT_NAME;
const accessKey = process.env.AZURE_STORAGE_ACCOUNT_ACCESS_KEY;

const sharedKeyCredential = new StorageSharedKeyCredential(
  accountName,
  accessKey);
const blobServiceClient = new BlobServiceClient(
  `https://${accountName}.blob.core.windows.net`,
  sharedKeyCredential
);

module.exports = async (context, event, inputBlob) => {  
  context.log(`Function started`);
  context.log(`event: ${JSON.stringify(event)}`);

  const widthInPixels = 100;

  // Derive the container, file name, extension and base name from the
  // blob URL carried in the event payload.
  const blobUrl = context.bindingData.data.url;
  const blobUrlArray = blobUrl.split("/");
  const blobFileName = blobUrlArray[blobUrlArray.length - 1];
  const blobExt = blobFileName.slice(blobFileName.lastIndexOf(".") + 1);
  const blobName = blobFileName.slice(0, blobFileName.lastIndexOf("."));
  const blobName_thumb = `${blobName}_t.${blobExt}`;

  const containerName = blobUrlArray[blobUrlArray.length - 2];

  // Resize to a fixed-width thumbnail and wrap the resulting buffer
  // in a stream for the upload API.
  const image = await Jimp.read(inputBlob);
  const thumbnail = image.resize(widthInPixels, Jimp.AUTO);
  const thumbnailBuffer = await thumbnail.getBufferAsync(Jimp.AUTO);
  const readStream = new stream.PassThrough();
  readStream.end(thumbnailBuffer);

  const containerClient = blobServiceClient.getContainerClient(containerName);
  const blockBlobClient = containerClient.getBlockBlobClient(blobName_thumb);
  context.log(`blockBlobClient created`);
  try {
      await blockBlobClient.uploadStream(readStream,
            uploadOptions.bufferSize,
            uploadOptions.maxBuffers,
            // content type is hard-coded, so this only matches JPEG uploads
            { blobHTTPHeaders: { blobContentType: "image/jpeg" } });
      context.log(`File uploaded`);
  } catch (err) {
    context.log(err.message);
  }
  // Note: no context.done() here. An async function signals completion
  // by returning; calling context.done() as well raises a runtime warning.
};

Bindings:

{
  "bindings": [
    {
      "name": "event",
      "direction": "in",
      "type": "eventGridTrigger"
    },
    {
      "name": "inputBlob",
      "direction": "in",
      "type": "blob",
      "path": "{data.url}",
      "connection": "AzureWebJobsStorage",
      "dataType": "binary"
    }
  ]
}

This seems like a very common issue when triggering Azure Functions from Blob Storage events, and it's really strange that I can't find a proper description of it anywhere in the extensive Azure documentation!


Solution

  • The Event Grid Subscription has a section for advanced filtering, where events can be filtered based on the data object in the payload. That would be a good approach, but I can't find a way to put any custom field into the data object when uploading the resized image from the Azure Function, so I can't filter events on that.

    Correct, you can't put anything in the data object, as you have no control over the creation of the event. That is because the event is generated by the Azure platform itself. You can only create filters based on the content of the event, which is (according to the docs):

    [{
      "topic": "/subscriptions/{subscription-id}/resourceGroups/Storage/providers/Microsoft.Storage/storageAccounts/my-storage-account",
      "subject": "/blobServices/default/containers/test-container/blobs/new-file.txt",
      "eventType": "Microsoft.Storage.BlobCreated",
      "eventTime": "2017-06-26T18:41:00.9584103Z",
      "id": "831e1650-001e-001b-66ab-eeb76e069631",
      "data": {
        "api": "PutBlockList",
        "clientRequestId": "6d79dbfb-0e37-4fc4-981f-442c9ca65760",
        "requestId": "831e1650-001e-001b-66ab-eeb76e000000",
        "eTag": "\"0x8D4BCC2E4835CD0\"",
        "contentType": "text/plain",
        "contentLength": 524288,
        "blobType": "BlockBlob",
        "url": "https://my-storage-account.blob.core.windows.net/testcontainer/new-file.txt",
        "sequencer": "00000000000004420000000000028963",
        "storageDiagnostics": {
          "batchId": "b68529f3-68cd-4744-baa4-3c0498ec19f0"
        }
      },
      "dataVersion": "",
      "metadataVersion": "1"
    }]
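
    For illustration, this is roughly what a content-based filter looks like in an event subscription definition. Only fields that actually appear in the payload above (such as subject) can be targeted; the container name images here is just a placeholder:

    {
      "filter": {
        "includedEventTypes": [ "Microsoft.Storage.BlobCreated" ],
        "advancedFilters": [
          {
            "operatorType": "StringBeginsWith",
            "key": "subject",
            "values": [ "/blobServices/default/containers/images/" ]
          }
        ]
      }
    }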
    

    That leaves you with two choices: either use a staging container into which files are uploaded, and have the function resize them and write them to the final container, or create a mechanism that prevents newly created thumbnails from being processed.
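
    The staging-container variant needs only the upload target changed. A minimal sketch, assuming the hypothetical container names images-staging (the container the event subscription is scoped to) and images-final:

    // The event subscription is scoped to the staging container only (e.g. a
    // subjectBeginsWith filter on "/blobServices/default/containers/images-staging/"),
    // so writing the result to the final container never re-triggers the function.
    const outputContainerClient = blobServiceClient.getContainerClient("images-final"); // hypothetical name
    const outputBlobClient = outputContainerClient.getBlockBlobClient(blobName_thumb);

    await outputBlobClient.uploadStream(readStream,
        uploadOptions.bufferSize,
        uploadOptions.maxBuffers,
        { blobHTTPHeaders: { blobContentType: "image/jpeg" } });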

    For the second option, your function could inspect the name of the blob and, if it is a thumbnail, just end processing, as in the sketch below.
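
    A minimal sketch of such a guard, placed at the top of the handler (the "_t" suffix convention is taken from blobName_thumb in your code):

    // Guard clause: if the blob that raised this event is itself a
    // thumbnail (identified by the "_t" suffix), stop here.
    const fileName = context.bindingData.data.url.split("/").pop();
    const baseName = fileName.slice(0, fileName.lastIndexOf("."));
    if (baseName.endsWith("_t")) {
        context.log(`Skipping generated thumbnail: ${fileName}`);
        return;
    }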

    Or, my preferred solution: include metadata on the thumbnail blobs when you create and store them. Then, also in the function, check for the existence of that metadata upon execution, and run the resize logic only when no metadata is set.
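
    A minimal sketch of that approach with the same @azure/storage-blob v12 client you already use; the metadata key resizedimage is an illustrative name, nothing special to Azure:

    // When storing the thumbnail, stamp it with custom metadata...
    await blockBlobClient.uploadStream(readStream,
        uploadOptions.bufferSize,
        uploadOptions.maxBuffers,
        {
            blobHTTPHeaders: { blobContentType: "image/jpeg" },
            metadata: { resizedimage: "true" }  // illustrative key name
        });

    // ...and at the top of the function, before resizing, check whether
    // the blob that raised the event already carries that marker.
    const triggeredBlob = blobServiceClient
        .getContainerClient(containerName)
        .getBlockBlobClient(blobFileName);
    const { metadata } = await triggeredBlob.getProperties();
    if (metadata && metadata.resizedimage === "true") {
        context.log("Blob is a generated thumbnail, skipping.");
        return;
    }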