Tags: node.js, azure, bitmap, image, face-api

Pass Image Bitmap to Azure Face SDK detectWithStream()


I am trying to write a React app that grabs a frame from the webcam and passes it to the Azure Face SDK (documentation) to detect faces in the image and get attributes of those faces - in this case, emotions and head pose.

I have gotten a modified version of the quickstart example code here working, which makes a call to the detectWithUrl() method. However, the image I have in my code is an ImageBitmap, so I thought I would try calling detectWithStream() instead. The documentation for that method says it needs to be passed something of type msRest.HttpRequestBody - I found some documentation for this type, which suggests it should be a Blob, string, ArrayBuffer, or ArrayBufferView. The problem is that I don't really understand what those are or how I might get from a bitmap image to an HttpRequestBody of one of those types. I have worked with HTTP requests before, but I don't quite understand why a request body is being passed to this method, or how to construct one.
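
For reference, here is my rough understanding of what those types look like in the browser (just a sketch of the types themselves, not code from my app; the URL is a placeholder and this would run inside an async function):

// Any of these appear to satisfy msRest.HttpRequestBody:
const blob = await fetch("placeholder.jpg").then((res) => res.blob()); // Blob: binary data with a MIME type
const buffer = await blob.arrayBuffer();                               // ArrayBuffer: fixed-length raw bytes
const view = new Uint8Array(buffer);                                   // ArrayBufferView: a typed view over an ArrayBuffer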

I have found some similar examples and answers to what I am trying to do, like this one. Unfortunately they are all either in a different language, or they are making calls to the Face API instead of using the SDK.

Edit: I had forgotten to bind the detectFaces() method before, and so I was originally getting a different error related to that. Now that I have fixed that problem, I'm getting the following error: Uncaught (in promise) Error: image must be a string, Blob, ArrayBuffer, ArrayBufferView, or a function returning NodeJS.ReadableStream

Inside constructor():

this.detectFaces = this.detectFaces.bind(this); 

const msRest = require("@azure/ms-rest-js");
const Face = require("@azure/cognitiveservices-face");
const key = <key>;
const endpoint = <endpoint>;
const credentials = new msRest.ApiKeyCredentials({ inHeader: { 'Ocp-Apim-Subscription-Key': key } });
const client = new Face.FaceClient(credentials, endpoint);

this.state = {
  client: client
}

// get video
const constraints = {
  video: true
}
navigator.mediaDevices.getUserMedia(constraints).then((stream) => {
  let videoTrack = stream.getVideoTracks()[0];
  const imageCapture = new ImageCapture(videoTrack);
  // grab a single frame from the webcam as an ImageBitmap
  imageCapture.grabFrame().then((imageBitmap) => {
    // detect faces (arrow function keeps `this` bound to the component)
    this.detectFaces(imageBitmap);
  });
});

The detectFaces() method:

async detectFaces(imageBitmap) {
  const detectedFaces = await this.state.client.face.detectWithStream(
    imageBitmap,
    {
      returnFaceAttributes: ["Emotion", "HeadPose"],
      detectionModel: "detection_01"
    }
  );
  console.log(detectedFaces.length + " face(s) detected");
}

Can anyone help me understand what to pass to the detectWithStream() method, or maybe help me understand which method would be better to use instead to detect faces from a webcam image?


Solution

  • I figured it out, thanks to this page under the header "Image to blob"! Here is the code that I added before making the call to detectFaces():

        // convert the image frame into a Blob via a canvas
        let canvas = document.createElement('canvas');
        canvas.width = imageBitmap.width;
        canvas.height = imageBitmap.height;
        let context = canvas.getContext('2d');
        context.drawImage(imageBitmap, 0, 0);
        canvas.toBlob((blob) => {
          // detect faces
          this.detectFaces(blob);
        });

    This code converts the bitmap image to a Blob, then passes the Blob to detectFaces(). I also changed detectFaces() to accept blob instead of imageBitmap, like this, and then everything worked:

      async detectFaces(blob) {
        const detectedFaces = await this.state.client.face.detectWithStream(
          blob,
          {
            returnFaceAttributes: ["Emotion", "HeadPose"],
            detectionModel: "detection_01"
          }
        );
        ...
      }
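
    Since the goal was to get emotions and head pose, here is a quick sketch of how those attributes can then be read off each detected face (based on the SDK's DetectedFace model; error handling omitted):

      detectedFaces.forEach((face) => {
        // face.faceAttributes.emotion holds scores such as happiness, sadness, anger, ...
        console.log(face.faceAttributes.emotion);
        // face.faceAttributes.headPose holds { pitch, roll, yaw } angles in degrees
        console.log(face.faceAttributes.headPose);
      });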