I am trying to write a React app that grabs a frame from the webcam and passes it to the Azure Face SDK (documentation) to detect faces in the image and get attributes of those faces - in this case, emotions and head pose.
I have gotten a modified version of the quickstart example code here working, which makes a call to the detectWithUrl() method. However, the image I have in my code is a bitmap, so I thought I would try calling detectWithStream() instead. The documentation for this method says it needs to be passed something of type msRest.HttpRequestBody - I found some documentation for this type, which looks like it wants to be a Blob, string, ArrayBuffer, or ArrayBufferView. The problem is, I don't really understand what those are or how I might get from a bitmap image to an HttpRequestBody of one of those types. I have worked with HTTP requests before, but I don't quite understand why one is being passed to this method here, or how to construct it.
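From poking around, my rough understanding is that a Blob is a file-like chunk of raw binary data, and an ArrayBuffer/ArrayBufferView is the same bytes held in memory, so something like the sketch below (where imageBlob is a hypothetical Blob of encoded image bytes) seems to be the kind of thing the type wants - I just don't see how to get there from an ImageBitmap:
// imageBlob: a hypothetical Blob containing the encoded image bytes
imageBlob.arrayBuffer().then((buffer) => {
  const view = new Uint8Array(buffer); // an ArrayBufferView over the same bytes
  // presumably any of imageBlob, buffer, or view would satisfy msRest.HttpRequestBody
});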
I have found some similar examples and answers to what I am trying to do, like this one. Unfortunately they are all either in a different language, or they are making calls to the Face API instead of using the SDK.
Edit: I had forgotten to bind the detectFaces() method before, and so I was originally getting a different error related to that. Now that I have fixed that problem, I'm getting the following error:
Uncaught (in promise) Error: image must be a string, Blob, ArrayBuffer, ArrayBufferView, or a function returning NodeJS.ReadableStream
Inside constructor():
this.detectFaces = this.detectFaces.bind(this);

const msRest = require("@azure/ms-rest-js");
const Face = require("@azure/cognitiveservices-face");

const key = <key>;
const endpoint = <endpoint>;

// authenticate against the Face endpoint and keep the client in component state
const credentials = new msRest.ApiKeyCredentials({ inHeader: { 'Ocp-Apim-Subscription-Key': key } });
const client = new Face.FaceClient(credentials, endpoint);
this.state = {
  client: client
};

// get video
const constraints = {
  video: true
};
navigator.mediaDevices.getUserMedia(constraints).then((stream) => {
  let videoTrack = stream.getVideoTracks()[0];
  const imageCapture = new ImageCapture(videoTrack);
  // arrow function so `this` still refers to the component inside the callback
  imageCapture.grabFrame().then((imageBitmap) => {
    // detect faces
    this.detectFaces(imageBitmap);
  });
});
The detectFaces() method:
async detectFaces(imageBitmap) {
  const detectedFaces = await this.state.client.face.detectWithStream(
    imageBitmap,
    {
      returnFaceAttributes: ["Emotion", "HeadPose"],
      detectionModel: "detection_01"
    }
  );
  console.log(detectedFaces.length + " face(s) detected");
}
Can anyone help me understand what to pass to the detectWithStream() method, or suggest a better method to use to detect faces from a webcam image?
I figured it out, thanks to this page under the header "Image to blob"! Here is the code that I added before making the call to detectFaces():
// convert image frame into blob
let canvas = document.createElement('canvas');
canvas.width = imageBitmap.width;
canvas.height = imageBitmap.height;
let context = canvas.getContext('2d');
context.drawImage(imageBitmap, 0, 0);
canvas.toBlob((blob) => {
  // detect faces
  this.detectFaces(blob);
});
This code converts the bitmap image to a Blob, then passes the Blob to detectFaces(). I also changed detectFaces() to accept blob instead of imageBitmap, like this, and then everything worked:
async detectFaces(blob) {
  const detectedFaces = await this.state.client.face.detectWithStream(
    blob,
    {
      returnFaceAttributes: ["Emotion", "HeadPose"],
      detectionModel: "detection_01"
    }
  );
  ...
}
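As a side note: since the goal is just to get a Blob, it looks like the ImageCapture API can also produce one directly via takePhoto(), which would skip the canvas step. This is only a sketch of that alternative (I have not verified it against the Face SDK, and takePhoto() uses the camera's photo mode rather than grabbing a video frame):
const imageCapture = new ImageCapture(videoTrack);
imageCapture.takePhoto().then((blob) => {
  // takePhoto() resolves with a Blob, which is one of the types detectWithStream() accepts
  this.detectFaces(blob);
});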