I have an application in Electron that does facial recognition of people to then decide whether or not they can enter the place and for that I'm using Amazon Rekognition.
Everything was working fine (for a few months) until, two days ago, a customer reported to me that the app was behaving strangely, like it wasn't responding to requests for facial recognition.
After several tests, I discovered that what is happening with it is a timeout error, which occurs in all API calls, whether they are looking for faces (SearchFacesByImage) or registering new faces (IndexFaces).
The error says:
{
"message": "connect ETIMEDOUT 3.226.60.54:443",
"errno": -4039,
"code": "TimeoutError",
"syscall": "connect",
"address": "3.226.60.54",
"port": 443,
"time": "2022-12-14T13:50:10.909Z",
"region": "us-east-1",
"hostname": "rekognition.us-east-1.amazonaws.com",
"retryable": true
}
What intrigued me was the fact that everything was working fine, until this behavior just started happening (and I didn't make any code changes/updates to the app running on my client's computer).
And what makes me even more intrigued is that this behavior occurs completely randomly and only on the machine of that client in question. Sometimes the API calls work correctly (returning whether the person was recognized or not), but most of the time, the calls take about 90 seconds to return the timeout error. When executing the same code on my machine (same methods and same CollectionId) everything runs normally and there was no timeout error at any time - while at the exact same moment on my client's machine the behavior continues.
I was using aws-sdk and then switched to @aws-sdk/client-rekognition (thinking that could solve the problem) but the code only worked on a few of the first calls to the API and a few minutes later it got the timeout errors again.
The code I'm using to configure and make calls to Rekognition is basically this:
const { RekognitionClient, IndexFacesCommand, SearchFacesByImageCommand } = require('@aws-sdk/client-rekognition')
const rekognitionClient = new RekognitionClient({
credentials: {
accessKeyId: 'accessKeyId',
secretAccessKey: 'secretAccessKey'
},
region: 'us-east-1'
})
const registerFaceOnRekognition = async (bytes, userId) => {
const params = {
CollectionId: 'collectionId',
Image: { Bytes: bytes },
ExternalImageId: userId,
MaxFaces: 1,
QualityFilter: 'HIGH'
}
const command = new IndexFacesCommand(params)
try {
const { FaceRecords } = await rekognitionClient.send(command)
if (!FaceRecords.length) {
console.log('No faces detected.')
return
}
console.log('Face created:')
console.log(FaceRecords[0].Face.FaceId)
} catch (error) {
console.error(error) // timeout error
}
}
const searchFaceByImageOnRekognition = async (bytes) => {
const params = {
CollectionId: 'collectionId',
Image: { Bytes: bytes },
MaxFaces: 1,
FaceMatchThreshold: 99,
QualityFilter: 'HIGH'
}
const command = new SearchFacesByImageCommand(params)
try {
const { FaceMatches } = await rekognitionClient.send(command)
if (!FaceMatches.length) {
console.log('This face has not been registered yet')
return
}
console.log('Face found:')
console.log(FaceMatches[0].Face.ExternalImageId)
} catch (error) {
console.error(error) // timeout error
}
}
// Method called through the renderer process that has a canvas where the webcam view is reproduced
const onTakePicture = (event, data) => {
const bytes = Buffer.from(data.dataURL.replace('data:image/jpeg;base64,', ''), 'base64')
// If there is a userId, register the face in the image
if (data.userId) {
registerFaceOnRekognition(bytes, data.userId)
return
}
// Else, search for the face in the image
searchFaceByImageOnRekognition(bytes)
}
Just remembering that: during all tests on my client's computer the internet connection was stable and working properly.
What is the best way to investigate and resolve this issue?
UPDATE:
I enabled Rekognition debug logs and they can be found at: https://gist.github.com/IgorSamer/4e58e09f3fa615401f85ca325b794245
In it, the first three requests (2022-12-16T13:48:45.932Z
, 2022-12-16T13:53:20.325Z
and 2022-12-16T14:19:12.479Z
) occur normally. However, all other consecutive requests start to give the timeout error, where, in fact, no data is returned after the [DEBUG] App: endpoints Resolved endpoint:
step.
As previously mentioned the internet connection is working fine. I could also managing to reproduce the error via remote access, that is, the machine internet was ok at the time of error.
Is there a possibility that there is a block made by my client's firewall/network that prevents requests from being sent by the SDK after a few successful requests? If yes, what is the best way to investigate this?
This is what I would do initially to gather some info:
I hope this may be of help to at least create a troubleshooting strategy