I need to process 6 images at once, 10 times per second, and I use YOLOv5 for this. I'm new to this topic and a bit confused about batch sizes for inference. As far as I understand it, a higher batch size lets you process multiple images at once, but with less precision in the result. So is it more common for this problem to run YOLOv5 with a higher batch size, or to run YOLO several times with a lower batch size (in this case I would run the different instances in parallel)? Or am I completely wrong?
Right now I'm using the pretrained YOLOv5n .pt weights and converting them to an .engine file. The program is targeted to run on a Jetson AGX Xavier.
You might be confusing training and inference here. When training a model, batch size is one of the hyperparameters that can affect how well your model converges.
During inference, in virtually all cases, batch size only dictates how many images the model processes in parallel in a single forward pass. I.e. the output for each image will be the same regardless of batch size.
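To make the point about inference concrete, here is a minimal sketch using a toy NumPy "model" (a single linear layer plus ReLU with fixed weights). It is only a hypothetical stand-in for a real detector, not YOLOv5 itself; the names `toy_model`, `W` and `b` and the image shapes are made up for illustration. It runs the same six inputs once as a batch of 6 and once as six batches of 1, then compares the results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained network: fixed weights, no training step.
W = rng.standard_normal((3 * 8 * 8, 4))
b = rng.standard_normal(4)

def toy_model(batch):
    """Run the toy network on an array of shape (N, 3, 8, 8)."""
    flat = batch.reshape(batch.shape[0], -1)  # flatten each image separately
    return np.maximum(flat @ W + b, 0.0)      # linear layer followed by ReLU

# Six fake "camera frames" (random data, just for the comparison).
images = rng.standard_normal((6, 3, 8, 8))

# One call with batch size 6.
batched = toy_model(images)

# Six calls with batch size 1, stacked back together.
one_by_one = np.concatenate([toy_model(img[None]) for img in images])

# Each image's output matches up to floating-point rounding.
same_results = np.allclose(batched, one_by_one)
print(same_results)
```

Each row of the output depends only on its own input image, which is why the batch size changes throughput and memory use but not the per-image result.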
With your system, running a model as lightweight as YOLOv5n at 60 frames per second (6 images, 10 times per second) shouldn't pose any problems, so I wouldn't worry about performance and would just set the batch size to whatever works.
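For reference, the throughput figure comes straight from the numbers in the question. A small sketch of the arithmetic (the variable names are just illustrative, and the per-image figure assumes the six images are processed one after another rather than as a batch):

```python
# Numbers taken from the question: 6 images per cycle, 10 cycles per second.
images_per_cycle = 6
cycles_per_second = 10

# Total images the model has to handle each second.
images_per_second = images_per_cycle * cycles_per_second

# Time available to finish one cycle of 6 images.
budget_ms_per_cycle = 1000 / cycles_per_second

# Time available per image if the 6 images are run sequentially with batch size 1.
budget_ms_per_image = budget_ms_per_cycle / images_per_cycle

print(images_per_second, budget_ms_per_cycle, round(budget_ms_per_image, 2))
```

So a single batch of 6 has to finish within about 100 ms, or each image within roughly 16.7 ms if you run them one at a time; measuring both on the target device is the simplest way to pick a batch size.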