I don't know if my recent experience will help you, but here it is. I am running BI 5, AI Tool, and DeepstackAI on Windows 10 with an Intel i5-4670, 3.4 GHz, 16 GB RAM. I am running 11 cameras (6 at 0.9 MP, 3 at 2 MP, 2 at 5 MP), all at 10 FPS with substreams. There are also several cloned cameras, but I don't think those affect resources too much. I'm using Intel hardware acceleration (I think Intel + VPP on most cameras).
Blue Iris reports a total of about 168 MP/s and 4100 Kb/s.
Before enabling substreams I was seeing about 2 seconds per image in Deepstack and about 30% CPU. After enabling substreams my CPU usage dropped closer to 20%. Deepstack also got a little faster, I assume because Blue Iris was using less CPU. It was running closer to 1 second per image, sometimes a little more, sometimes a little less. I tried the high/medium/low options when starting Deepstack but never noticed much of a difference in image processing time at any setting.
I decided to try using Docker Desktop for Windows to run Deepstack. The most striking change is that under Docker there is a huge difference between starting Deepstack with the mode set to high, medium, or low. When running at high it was taking about 2 seconds per image. When running at medium it was close to 1 second per image. Running at low I am consistently getting 700 ms to 900 ms image processing times. I don't know if the Windows GUI startup for Deepstack is ignoring the high/medium/low setting or if something else is going on, but the fastest response on my system is definitely running under Docker with Mode=Low.
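For anyone who wants to compare the modes on their own machine, here is a rough Python sketch of how the container could be started and how a single detection request could be timed. It is not what AI Tool does internally. The image name `deepquestai/deepstack`, the `VISION-DETECTION` and `MODE` environment variables, container port 5000, and the `/v1/vision/detection` endpoint are taken from the Deepstack documentation as I remember it and may differ for your version. The function names, the host port, and the image path are made up for illustration.

```python
import json
import time
import urllib.request
import uuid


def build_docker_command(mode="Low", host_port=80):
    """Return the argument list for starting Deepstack with object detection.

    Assumption: image name, environment variables, and container port 5000
    follow the Deepstack docs; check them against your installed version.
    """
    return [
        "docker", "run", "--rm",
        "-e", "VISION-DETECTION=True",
        "-e", f"MODE={mode}",          # High, Medium, or Low
        "-p", f"{host_port}:5000",
        "deepquestai/deepstack",
    ]


def encode_multipart(field_name, filename, data):
    """Build a minimal multipart/form-data body holding one file field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def time_detection(image_path, base_url="http://localhost:80"):
    """POST one JPEG to the detection endpoint; return (elapsed ms, JSON reply)."""
    with open(image_path, "rb") as f:
        data = f.read()
    body, content_type = encode_multipart("image", "image.jpg", data)
    request = urllib.request.Request(
        f"{base_url}/v1/vision/detection",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    started = time.perf_counter()
    with urllib.request.urlopen(request, timeout=30) as response:
        result = json.load(response)
    elapsed_ms = (time.perf_counter() - started) * 1000
    return elapsed_ms, result
```

To use it, pass the list from `build_docker_command("Low")` to `subprocess.run`, wait for the container to come up, and then call `time_detection` a few times on one of the JPEGs that BI writes. Repeating that for High and Medium should show whether your system has the same spread I saw.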
To avoid a backlog in Deepstack I had limited how often BI writes JPEGs, so originally I only wrote one image every 2 seconds. I am masking all but a small area near my front door, so I would sometimes miss alerts if the person moved in and out of the unmasked area quickly enough. After switching to substreams I was able to set BI to write a JPEG every 1.3 seconds. After switching to Deepstack in Docker with Mode=Low I am now able to write a JPEG every second without a backlog and could probably bring that down to 0.9 seconds without any problem.
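The backlog reasoning is simple queueing: if images arrive faster than Deepstack can finish them, the queue keeps growing. Here is a small Python sketch of that idea. It assumes a single worker that handles images strictly one at a time with a constant processing time, which is a simplification of what really happens. The function name `max_backlog` is made up for illustration.

```python
def max_backlog(n_images, write_interval_s, process_time_s):
    """Simulate one worker handling images written at a fixed interval.

    Returns the largest number of earlier images that were still unfinished
    (queued or in progress) at the moment a new image was written.
    """
    finish_times = []      # completion time of each image, in arrival order
    worker_free_at = 0.0   # time at which the worker can start a new image
    worst = 0
    for i in range(n_images):
        arrival = i * write_interval_s
        start = max(arrival, worker_free_at)
        worker_free_at = start + process_time_s
        # Count earlier images that have not finished when this one arrives.
        unfinished = sum(1 for f in finish_times if f > arrival)
        worst = max(worst, unfinished)
        finish_times.append(worker_free_at)
    return worst
```

With one JPEG per second and 0.9 s per image, `max_backlog(60, 1.0, 0.9)` stays at 0. With 2 s per image at the same write interval, the count keeps climbing for as long as images keep arriving, which is why I had to slow BI down before I moved to Docker with Mode=Low.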
Currently with Deepstack in Docker I'm seeing about 23% total CPU with occasional spikes as high as 40%. Blue Iris is using about 15% of that. The attached image shows the CPU spike when I triggered my camera to write 4 JPEG images two times in a row. CPU jumps to about 80% to 90% total while processing the images.
Also, I tried changing the JPEG size from the 5 MP main stream image to the VGA substream image and didn't notice a big difference in Deepstack's response time, so I put it back to the larger size.