Ryzen 3950x for Blue Iris

bp2008

I just upgraded my main workstation/gaming PC to an AMD Ryzen 3950x (16 cores!), and decided to test Blue Iris on it.

Hardware Setup


(please note: This configuration was chosen for performance, not energy-efficiency or cost-effectiveness)

[Attached image: 1575149006836.png]

Performance Data

Baseline

For testing, I used 26 cameras ranging from 1 to 8 megapixels, all configured to encode H.264. One of the 2MP cameras has a weather overlay provided by Blue Iris Tools, so I turned off Direct-to-disc recording on that one. The other 25 cameras are using Direct-to-disc. The "Limit decoding" feature is not being used, and for GUI Open tests, the live preview frame rate is uncapped (although my highest camera frame rate is 15 FPS in these early tests).

I used BiUpdateHelper to measure performance data.

Intel i7-8700K Results

I will start by measuring CPU usage on my i7-8700K Blue Iris server, for comparison purposes.

| Megapixels Per Second | Notes | Power Consumption (Watts) | Blue Iris CPU Usage % | Overall CPU Usage % | Nvidia RTX 2080 Ti Video Decode Usage % |
| --- | --- | --- | --- | --- | --- |
| 1189 | Intel Decode, NO GUI | Not Measured | 28 | 32 | N/A |
| 1189 | Intel Decode, GUI @ 4K | Not Measured | 57 | 61 | N/A |

I did not measure power consumption on the i7-8700K this time around, but based on previous measurements, consumption was likely in the 110-150 watt range, since that system has a low-end GPU, two spinning hard drives, and an SSD.

AMD 3950x Results

Running exactly the same configuration as the Intel system above, the following are measurements from the AMD Ryzen 3950x system. I ran it with software decoding first, then with Nvidia hardware decoding for comparison purposes.

| Megapixels Per Second | Notes | Power Consumption (Watts) | Blue Iris CPU Usage % | Overall CPU Usage % | Nvidia RTX 2080 Ti Video Decode Usage % |
| --- | --- | --- | --- | --- | --- |
| 0 | IDLE. Blue Iris Not Installed | 110-140 | 0 | 0 | N/A |
| 1189 | Software Decode, NO GUI | 171 | 13 | 14 | disabled |
| 1189 | Software Decode, GUI @ 4K | 198 | 18 | 19 | disabled |
| 1189 | NVDEC, NO GUI | 245 | 6 | 7 | 62 |
| 1189 | NVDEC, GUI @ 4K | 259 | 11 | 12 | 62 |

This load hardly strains the CPU. As expected, enabling Nvidia hardware acceleration reduced CPU usage but raised power consumption.


Max it Out

Next, I began trying to find the limit of this system.

Nvidia NVDEC Enabled

I kept Nvidia hardware acceleration enabled, and maxed-out my camera frame rates a few cameras at a time in order to gradually increase the total Megapixels Per Second load.

When I reached 95% Nvidia Decode usage at 1919 MP/s, I considered the Nvidia card maxed out, and measured again with the GUI open.

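| Megapixels Per Second | Notes | Power Consumption (Watts) | Blue Iris CPU Usage % | Overall CPU Usage % | Nvidia RTX 2080 Ti Video Decode Usage % |
| --- | --- | --- | --- | --- | --- |
| 1189 | NVDEC, NO GUI | 245 | 6 | 7 | 62 |
| 1314 | NVDEC, NO GUI | 248 | 6 | 7 | 67 |
| 1464 | NVDEC, NO GUI | 252 | 7 | 8 | 74 |
| 1702 | NVDEC, NO GUI | 256 | 8 | 9 | 86 |
| 1919 | NVDEC, NO GUI | 260 | 9 | 10 | 95 |
| 1919 | NVDEC, GUI @ 4K | 279 | 15 | 17 | 95 |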

Full disclosure: 3 of the cameras in my configuration are 2MP PTZs on which I had disabled hardware acceleration to keep video delay to a minimum. So when I stopped at 1919 MP/s, only about 1836 MP/s of video was actually being fed through the Nvidia card for hardware-accelerated decoding.


Nvidia NVDEC Disabled

I clearly wasn't going to get much further with Nvidia NVDEC enabled, as I had basically maxed out its decoder. So I turned off Nvidia hardware acceleration and measured performance again with the GUI open and with the GUI closed. Then I proceeded to increase camera frame rates.

Interestingly, I wasn't able to push the system any further. As I increased frame rates, Blue Iris began choking on the load. The Megapixels Per Second counter started going down, not up. Blue Iris' status window showed some camera frame rates were reduced from where I had set them, and some streams in the GUI were becoming increasingly delayed.

| Megapixels Per Second | Notes | Power Consumption (Watts) | Blue Iris CPU Usage % | Overall CPU Usage % | Nvidia RTX 2080 Ti Video Decode Usage % |
| --- | --- | --- | --- | --- | --- |
| 1919 | Software Decode, GUI @ 4K | 219 | 35 | 36 | disabled |
| 1919 | Software Decode, NO GUI | 187 | 19 | 20 | disabled |
| 1834 | Software Decode, NO GUI. Load increased beyond previous | 199 | 36 | 39 | disabled |
| 1514 | Software Decode, NO GUI. Load increased beyond previous | 199 | 55 | 58 | disabled |
| 1460 | Software Decode, NO GUI. Load increased beyond previous | 199 | 54 | 59 | disabled |

Clearly, we are running into some kind of bottleneck long before the CPU measures 100% usage. I suspect Blue Iris may be using all available memory bandwidth; however, I have no evidence to support this hypothesis. Whatever the cause, it seems that the AMD Ryzen 3950x is simply overkill for Blue Iris, and the same results would likely be achievable with an 8 or 12 core processor.


Memory Channels Matter

To test the hypothesis that memory bandwidth was the limiting factor, I removed two 8 GB sticks from channel A, leaving only two 8 GB sticks in channel B. HWINFO confirmed that the memory was now running in single channel mode.

I then measured performance while running the baseline workload from before (1189 MP/s).

Single Channel Memory Results

| Megapixels Per Second | Notes | Power Consumption (Watts) | Blue Iris CPU Usage % | Overall CPU Usage % | Nvidia RTX 2080 Ti Video Decode Usage % |
| --- | --- | --- | --- | --- | --- |
| 813 | Software Decode, NO GUI | Not Measured | 48 | 53 | disabled |
| 835 | Software Decode, NO GUI | Not Measured | 36 | 38 | disabled |
| 1013 | Software Decode, NO GUI | Not Measured | 33 | 34 | disabled |
| 884 | Software Decode, NO GUI | Not Measured | 33 | 34 | disabled |
| 654 | Software Decode, GUI @ 4K | Not Measured | 80 | 83 | disabled |

Performance was all over the place, and frames were being dropped. During the GUI Open (at 4K resolution) test, the GUI was extremely sluggish and it was evident that many cameras' live views were delayed.

Next, I re-inserted the two RAM sticks I had removed, and turned off the XMP Profile. In effect, this reduces the memory speed from 3600 MHz to 2133 MHz, and slightly improves memory latency.

I re-ran the performance measurements.

| Megapixels Per Second | Notes | Power Consumption (Watts) | Blue Iris CPU Usage % | Overall CPU Usage % | Nvidia RTX 2080 Ti Video Decode Usage % |
| --- | --- | --- | --- | --- | --- |
| 1189 | Software Decode, NO GUI | Not Measured | 17 | 18 | disabled |
| 1189 | Software Decode, NO GUI | Not Measured | 17 | 18 | disabled |
| 1187 | Software Decode, NO GUI | Not Measured | 17 | 17 | disabled |
| 1086 | Software Decode, NO GUI | Not Measured | 21 | 22 | disabled |
| 1189 | Software Decode, GUI @ 4K | Not Measured | 33 | 34 | disabled |

Based on this, I think it is fair to say that Blue Iris is highly dependent on memory speed. Faster memory is better, and it is simply foolish to not run as many memory channels as your platform supports.

(edit 2020-04-09: Fixed incorrect Megapixels Per Second column values in last table to match originally recorded data -- one of the readings actually was 100 MP/s lower than the others, with higher CPU usage)
 
@bp2008 I'd be curious how much power the i7-8700K (at 1189 MP/s) and the AMD chip each draw MINUS the 2080 Ti, which is a power hog itself. The AMD chip might draw less power since it is built on a 7nm process node, but then again the Intel has the Quick Sync advantage.
 
I'd be curious too, but I am not going to be measuring that. To do it proper justice, I would need to remove a lot more variables, such as by using a similar motherboard (AMD X570 is very power-hungry) and the same case, power supply, disks, and peripherals.

There was already not a huge difference when I compared AMD and Intel performance two years ago so I imagine at this point there is even less of a difference, with AMD having taken the efficiency crown for general CPU computing with 7nm.
 
It bodes well for the edge case where people are looking to run a significant number of cameras, where before the i9-9900 etc. was the only option. Now, in theory at least, they could get a 16, 24 or 32-core chip with SMT. Unless, of course, you've found some other limit that Blue Iris is bumping against.
 
I've updated the first post with new data obtained by running the system with single channel memory, and with reduced-speed dual channel memory. It turns out memory makes a huge difference. I think the Ryzen 3950x is totally overkill given the memory limitations of the platform. A true HEDT platform supporting quad channel memory would likely be the only way to increase video throughput to 2000 MP/s and beyond.
 
Check out AMD uProf; I believe it can tell you memory controller stats. Intel has something similar for their chips. Is it possible you are saturating something else, like the drive interface?

On paper you'd have to go quad channel, which might explain why Xeon, i9/i7 Extreme editions and Threadripper chips seem to support the most cam MP/s based on your stat database, since they have the highest theoretical memory bandwidth.

Very cool test. It bodes well for Ryzen 8-12 core chips being able to handle a bunch of cams for much lower (less than 50% the price of your 3950x). Or maybe next gen Ryzen APU if they end up available with enough cores. Too bad it will take some time before those arrive on the used market. Meanwhile Intel has a few generations of refreshed chips coming off leases.

I started buying Intel just after they became dominant in the market, but am excited to see AMD fielding some contenders, and competition also forced Intel to lower prices so it’s a win-win in my book.
 
You just need some good Samaritan to send you an EPYC CPU next, with its octa-channel memory controller! It is interesting that in most of the reviews I've seen, the Ryzen 7 & 9 are holding up well in the benchmarks (a little behind Intel's top offerings on gaming), but maybe for Blue Iris the demand profile is different.
 
Blue Iris's work is definitely different from most workloads, even video processing workloads, which typically only process one video stream at a time. Blue Iris will be processing DOZENS of them.

I mean, consider 1900 megapixels of video is 1.9 billion pixels. Each uncompressed pixel is likely to be around 3 bytes in memory if it is stored efficiently. That makes 5.7 GB of raw pixel data being produced per second when my PC hit its limit. Raw memory read speeds are in the ballpark of 20 GB/s for modern DDR4 (more or less depending on speed, channel count, timings, etc). It is not hard to imagine running out of memory bandwidth if you are working with this much data.
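
The arithmetic above can be written out as a short sketch. The 3 bytes per pixel and the 20 GB/s ballpark are the rough assumptions stated in the post, not measurements, and the function name is made up for illustration. It counts each decoded pixel only once; any extra reads or copies of a frame would add to the total.

```python
# Back-of-the-envelope estimate of raw pixel data produced by video decoding.
# Both constants are the rough assumptions from the post, not measured values.
BYTES_PER_PIXEL = 3          # assumed size of one uncompressed pixel
BALLPARK_DDR4_GB_PER_S = 20  # assumed raw DDR4 read speed, order of magnitude only


def raw_pixel_gb_per_s(megapixels_per_second, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return decimal GB/s of uncompressed pixel data for a given MP/s load."""
    pixels_per_second = megapixels_per_second * 1_000_000
    return pixels_per_second * bytes_per_pixel / 1_000_000_000


load = raw_pixel_gb_per_s(1900)  # the load at which the 3950x hit its limit
share = load / BALLPARK_DDR4_GB_PER_S
print(f"{load:.1f} GB/s of raw pixels, about {share:.0%} of the ballpark figure")
```

Even counted once, the decoded pixels alone are a sizable fraction of that ballpark, which is consistent with the memory-bandwidth hypothesis but does not prove it.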
 
@bp2008
I am working to configure a computer that can handle 37 to 40 8MP cameras at 15 FPS. Your testing is very interesting and makes me concerned for this build. I'm planning to set the camera feeds to H.265 for the storage savings it provides. My current configuration plan is below. I found that H.265 is supported by Intel QSV, so I would assume performance would be similar to H.264. The GUI will normally not be running. I'll be using a netcamviewer monitor to provide camera feeds to 2 TVs remotely. That device connects directly to the cameras, so it won't add any load to the server. I will be following the guide on optimizing Blue Iris's CPU usage to get it as low as possible. The GPU below is mainly for video out from the server, as the motherboard does not have onboard video ports. Cameras will most likely be set to 24/7 recording; I'm trying to get that changed to motion-triggered recording.

Cameras - 37 to 40 8MP 15FPS Amcrest
CPU - i9-9820X 10 Core 3.3GHz
Memory - 64GB 8x8GB DDR4 3600 Quad Channel
GPU - Nvidia Geforce 1030 (Only used for video out during setup)
Storage - 8 10TB HDD with hardware raid set to Raid 50

Questions
1. Is this grossly under-specced?
2. How will H.265 affect CPU usage?
3. Should I use AMD instead of Intel?

Thank you for any insight or help.
 
1. Probably
2. H.265 will make CPU usage higher. Not by a tremendous amount though. QSV does not in fact work with H.265 and never has, although Blue Iris claimed support years ago. It does not matter anyway, since QSV is not available on HEDT, which you will need for memory bandwidth.
3. Yes. AMD is kicking Intel's ass in many-core workloads these days. I'd recommend a Threadripper 3 system, and fill all RAM slots with something fast like 3600 speed. 32 GB total would be fine though. Threadripper 2 has a much weaker memory controller, so if TR3 is out of budget then see if you can get a latest-gen Intel box. There are big CPU cost savings with the latest generation. But it definitely needs to be their HEDT platform so you get quad channel memory.

FYI, 40 8MP cams at 15 fps is nearly 5000 MP/s, and I don't think you will get more than around 3000 MP/s on any system. Consider using the Limit Decoding feature on most of the cameras. That will drastically improve your performance and make many more computer options viable.
 
@bp2008

Thank you very much for your suggestions and for clarifying H.265 with QSV and the quick response. I'll start looking into the TR3 and cost. If 32GB quad-channel will do the job I'll drop down to that and reallocate the funds to the CPU.

I figured about 5000 MP/s as well. From what I've read, the camera has to be able to work with limit decoding. It also looks like it will affect motion detection, but if the system is going to record 24/7 then that won't matter.

Limit Decoding Questions
1. Do you know if there is a compatibility list for limit decoding?
2. This won't affect recording quality, correct?
3. Are there other effects I need to worry about?
4. With limit decoding do you think I have a chance at the 5000MP/s?
 
1) No compatibility list. It works with H.264 and H.265 video.
2) Correct. You need to use Limit Decoding with direct-to-disc recording of course, but then the recordings are full quality and full frame rate.
3) I think the section on Limit Decoding here covers the concerns well. Most of the negative effects of this feature are relatively unimportant if you record continuously and do not rely on motion detection. Or you could try to use the camera's built-in motion detection and have Blue Iris get alerted through the ONVIF event stream... I've never done this so I do not know how well that works.
4) Yes. 40x 4K cameras (8.3 MP each) would only be 332 MP/s with limit decoding enabled on all of them, and i-frame interval set equal to frame rate. Most modern CPUs would handle this very easily.
 
@bp2008

Thank you again for your help. Just for my own knowledge how did you calculate 332MP/s? Just to be clear I don't think you are wrong I would just like to be able to calculate for myself in the future when using limit decoding. With that MP/s I could probably lower the cost of the overall system.

@looney2ns

How many cameras do you have connected to your system?
What MP are they?
What FPS are they set to?
 

MP/s is just "megapixels per second". It really is a super-easy calculation.

A 4K camera has a resolution of 3840x2160 pixels. That is about 8.3 megapixels (because 3840 times 2160 is about 8.3 million).

Then you multiply that by the number of cameras. In this case, 40.

40 x 8.3 megapixels = 332 megapixels

When you use Limit Decoding in Blue Iris, that means Blue Iris only decodes the i-frames. An i-frame interval equal to the frame rate means there is 1 i-frame per second. So Blue Iris would only be decoding 1 frame per second from each of the cameras.

332 megapixels x 1 frame per second = 332 megapixels per second
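
The same calculation as a small Python sketch, for anyone who wants to plug in their own camera counts. The function and parameter names are made up for illustration; the only inputs are the figures used above (40 cameras, 8.3 MP each, 15 FPS, and 1 i-frame per second when limit decoding is on).

```python
def megapixels(width, height):
    """Resolution in megapixels (millions of pixels per frame)."""
    return width * height / 1_000_000


def decode_load_mp_s(cameras, mp_per_camera, fps,
                     limit_decoding=False, iframes_per_second=1.0):
    """Estimate the MP/s that must be decoded.

    With limit decoding, only i-frames are decoded, so the effective frame
    rate is the i-frame rate (1 per second when the i-frame interval equals
    the frame rate). Otherwise every frame is decoded.
    """
    decoded_fps = iframes_per_second if limit_decoding else fps
    return cameras * mp_per_camera * decoded_fps


print(f"4K frame: {megapixels(3840, 2160):.1f} MP")
print(f"Full decoding:  {decode_load_mp_s(40, 8.3, 15):.0f} MP/s")
print(f"Limit decoding: {decode_load_mp_s(40, 8.3, 15, limit_decoding=True):.0f} MP/s")
```

If your cameras send i-frames more or less often than once per second, change `iframes_per_second` to match.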
 
New question on an old thread. @bp2008, I see you were running 3600 speed RAM, but do you remember whether the XMP profile was enabled in the BIOS? I have seen numerous tech YouTubers discuss the fast memory requirement for Ryzen CPUs, so I'm wondering whether any XMP headroom was left on your configuration or not.
 
Do you see any difference between dual channel 2133 vs 3600 MHz, without changing anything else?
 
It seems like it can have an impact. The question boils down to the price vs. performance delta for this specific Blue Iris use case (which not a lot of people test).
A tech YouTuber I watch regularly has tested this (video: "Ryzen 3000 Memory Benchmark & Best RAM for Ryzen (fClock, uClock, & mClock)"), and I'm sure others have too. But for a regular person it's harder to have several different sets of RAM "lying around".