Can we gather some Memory Benchmarks?

Joined
Apr 26, 2016
Messages
1,090
Reaction score
852
Location
Colorado
All.

Ever since @bp2008 tested where the limits were for his Blue Iris machine running an AMD 3950x (thread here: Ryzen 3950x for Blue Iris), I have been thinking that memory may be just as important as cores/threads in determining the "maximum capacity" of a Blue Iris VMS PC. I've speculated on this because many of the best-performing (i.e. biggest) Blue Iris setups seem to be running configurations with more memory channels. Here are some of the top results according to Blue Iris Update Helper (not using CUDA acceleration, and these may not all be "maximums" or optimally configured):
  • Intel(R) Core(TM) i7-6950X (3170 MP/s) -- wikichip memory profile: (DDR4-2133/2400): 1-4 channels (19.2 - 76.8 GB/s)
  • Intel(R) Xeon(R) CPU E5-2620 (DUAL CPU) (2640 MP/s) -- wikichip memory profile: (DDR4-2133): 1-4 channels (15.9 - 63.6 GB/s)
  • Intel(R) Core(TM) i9-7980XE (2200 MP/s) -- wikichip memory profile: (DDR4-2666): 1-4 channels (19.9 - 79.5 GB/s)
  • AMD Ryzen 9 3950X (1900 MP/s) -- wikichip memory profile: (DDR4-3200): 1-2 channels (23.8 - 47.7 GB/s)
  • Intel(R) Core(TM) i7-4790 (1780 MP/s) -- wikichip memory profile: (DDR3-1333/1600): 1-2 channels (12.8 - 25.6 GB/s)
My thinking is that if we can confirm that a certain relative memory performance (e.g. as measured with a memory benchmark like Download PassMark PerformanceTest - PC Benchmark Software) is needed for a corresponding level of MP/s, it might help with optimal system selection (especially towards the high end, but also so people can buy with a little headroom).
Also, selfishly, this will help me decide whether an EPYC system might outperform the Ryzen even though it will have a lower CPU clock, because it has much higher memory bandwidth.
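
For a rough sense of where the bandwidth figures in the list above come from, here is a minimal sketch. It assumes the usual rule of thumb that each DDR channel is 64 bits (8 bytes) wide, so theoretical peak bandwidth is transfer rate x 8 bytes x channels. The function name is made up for illustration. The quoted wikichip figures seem to mix units: the 19.2 and 12.8 entries match decimal GB/s, while 15.9, 19.9 and 23.8 match GiB/s, so the sketch prints both.

```python
def peak_bandwidth(mt_per_s, channels, bus_bytes=8):
    """Theoretical peak memory bandwidth as a (GB/s, GiB/s) tuple.

    Assumes a 64-bit (8-byte) data bus per channel; real-world
    throughput will be lower than this ceiling.
    """
    bytes_per_s = mt_per_s * 1e6 * bus_bytes * channels
    return bytes_per_s / 1e9, bytes_per_s / 2**30


# Configurations taken from the list above (max channel count for each CPU)
for label, rate, channels in [
    ("i7-6950X, DDR4-2400", 2400, 4),
    ("i9-7980XE, DDR4-2666", 2666, 4),
    ("Ryzen 9 3950X, DDR4-3200", 3200, 2),
    ("i7-4790, DDR3-1600", 1600, 2),
]:
    gb, gib = peak_bandwidth(rate, channels)
    print(f"{label} x{channels}: {gb:.1f} GB/s ({gib:.1f} GiB/s)")
```

These are ceilings only; the point of collecting benchmark screenshots is to see how close real systems get to them.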

For giggles, here's my old (not maxed) system score, and Passmark Memory and CPU scores for it (Blue Iris Service was stopped for this test, but RDP was running):
  • Intel(R) Core(TM) i7-2600K (240 MP/s @ 17% -- 390 MP/s @ 62%)
1583857188955.png1583857209096.png

Adding AIDA64 since it has a more comprehensive memory test. It's not free, but the trial version will still show results in some categories (which ones seems random).
1590764421940.png1590764984404.png


And this is the SAME system maxed out on Blue Iris 4.8.6.3 using the cloned camera + designated master approach (it was actually past the point where I could reconnect and disable cameras via RDP). If anyone spots something I could optimize config-wise, please let me know! This screenshot was taken via RDP (which is itself a resource hog) with a 5 fps live preview. If I close RDP, CPU drops to 92% (via Pulseway monitoring). I'm sure I'm dropping frames at this point, but I'm not sure where I can check that.

1590769238932.png
 

bp2008

Staff member
Joined
Mar 10, 2014
Messages
12,666
Reaction score
14,005
Location
USA
I am very curious about this too. I know from my tests that a single channel of DDR4 memory is very noticeably worse than dual channel at a fairly high load of 1190ish MP/s, but what I don't know is if 4 or 8 channel memory will do noticeably better or at what MP/s loads it does better.
 

bp2008

Staff member
Joined
Mar 10, 2014
Messages
12,666
Reaction score
14,005
Location
USA
Dropped frames are mostly noticeable when you see your total MP/s readout shrink when it should have grown or stayed the same. It is worth noting this can happen naturally in some cases at night -- cams which have a long exposure enabled will normally cut their FPS to match.
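
To make the MP/s readout concrete: as I understand it, the figure is megapixels per frame times frames per second, summed over all cameras. Below is a minimal sketch under that assumption. The camera resolutions and frame rates are made up for illustration, and none of this is a Blue Iris API.

```python
def total_mp_per_s(cameras):
    """Sum of width * height * fps over all cameras, in megapixels/second.

    `cameras` is a list of (width, height, fps) tuples.
    """
    return sum(w * h * fps for (w, h, fps) in cameras) / 1e6


# Hypothetical cameras: (width, height, fps)
day = [(2688, 1520, 15), (1920, 1080, 15), (3840, 2160, 10)]
# Same cameras at night, with the first one cutting its FPS for a long exposure
night = [(2688, 1520, 8), (1920, 1080, 15), (3840, 2160, 10)]

print(f"day:   {total_mp_per_s(day):.1f} MP/s")
print(f"night: {total_mp_per_s(night):.1f} MP/s")
```

A drop like the night case is expected. A drop with no change in camera settings is the one that points at dropped frames.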
 

Javier

n3wb
Joined
Feb 1, 2017
Messages
14
Reaction score
3
I am running a Threadripper 2990WX with quad-channel Corsair DDR4 3000 MHz. The system is unstable above 40% CPU usage. The fps on the video starts to slow down, Mbps on the network cards goes lower and lower, and after a while the whole computer becomes too slow without any clear indication as to why.

My theory for some years has been that when the system runs near its max "something" (maybe memory bandwidth), Blue Iris sometimes has a peak demand that goes above the limit, and the whole computer gets backlogged in a downward spiral.

The most I've been able to run stable is around 2,900 MP/s on Blue Iris with around 45% CPU. My problem lies in the inherent way the TR2 2990WX handles its cache memory between cores on the CPU, or in some kind of memory bandwidth limit.

I suspect a TR3 with quad-channel DDR4 4000 should be able to outperform this setup, but I don't think by much.
 

bp2008

Staff member
Joined
Mar 10, 2014
Messages
12,666
Reaction score
14,005
Location
USA
That is great info, Javier! 1st and 2nd gen Threadripper have multiple NUMA nodes, which certainly complicates memory access compared to TR3, which has a single NUMA node. I would expect maximum MP/s on TR3 to be around 3800 MP/s (double what I got on a Ryzen 3950x / DDR4 3600), but probably less, because quad-channel memory is not simply double the speed of dual-channel memory.

On Zen 2 (Ryzen 3000 series, TR3) you lose performance if the memory speed isn't synchronized with infinity fabric speed. DDR4 3600 works without much effort and I believe you can get to like 3800 pretty reliably by overclocking the infinity fabric to 1900 MHz, but beyond that it gets tricky to not lose performance.
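
The numbers above follow from DDR transferring twice per clock. A small sketch, assuming "synchronized" means the Infinity Fabric clock runs 1:1 with the memory clock; the function name is made up for illustration.

```python
def synced_fclk_mhz(ddr_rating):
    """Infinity Fabric clock (MHz) needed to run 1:1 with the memory clock.

    DDR transfers data twice per clock, so the memory clock is half
    the DDR rating (e.g. DDR4-3600 runs at 1800 MHz).
    """
    return ddr_rating / 2


for rating in (3200, 3600, 3800):
    print(f"DDR4-{rating}: needs FCLK {synced_fclk_mhz(rating):.0f} MHz for 1:1")
```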
 
Joined
Apr 26, 2016
Messages
1,090
Reaction score
852
Location
Colorado
I'm currently facing a bottleneck that I am still trying to identify (right now I suspect my little AMCREST desktop POE switch). At approximately 1500 MP/s (and over 70-80 Mbps), adding more cameras causes the FPS to start falling on existing streams and the CPU to go up drastically. My suspicion is that either this little desktop switch is getting overwhelmed (it's only 100 Mbps), or something about the NIC configuration is causing a performance wall. Since I had hoped to simulate the load, I wasn't prepared with a spare 1 Gbps POE switch (and two coming from Tiger Direct are still backordered after the price snafu, probably due to the slowdown from China).
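
A rough sketch of why the switch is a plausible suspect: the same camera traffic that nearly fills a 100 Mbps link barely registers on a 1 Gbps one. The 70% "near saturation" threshold is an arbitrary assumption for illustration, not a measured limit of any particular switch.

```python
def link_utilization(throughput_mbps, link_mbps):
    """Fraction of the nominal link speed in use."""
    return throughput_mbps / link_mbps


NEAR_SATURATION = 0.7  # assumed threshold, chosen only for illustration

for link in (100, 1000):
    for load in (70, 80):
        used = link_utilization(load, link)
        note = "near saturation" if used >= NEAR_SATURATION else "ok"
        print(f"{load} Mbps on a {link} Mbps link: {used:.0%} ({note})")
```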
 
Joined
May 1, 2019
Messages
2,215
Reaction score
3,504
Location
Reno, NV
what can I do to help?
my 1 month Blue Iris computer setup:
G.Skill RipJaws V Series 16GB (2 x 8GB) 288-Pin SDRAM PC4-28800 DDR4 3600 dual memory
i5-9600k CPU
780 MP/s (though I do have 2 cameras offline for construction purposes)
 
Joined
Apr 26, 2016
Messages
1,090
Reaction score
852
Location
Colorado
@Holbs Can you run Passmark (CPU & Memory) and AIDA64 (Memory & GPU) and post screenshots like in my OP? Also, let us know if you happen to know where that system would max out (although we may first need to work out the quickest way to reach the limit, and how to validate it based on frames starting to get dropped).
 
Joined
May 1, 2019
Messages
2,215
Reaction score
3,504
Location
Reno, NV
speak English, man :)
ok.. first I should max my system out. How does one do that? I thought these benchmark tests already do that
 
Joined
Apr 26, 2016
Messages
1,090
Reaction score
852
Location
Colorado
MEASUREMENTS
1. Download and install the FREE version of Passmark from here: Download PassMark PerformanceTest - PC Benchmark Software
It will be named Performance Test 10.0 or something. Run it.

2. Stop your Blue Iris Service and anything else that is a big drain on resources.
3. Click the word "RUN" under the yellow-orange heading "CPU Mark", that will run the suite of CPU tests.
1591193429114.png
4. Take a Screenshot of your "CPU Mark" scoring.
1591193540485.png
5. Click the word "RUN" under the green heading "Memory Mark", that will run the suite of Memory tests.
1591193586463.png
6. Take a Screenshot of your "Memory Mark" scoring.
1591193694506.png
7. Download and install "AIDA64 Extreme" (Free Trial Version) (link: Download AIDA64 Extreme 6.25.5400 (EXE) | AIDA64 )
8. Run the "Cache & Memory Benchmark", under Tools menu. Screenshot the results. Some metrics will be blocked in TRIAL edition.
1591193221673.png1591193895022.png

9. optional: Run the GPGPU Benchmark. Screenshot the results. Obviously people with dedicated graphics cards will score better in this test, and that is not the recommended path. But maybe we will discover how iGPU contributes to overall performance, or (for example) how to tell if you might have a problem running 2x4k monitors with your iGPU.
1591194219036.png
10. Post the screenshots in the same post as your system specs above. Hopefully that helps.


Regarding maxing out Blue Iris, TAKE A BACKUP OF YOUR BLUE IRIS CONFIG before doing this!
Probably one result per CPU is ample. I just kept adding "new cameras" and clicking the "Clone Master" option (on the "General" tab in BI5, like @fenderman recommended in another thread) for every stream until MP/s in Blue Iris stopped climbing.

I forgot that there is a way to override the cloning function to force multiple streams from the same camera. This of course will be limited by how many simultaneous high-def streams your camera can supply reliably. From the help file:

"If you have added multiple cameras with the same IP address, video path and camera
number, the software clones the video stream internally—only a single stream request is
actually made to the camera. Which camera window actually connects to the camera may be
otherwise random unless you mark one as the designated Clone master. By using this
option on each camera that would otherwise be cloned, you may defeat the cloning feature
altogether and force the software to make multiple streaming requests from a single camera.

In order to identify cloned cameras, and asterisk (*) is shown after its name in its window
title bar."
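
The "keep adding streams until MP/s stops climbing" approach can be sketched as a simple plateau check. This is only an illustration: the readings are made up, the 2% tolerance is an arbitrary assumption, and the readings would have to be noted by hand from the Blue Iris status display after each added stream.

```python
def find_plateau(readings, tolerance=0.02):
    """Index of the last reading before growth stalled, or None.

    readings[i] is the total MP/s observed after adding the i-th stream.
    A step counts as "climbing" only if it exceeds the previous reading
    by more than `tolerance` (as a fraction).
    """
    for i in range(1, len(readings)):
        if readings[i] <= readings[i - 1] * (1 + tolerance):
            return i - 1
    return None


# Hypothetical MP/s readouts noted after each added stream
readings = [240, 300, 350, 385, 390, 388, 370]
idx = find_plateau(readings)
if idx is not None:
    print(f"Growth stalled after stream {idx} at about {readings[idx]} MP/s")
```

A shrinking readout after the plateau, as in the last two made-up values, would suggest frames are already being dropped.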
 
Joined
Apr 26, 2016
Messages
1,090
Reaction score
852
Location
Colorado
OK, finally ready to reveal my new general-purpose AMD server for Blue Iris and some miscellaneous oddball jobs (Ubiquiti Controller, etc). This fits firmly in the "crazy build" category, and is completely unnecessary except purely for science. It is even less necessary than when I ordered it, due to the recent Blue Iris sub-stream enhancements. Final warning: your money would be better spent on 6-8 used off-lease desktops.

Specs: 16-core AMD EPYC 7302P, 8x8GB DDR4-2933 on a TYAN S8030GM4NE-2T server motherboard. For the 1-, 2- and 4-channel tests the DIMM slots are populated according to the motherboard manual; the 8-channel tests have all slots populated. Theoretical bandwidth would be higher with faster DIMMs (in the "OC" config, memory is running at 1597 MHz instead of 1464 MHz). I will test with Blue Iris, but I anticipate I will run this in the NPSAUTO+MEMOC config (about 118 GB/s). I expect anyone running higher core counts or higher clocks will naturally score better in the CPU Mark portion, as server chips don't really boost or OC like the prosumer/workstation parts. My EPYC chip's base clock is 3GHz and max boost is 3.3GHz, for example. In the chart below you can see that an R9-3900 (non-X) with a boost of 4.3GHz has almost the same performance in this metric.
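
For context on the 118 GB/s figure, here is a rough comparison against the theoretical 8-channel ceiling at the two memory clocks mentioned above. It assumes 8 bytes per channel and two transfers per memory clock, and it treats the measured figure as decimal GB/s, which may not match how the benchmark reports it. The function name is made up for illustration.

```python
def peak_gb_per_s(mem_clock_mhz, channels, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s from the memory clock in MHz.

    DDR performs two transfers per memory clock; each channel is
    assumed to be 64 bits (8 bytes) wide.
    """
    return mem_clock_mhz * 2 * bus_bytes * channels / 1000


stock = peak_gb_per_s(1464, 8)  # stock memory clock from the post
oc = peak_gb_per_s(1597, 8)     # "OC" memory clock from the post
measured = 118                  # approximate figure quoted in the post

print(f"stock ceiling: {stock:.1f} GB/s")
print(f"OC ceiling:    {oc:.1f} GB/s")
print(f"measured / OC ceiling: {measured / oc:.0%}")
```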

PASSMARK CPU SCORE
PassMark - CPU Test EPYC.GIF

PASSMARK & AIDA64 MEMORY SCORES (in different memory configs)
SINGLE-CHANNEL Memory (1 stick in C0)
1CH-MemoryMark.GIF, 1CH-AIDA64-Mem Benchmark.GIF

DUAL-CHANNEL Memory (2 sticks in C0 & D0)
2CH-C0D0-MemoryMark.GIF, 2CH (C0D0)-AIDA64-Mem Benchmark.GIF

QUAD-CHANNEL Memory (4 sticks in C0, D0, G0, H0)
4CH-C0D0G0H0-MemoryMark.GIF, 4CH (C0D0G0H0)-AIDA64-Mem Benchmark.GIF

OCTA-CHANNEL Memory (8 sticks, fully populated; NPS0-NPS4 set the memory interleave mode to control NUMA access).
NPS0 - interleaves all memory equally between CCXs and even across multiple sockets (if equipped). Benefit: more memory available to each process, but some memory accesses have significantly more latency.
NPS4 - uses only the memory slots with the lowest latency to each CCX. Benefit: produces the smallest latency and most consistent memory access, but each CCX only accesses 2 DIMM slots (in my case limiting every 4 cores to 16GB of shared RAM).

NPS0:
8CH NPS0-MemoryMark.GIF, 8CH NPS0-AIDA64-Mem Benchmark.GIF

NPS2:
8CH NPS2-MemoryMark.GIF, 8CH NPS2-AIDA64-Mem Benchmark.GIF

NPS4:
8CH NPS4-MemoryMark.GIF, 8CH NPS4-AIDA64-Mem Benchmark.GIF

NPS AUTO (default):
8CH-Big MemoryMark.GIF, 8CH-AIDA64-Mem Benchmark.GIF

NPS4-OC (best overall score, but with NUMA config more like Zen):
8CH NPS4OC-Big MemoryMark.GIF, 8CH NPS4OC-AIDA64-Mem Benchmark.GIF
 