Re: BlueIris causing hard lockups on new server
WELL! Here is the conclusion (for today... and hopefully from now on) of my saga!!!!
After RMA'ing the mobo and the CPU (not knowing which one it could be... let's just do both), getting the replacements in, and having it still do the SAME thing... I was about ready to give up, and go live up north, in solitude. I have been at this for 2 weeks now, and this morning, I took a colleague to the data center, and had him look it over... and he was stumped as well.
We tried disabling the onboard graphics, and it "fixed" the issue, but only temporarily... because it locked up again. We tried numerous other things, and finally... we got it.
Changing the Ring Voltage Offset and the Uncore Voltage Offset made the system "stable". The Intel XTU software ran a stress test on the GPU for 30 minutes... and no lockups (previously, it would lock up anywhere from 10 seconds to 10 minutes). So, I said let's set each of these values back to 0, one at a time, and see which one is the culprit. Turns out, the Ring Voltage Offset of +75 mV is the key. When we set it to 0, it locks up... having it at +75 mV, it runs fine (currently, I am 2 hrs into a 4 hr GPU stress test).
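For anyone who wants to repeat the "set them back one at a time" test, here is roughly the logic as a small Python sketch. It does not talk to the BIOS or to XTU at all: `is_stable` is a hypothetical stand-in for "apply these settings by hand, run the stress test, and report whether it survived". The setting names are made up, and the Uncore value of 50 is just a placeholder, since I only quoted the Ring number above.

```python
from typing import Callable, Dict, List


def find_culprits(
    known_good: Dict[str, int],
    is_stable: Callable[[Dict[str, int]], bool],
) -> List[str]:
    """Reset each offset to 0 on its own and collect the ones whose reset breaks stability."""
    culprits = []
    for name in known_good:
        trial = dict(known_good)  # start from the known-good settings every time
        trial[name] = 0           # revert only this one setting
        if not is_stable(trial):
            culprits.append(name)
    return culprits


# Hypothetical stand-in for a real stress-test run. It only mirrors what was
# observed: Ring offset at 0 locks up, Ring offset above 0 runs fine.
def fake_stress_test(settings: Dict[str, int]) -> bool:
    return settings["ring_voltage_offset_mv"] > 0


# Ring value is from the post; the Uncore value is a placeholder assumption.
known_good = {"ring_voltage_offset_mv": 75, "uncore_voltage_offset_mv": 50}

print(find_culprits(known_good, fake_stress_test))  # ['ring_voltage_offset_mv']
```

The point of always starting from the known-good set is that only one thing changes per run, so a lockup can be pinned on that one setting.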
NOW... is +75 the magic number? I don't know... could it be lowered? maybe... could setting it higher be better? who knows at the moment. I have sent SuperMicro an email, telling them of my findings... just waiting to hear back from them now.
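If I do go hunting for a lower number, one way would be to bisect between 0 (known to lock up) and +75 (known to run fine) instead of stepping down 5 mV at a time. The sketch below is only that, a sketch: it assumes stability only gets better as the offset goes up, which I have NOT verified, and `is_stable` is again a hypothetical stand-in for a manual stress-test run. The 40 mV threshold in the fake test is invented purely so the example runs.

```python
from typing import Callable


def lowest_stable_offset(
    unstable_mv: int,
    stable_mv: int,
    step_mv: int,
    is_stable: Callable[[int], bool],
) -> int:
    """Bisect for the smallest stable offset, in multiples of step_mv.

    Assumes unstable_mv locks up, stable_mv runs fine, and that any offset
    above a stable one is also stable (an unverified assumption).
    """
    span = stable_mv - unstable_mv
    assert span > 0 and span % step_mv == 0, "range must be a whole number of steps"
    lo, hi = 0, span // step_mv  # step indices: lo is unstable, hi is stable
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_stable(unstable_mv + mid * step_mv):
            hi = mid  # still fine, try lower
        else:
            lo = mid  # locked up, need more voltage
    return unstable_mv + hi * step_mv


# Invented threshold for illustration only; the real one is unknown.
def fake_stress_test(offset_mv: int) -> bool:
    return offset_mv >= 40


result = lowest_stable_offset(0, 75, 5, fake_stress_test)
print(result)  # 40
```

With a 5 mV step that is 4 stress-test runs instead of up to 14, which matters when each run is a multi-hour test.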
So, for now... I am waiting for the 4 hr test to finish, and then I am going to go replace the stock CPU cooler with an aftermarket one... since the CPU/GPU temps approach 70-80°C at peak... and I know the stock cooler is "meh" at best.
Hopefully I can start adding back the cameras, and putting the HDDs back in... and this issue will be behind me... cuz man, it has been a ROUGH few weeks.
Sorry to blame BI, but at the time, it was the only thing that would put enough load on the computer to lock it up... but, it is not BI!