Help with deepstack gpu for windows

Interesting. I re-downloaded the 2021.09.1 version I linked and also the newer 2022.01.1 version from that site.
Tested on two different machines and I don't get an error (Chrome reported no warning on the download, and Windows Defender had no issue with a manual scan or during install).

Is your server patched? What versions are your protection signatures?
On my two test machines I have:
Code:
Get-MpComputerStatus |select *Version

AMEngineVersion             : 1.1.19000.8
AMProductVersion            : 4.18.2202.4
AMServiceVersion            : 4.18.2202.4
AntispywareSignatureVersion : 1.361.1057.0
AntivirusSignatureVersion   : 1.361.1057.0
FullScanSignatureVersion    :
NISEngineVersion            : 1.1.19000.8
NISSignatureVersion         : 1.361.1057.0
QuickScanSignatureVersion   : 1.361.996.0

One of the machines is slightly older
QuickScanSignatureVersion : 1.361.701.0
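
If you want to compare your own signature versions against the ones listed here, compare them number by number rather than as text. As plain strings, "1.361.1057.0" sorts before "1.361.996.0", which is the wrong way round. A minimal sketch (the function names are made up for this example):

```python
def parse_version(text):
    """Turn a dotted version such as '1.361.996.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_older(candidate, reference):
    """True if candidate is an older version than reference."""
    return parse_version(candidate) < parse_version(reference)


# The two QuickScanSignatureVersion values quoted in this post
print(is_older("1.361.701.0", "1.361.996.0"))
```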


Sorry for the really late reply to this... everything on my BI server checked out as you mentioned, but still the downloaded DeepStack GPU package refused to install with the same issue as before - spinning blue wheel for a couple of seconds, then nothing - no errors, no messages... just nothing.

However, I managed to figure out what was going on (and the reason for this reply is hopefully to help anyone else who has the same issue).

When downloading the file, Windows Defender throws a hissy fit about the "file is not downloaded often and could be malicious. Download anyway?". I clicked to download anyway and got the file in my downloads folder. However, despite agreeing to Windows Defender's demands it turns out that it still tags the file properties as "blocked"... but doesn't tell you it's done so, and there are no messages telling you when you attempt to open the file!

It is this "blocked" property that prevents the install from running.
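
For anyone curious what the "blocked" flag actually is: as far as I understand it, Windows stores it in an NTFS alternate data stream named Zone.Identifier attached to the downloaded file, and the Unblock checkbox in the file's Properties dialog (or the PowerShell Unblock-File cmdlet) removes it. The sketch below shows the idea in Python. It only makes sense on Windows with NTFS, the helper names are invented for this example, and treating ZoneId 3 or higher as "blocked" is an assumption.

```python
import os


def zone_stream_path(file_path):
    """Path of the NTFS alternate data stream that records the download zone."""
    return file_path + ":Zone.Identifier"


def parse_zone_id(stream_text):
    """Return the ZoneId found in Zone.Identifier contents, or None if absent."""
    for line in stream_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "zoneid":
            return int(value.strip())
    return None


def is_blocked(file_path):
    """True if the file carries an Internet-zone mark (assumed: ZoneId >= 3)."""
    try:
        with open(zone_stream_path(file_path), "r") as stream:
            zone = parse_zone_id(stream.read())
    except OSError:
        return False  # no stream, or not an NTFS volume
    return zone is not None and zone >= 3


def unblock(file_path):
    """Delete the stream; intended to have the same effect as ticking Unblock."""
    os.remove(zone_stream_path(file_path))
```

In practice, right-clicking the installer, choosing Properties and ticking Unblock is the simpler route; the code is only there to show what changes on disk.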

Here's the URL of the article that explained it: Microsoft Defender SmartScreen Prevents Software Installation

Once the blocked property was removed, the DeepStack GPU installer ran, and now my BI server CPU is back down to 30-40% (rather than permanently pegged at 99%) with fast and accurate DeepStack person detection offloaded to my NVIDIA GT 1030.

Thanks for the assistance!
 
I got DeepStack GPU up and running on Blue Iris and it is identifying objects, but the times feel longer than they should be. I see anywhere from 800ms to 3500ms. No doubt I messed something up along the way, but I'm curious about what affects those times.
Those sound like CPU response times rather than GPU response times, unless you're using huge models.
I have a lowly GT1030 and I'm seeing response times around ~150ms.
What models are you using?
Edit: Also check whether you're sending your cameras' main stream or sub stream for analysis. If you have high resolution cameras, the main stream will be sending very large images for analysis.
It's worth checking Task Manager to see if your CUDA cores are being utilised during DeepStack operations.
You might need to change the view slightly per my screenshot:
taskmgr.png
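
Another way to see whether the delay is in DeepStack itself (rather than somewhere in Blue Iris) is to time a request directly. This is only a rough sketch: it assumes DeepStack is listening on http://localhost:80 with object detection enabled, that the third-party requests package is installed, and that snapshot.jpg is a frame you saved from one of your cameras. Adjust all three to your setup.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


def detect(image_path, base_url="http://localhost:80"):
    """POST one image to DeepStack's object detection endpoint."""
    import requests  # third-party; imported here so timed() has no dependencies

    with open(image_path, "rb") as image:
        response = requests.post(
            base_url + "/v1/vision/detection",
            files={"image": image},
            timeout=30,
        )
    response.raise_for_status()
    return response.json()


def benchmark(image_path, runs=5):
    """Print the round-trip time and detected labels for several requests."""
    for attempt in range(runs):
        result, ms = timed(detect, image_path)
        labels = [p["label"] for p in result.get("predictions", [])]
        print(f"run {attempt + 1}: {ms:.0f} ms, labels={labels}")
```

Calling benchmark("snapshot.jpg") a few times should show whether you are in the ~150ms range or the 800ms+ range. The first request after a restart is often slower, so look at the later runs.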
 
I actually switched to an older version of DeepStack and it is down into the 200-400ms range on High, so I think that solved the problem.
 
I just changed from DeepStack CPU to DeepStack GPU and I think it sounds more complicated than it is - there are several different versions and comments floating around...

I recently purchased a Quadro P400 v2 and I have it installed in my system with the motherboard BIOS set to use onboard graphics. So in Windows, the Task Manager will show two separate GPUs. Also, I used the latest version of CUDA from NVIDIA (11.3.1)

I wrote up the procedure I used - (attached text file). It 'seems' to be working fine, but if anyone sees anything I missed or has some tips....

The main webpage with instructions and links can be found here: Using DeepStack with Windows 10 (CPU and GPU)


Activity.jpg
 


Thanks for the write up, but I am still having problems. It is really confusing when the cuDNN install instructions reference copying the downloaded files into the "C:\Program Files\NVIDIA\CUDNN" directory, which doesn't exist =(



 

Where did you read that? (I think that is part of the problem - too many versions floating around. Keep it simple.) Use the two files on the main DeepStack Web page
(Keep in mind, I am not the expert on this but I did just go through this and it appeared to work for me. If I made an error somewhere, please let me know)

Short Version for Win10
1 - Install CUDA 11.3.1
2 - Copy the files from the cuDNN zip file into the matching folders under
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3



T22-970.jpg

CUDA 11.3.1 creates all the folders
cuDNN is just a zip file. You open it up, or extract it somewhere (I always use WinRAR) and copy the files to the matching folders

Cuda.jpg

files in the bin folder that get copied over (all .dll files)
T22-971.jpg

files in the include folder that get copied over (all .h files)
T22-972.jpg

files in the lib folder that get copied over (all .lib files)
T22-973.jpg
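
The copy step above can also be scripted. This is just a sketch of the same "copy into the matching folders" idea, not an official installer: it mirrors whatever is inside the bin, include and lib folders of the extracted cuDNN archive into the same relative locations under the CUDA folder. The function name and the extracted-archive path are made up for the example.

```python
import shutil
from pathlib import Path


def copy_matching(cudnn_root, cuda_root, subdirs=("bin", "include", "lib")):
    """Copy every file under cudnn_root/<subdir> to the same relative path under cuda_root."""
    cudnn_root, cuda_root = Path(cudnn_root), Path(cuda_root)
    copied = []
    for subdir in subdirs:
        source_dir = cudnn_root / subdir
        if not source_dir.is_dir():
            continue  # archive layouts differ between cuDNN releases
        for source in sorted(source_dir.rglob("*")):
            if source.is_file():
                relative = source.relative_to(cudnn_root)
                target = cuda_root / relative
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(source, target)
                copied.append(relative.as_posix())
    return copied
```

Example call (run from an elevated prompt, since the target is under Program Files; the first path is wherever you extracted the zip):

copy_matching(r"C:\Temp\cudnn", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3")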
 
Thanks for the details!

This is from the cuDNN Windows install guide showing the wrong folder locations:

D433DD34-7F30-4357-8160-D4904B307326.jpeg


 


Ya, I got all twisted up trying to follow that also - install zlib, yada yada yada. (I thought that may have been where you read it. Here is the link.)

That is why I made my post. It doesn't appear to be really that complicated...

There is so much info floating around - install CUDA tool kit, and then the two upgrades, etc etc. If you look, most of the posts were over a year ago. Things change. It looks to be really simple now. Like I posted. You download and install one program, and you download a zipped archive and copy some files over. That's the short version. The text file I attached has some more info

Everything points to my setup working. The log file shows that when Blue Iris starts, both of my GPUs are detected
T22-976.jpg

I also have log entries showing DeepStack working
T22-975.jpg

I watch as a car drives by and less than a second later, I have an alert with DeepStack boxes showing the vehicle. My P400 GPU shows a spike in Task Manager while the CPU percentage remains flat.

This intersection is 300' away (using a 25x Dahua) and it properly identified someone riding a bicycle!
T22-977.jpg

These cars are going by at 30 mph+ and that is about 50' of roadway in view. Looks like it's working to me?
T22-978.jpg
 
You do not need to send Main Stream images to DeepStack, because DeepStack resizes the image to 640 x 640 if set to High, 416 x 416 if set to Medium, and 256 x 256 if set to Low (see the DeepStack code below). This resizing slows DeepStack down, so you will get better detection times by sending Sub Stream images, with the same accuracy. Also, for the best accuracy set the mode to High rather than the Blue Iris default of Medium. The default DeepStack models and my models are optimized for High (640 x 640).

Code:
"desktop_gpu": Settings(
            DETECTION_HIGH=640,
            DETECTION_MEDIUM=416,
            DETECTION_LOW=256,
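
To put rough numbers on that, here is a small sketch of how much of a frame is thrown away by the resize. The mode sizes come from the settings quoted above; the camera resolutions are only examples, and scaling so that the longer side matches the mode size while keeping the aspect ratio is an assumption of this sketch, not something taken from the DeepStack code.

```python
# Sizes from the DeepStack "desktop_gpu" settings quoted above
MODE_SIZE = {"High": 640, "Medium": 416, "Low": 256}


def scaled_size(width, height, mode="High"):
    """Frame size after scaling so the longer side equals the mode size.

    Keeping the aspect ratio is an assumption made for this sketch.
    """
    target = MODE_SIZE[mode]
    scale = target / max(width, height)
    return round(width * scale), round(height * scale)


def pixels_discarded(width, height, mode="High"):
    """Fraction of the source pixels that do not survive the resize."""
    new_width, new_height = scaled_size(width, height, mode)
    return 1 - (new_width * new_height) / (width * height)


# A 3840x2160 main stream versus a 640x360 sub stream (example resolutions)
print(scaled_size(3840, 2160, "High"))
print(round(pixels_discarded(3840, 2160, "High"), 3))
print(scaled_size(640, 360, "High"))
```

Under those assumptions, almost all of a 4K main stream frame is discarded before detection, while the sub stream frame goes through unchanged, which is why the sub stream costs less time for the same result.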
 

Note that the lib folder on the target directory includes a Win32 and x64 folder. I assume you copied the cuDNN files to the x64 folder?
 
Well, despite the great install guide I am still faced with the dreaded Deepstack timeout error. CPU version of Deepstack works fine, just slow to respond (2000ms+) on my older Dell Xeon server. I am using sub-streams.

I now think the problem is my super old GPU (GeForce GT 710). I double checked the CUDA GPU support tables (CUDA GPUs - Compute Capability) and curiously they show support for the GeForce GT 430, 440, 520, 620, 630, 640, 705...but skip over 710 and go right to 720, 730. Maybe it is time to buy a better GPU.... :)
 

Actually.... I just dumped them into the root of the lib folder. I thought if they wanted them in the Win32 or x64 folder they would have had those folders in the archive? Maybe someone else can weigh in????

T22-950.jpg
 
The GT1030 also isn't listed, but works fine. I suggest checking with a tool such as CUDA-Z (more info in my earlier post here: Help with deepstack gpu for windows).
Edit: to be clearer, it's a combination of whether your card supports CUDA and which version of CUDA it supports.
 

Thanks for the CUDA-Z tip. Apparently the GT 710 supports compute capability 3.5. Is anyone running DeepStack GPU with this level of support?
 

Screenshot 2022-05-30 073216.png
Two other pieces of information... 1) I don't see any DeepStack process running under nvidia-smi.exe, 2) I don't see a drop down for CUDA in the Windows performance monitor.

Screenshot 2022-05-30 074941.png

Screenshot 2022-05-30 074932.png
 
I came across this post with user _Peek and the same GT 710 GPU. Looks like the issue is lack of PyTorch support for compute capability 3.5 :( . So either I learn how to compile or buy a newer GPU. I am quickly approaching the point where throwing new hardware at the problem might be the best solution.


 
I don't have a 'Cuda' dropdown either - I noticed that the other day, but everything seems to be working fine? (The CPU version of DeepStack was completely removed)

Quadro P400
T22-982.jpg

Onboard Intel 630 graphics
T22-983.jpg
 
In case this helps anyone else, I purchased an ASUS GT 1030 2GB GDDR5 for around $100 and now DeepStack GPU works great! (Note: Beware of the GT 1030 cards with inferior DDR4 RAM. Apparently some manufacturers quietly downgraded.)

In the end it was my old GT 710 that didn't support the newer CUDA version required for DeepStack. The GT 710 supports compute capability 3.5 while the GT 1030 supports 6.1. I didn't feel adventurous enough to attempt forcing the GT 710 to work via a freshly compiled version of PyTorch, but someone else may want to try it: GPU pytorch compile options for older card - i.e. GT710 . Credit to loopy12 .
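
If you want to check a card before buying or installing, the comparison boils down to something like the sketch below. The two capability values are the ones reported in this thread. The minimum is a made-up threshold for illustration only, since the real cut-off depends on the PyTorch build bundled with your DeepStack version. The live query uses PyTorch's torch.cuda.get_device_capability and only works where PyTorch is installed.

```python
def meets_minimum(capability, minimum):
    """Compare (major, minor) compute capability tuples."""
    return tuple(capability) >= tuple(minimum)


def local_capability():
    """Compute capability of GPU 0 via PyTorch, or None if it cannot be read."""
    try:
        import torch  # optional; only needed for the live query
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


# Values reported in this thread
CARDS = {"GT 710": (3, 5), "GT 1030": (6, 1)}

# Hypothetical threshold for illustration; the real minimum depends on the
# PyTorch build bundled with your DeepStack version.
EXAMPLE_MINIMUM = (3, 7)

for name, capability in CARDS.items():
    print(name, meets_minimum(capability, EXAMPLE_MINIMUM))
```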

I am getting processing times in the 100-200ms range now with the GT 1030.

Thanks for the help eeeeesh and PeteyPete.
 

Screenshot 2022-06-04 091845.png
Screenshot 2022-05-30 073216.png
Yeah, my cheapo Quadro K620 is getting 100-200ms. They go for around 35 bucks on eBay.