Yet another Custom DeepStack model aimed to detect people

ivanerr

n3wb
Joined
Sep 20, 2021
Messages
4
Reaction score
10
The attached model file was updated to its latest iteration on 20.04.2021 and again on 22.04.2022

Ladies and gentlemen may I present to you yet another custom DeepStack model!

TL;DR: The model is built with a real security task in mind, using CCTV footage only; it can reliably detect people with a very low false-positive rate.

I have been using DeepStack and BlueIris to detect intruders in my back garden. When my alarm system is armed and DeepStack detects a person I get a loud verbal alarm from a speaker in my house.

Unfortunately, I found that the default DeepStack model is basically unusable because it is very prone to false positives.

The default model detects 'people' in snowflakes, raindrops, spider webs and other odd objects. There were also false negatives: in many cases the default model didn't detect actual people. So around six months ago I started to collect a dataset using my own cameras' images.

What is included in the dataset?
The dataset is built using 1825 images from actual CCTV video feeds (mostly mine and some random YouTube CCTV footage).

What classes does the model have?
  • person (named 'chel' in the model)
  • car (named 'tachka' in the model)
  • cat (named 'kotik' in the model)
  • dog (named 'sobaka' in the model)

When should I use it?
  • when you don't care for elephants, narwhals and other fancy animals
  • when you do need a model with a low rate of false positives to enhance your home security

Are false positives completely eliminated in the model?
No, it is possible that a snowflake or spider web bouncing in the wind will look like a blurry image of a person, although the probability of this happening is a lot lower than when using the default model.

Will it work for me?
Yes, if you use conventional outdoor CCTV cameras with normal focal lengths and the installation height is within the recommended range. It probably won't work as well if the camera is installed at eye level or very high. You really shouldn't use it indoors: it wasn't trained for that, and you will get a lot of false positives.

Does it work at night?
Yes, the model can detect people in low light/infrared/blurry environments.

What confidence level should I set the model to?
90-92% is a nice sweet spot for general usage, and 95-96% is for when you need to eliminate most false positives.
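
If you want to sanity-check a threshold outside of BlueIris, here is a minimal sketch that queries DeepStack's custom-model endpoint directly and passes min_confidence. It assumes DeepStack is listening on localhost port 80 and that the attached model file is named kotikfinal.pt; adjust the host, port and model name to match your setup.

import requests

# Send one saved frame to the custom model. DeepStack only returns detections at or
# above min_confidence, so 0.92 matches the "general usage" setting mentioned above.
with open("test_frame.jpg", "rb") as image:
    reply = requests.post(
        "http://localhost:80/v1/vision/custom/kotikfinal",
        files={"image": image},
        data={"min_confidence": 0.92},
    ).json()

for det in reply.get("predictions", []):
    print(det["label"], round(det["confidence"], 2))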

What detection mode (low, medium or high) should I use?
I don't see much difference in processing times on my Nvidia T400 card, so HIGH mode is the way to go. It allows you to detect smaller objects (i.e. people in the distance, and cats). Just make sure you don't use your main video stream.
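
For reference, if you run DeepStack yourself (for example the Docker build) rather than letting BlueIris start it, the mode is set when the service launches. A rough sketch, assuming the GPU image; the local paths and port mapping are placeholders:

# Sketch only: start the DeepStack GPU container in HIGH mode and point it at a
# local folder that holds custom models such as kotikfinal.pt.
docker run --gpus all \
    -e VISION-DETECTION=True \
    -e MODE=High \
    -v /opt/deepstack/custom-models:/modelstore/detection \
    -p 80:5000 deepquestai/deepstack:gpu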

Any drawbacks?
The car class is a bit prone to false positives at the moment. Some dog breeds may not be detected at all or can be confused with cats.

How do I know the model works well?
Please see instructions in the thread below on how to use the Testing and tuning->Analyze with DeepStack feature on your clips. I suggest assigning a hotkey to enable or disable the feature.

Where do I get the model?
It is attached to this post.

I have a problem/Don't know how to set up a custom model or use a custom class!
Many solutions are already covered in the 'custom community deepstack model' thread.

(eliminating false positives) The model detects an object with high confidence where it should not!
Please DM me the frame in question. I will add it to the dataset as a background image so it won't be detected in the next iteration of the model.
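
For anyone curious how such a frame gets folded back in: DeepStack custom models are trained with the deepstack-trainer tool on a YOLO-format dataset, and a background image is simply an image with no label file, so the trainer learns that it contains nothing of interest. A rough sketch of the layout and the train command follows; the file names and paths are placeholders, and this is the generic workflow rather than a description of my exact training runs.

my-dataset/
    train/
        cam1_0001.jpg    # frame with a person: has a matching label file
        cam1_0001.txt    # YOLO format: "<class-id> <x-center> <y-center> <width> <height>", values normalized to 0-1
        cam1_0002.jpg    # false-positive frame (spider web, snowflake, ...): no .txt file, so it is treated as background
    test/
        ...              # held-out images and labels in the same format

# Train with the deepstack-trainer repo (github.com/johnolafenwa/deepstack-trainer):
python3 train.py --dataset-path "/path/to/my-dataset"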

Important note:
Please bear in mind that the model's classes were renamed to avoid collisions with other models.
If you want to detect people, use 'chel' (without the quotes) as the class name. Cats are the 'kotik' class, dogs are 'sobaka', and cars/trucks/ATVs are 'tachka'.
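
If you parse DeepStack's JSON yourself rather than matching class names in BlueIris, here is a small sketch of mapping the renamed labels back to the usual names. The mapping is just the list above; the helper function is hypothetical.

# Map the model's renamed classes back to the familiar names.
CLASS_NAMES = {"chel": "person", "tachka": "car", "kotik": "cat", "sobaka": "dog"}

def people_only(predictions, threshold=0.92):
    """Keep only 'chel' (person) detections at or above the chosen confidence."""
    return [p for p in predictions if p["label"] == "chel" and p["confidence"] >= threshold]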




results.png
 

Attachments


TBurt

Getting the hang of it
Joined
Aug 14, 2021
Messages
68
Reaction score
97
Location
Houston
Anyone test this yet? How are results?
I put it on a couple of cameras. I will let it run overnight using both the combined model and this one. Should be interesting to compare the two models. I am using person, dog, car (chel, etc) to test it. I will report back tomorrow after I let the two models run.
 

joshwah

Pulling my weight
Joined
Apr 25, 2019
Messages
298
Reaction score
146
Location
australia
Looking forward to your feedback
 

Cameraguy

Known around here
Joined
Feb 15, 2017
Messages
1,486
Reaction score
1,132
Awesome let us know
 

TBurt

Getting the hang of it
Joined
Aug 14, 2021
Messages
68
Reaction score
97
Location
Houston
Well, the model is fast and very accurate. Running some saved videos through both it and Combined, the new one was just about always higher in %. It also ran at about 50 to 60ms per camera on my GTX 1060 3GB, which is up there with combined speed-wise.

Now, I did run into one problem: I could not get the two models to work together. One would give results, but not the other, in live video. I could rearrange combined and kotikfinal in the model list, but then the other one would stop working. I have not seen this behavior before in DS or BI. Oddly enough, both models worked fine when using DS to analyze saved video, though that usually gives different results compared to live video anyway. I did just update the drivers for my 1060, so that might be messing with DS. This would not stop someone from using the kotikfinal model, as you most likely would not be running those two models at the same time; I only was because I was testing accuracy/missed detections/etc. Kind of a lazy way to test the two against each other; the best test would be to clone the cameras. I am going to go get some lunch and will try it again shortly. Maybe I will try the general model, or even the built-in one, to see if DS/BI does the same thing. Heck, it could have been just me. I have a habit of staying up too late messing with this stuff.

I am curious as to what was done in the training of the model to get such high accuracy. I was getting 90s while combined was in the 70s. Very impressive!
 

Cameraguy

Known around here
Joined
Feb 15, 2017
Messages
1,486
Reaction score
1,132
Cool, I'm gonna run it for a little while and see what I get. The indoor camera was 71% using combined and 90% with kotikfinal. I'll keep checking.
 

TBurt

Getting the hang of it
Joined
Aug 14, 2021
Messages
68
Reaction score
97
Location
Houston
Ok, it seems to be working correctly now and every model is getting along with the others. I didn't really do anything; actually, I did reboot the computer, so maybe that fixed my DS issues. For cars, it is getting about 10% higher results than the combined model. It seems that actually using IP camera footage for training makes a difference. I did not get to run it very long as we had a strong thunderstorm roll through this afternoon, causing havoc with our power.

If it continues to show such strong performance I will most likely use this model on my "alert" cameras. I have my Dahua and Hikvision cameras' AI do the person and car detection. If they find a person, they send it to BI and have DeepStack confirm it before sending out push notifications and whatnot. For example, I have it ring a doorbell letting me know someone is walking up to my front door. I have not had any cats/dogs detected as people yet using the dual method, as the deep-learning chips on these cameras are pretty spot on. But you never know: two AIs are better than one in my book at 3 am. I do not want to wake up being told there is a person at my front door, only to find out it was the neighbor's cat.
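
The confirmation step itself is conceptually simple; roughly something like the sketch below. This is just an illustration, not my actual setup: the snapshot file, the threshold and the notify() stub are placeholders for what BI really does.

import requests

DEEPSTACK_URL = "http://localhost:80/v1/vision/custom/kotikfinal"  # adjust host/port/model name

def notify(message):
    # Placeholder: in reality BI sends the push notification / rings the doorbell.
    print(message)

def confirm_person(snapshot_path, threshold=0.92):
    """Ask DeepStack whether the camera's own AI alert really contains a person ('chel')."""
    with open(snapshot_path, "rb") as image:
        reply = requests.post(DEEPSTACK_URL, files={"image": image}).json()
    return any(p["label"] == "chel" and p["confidence"] >= threshold
               for p in reply.get("predictions", []))

# Only alert when both the camera's AI and DeepStack agree there is a person.
if confirm_person("front_door_trigger.jpg"):
    notify("Person at the front door")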

Maybe the actual data/images used to train this model can be put on GitHub so others can use it as a starting point in training their own custom models.
 

sebastiantombs

Known around here
Joined
Dec 28, 2019
Messages
11,511
Reaction score
27,695
Location
New Jersey
I downloaded this model and set it up on three cameras as a test. So far, about an hour and a half in, it is detecting with a higher confidence level but it isn't detecting as often as dark or combined. I'll let it run overnight, only enabled it in the night profile, but will add it to the day profile in the morning.
 
Joined
Jun 2, 2021
Messages
28
Reaction score
11
Location
usa
Set this up today on all my cameras cause yolo.
It's definitely faster at detecting than combined during the day.

I run my cameras with color at night cause my neighborhood doesn't give a damn about light pollution. Will update tomorrow about how well it does, but so far it seems to be doing well.

Thank you for posting this!
 
Joined
Jun 2, 2021
Messages
28
Reaction score
11
Location
usa
Not bad!!
Screenshot 2022-05-02 112757.png

But I am still getting false positives cause of this friggin water pot thing
Screenshot 2022-05-02 112912.png

Overall, though, it has detected people better than combined so far; I would need to let this run for a few more days to make a final decision.
It does really well with people who are partially obscured by the cars in the driveway.
 

sebastiantombs

Known around here
Joined
Dec 28, 2019
Messages
11,511
Reaction score
27,695
Location
New Jersey
So after a full night and about six hours of daylight, albeit overcast daylight, I have some observations. Keep in mind that these results are from my system, which may not have the best views compared to what was used as samples for the model.

Yes, it is definitely faster. That is an expected result because it contains fewer distinct objects.

It usually identifies with a higher confidence level than combined or dark (I use the dark.pt model at night along with the combined custom model).

It has not detected as often as I had hoped; in fact, it detects at about 50% of the rate of the other two models on one camera, about 33% on another, and basically 0% on the third. That third camera has a much wider view, so vehicles and people appear significantly smaller than on the other two cameras.

I suspect that the lowered detection rate may be related to the specific types of vehicles used for the model. While vehicles are all basically the same, four wheels and a body, they are shaped somewhat differently in European countries versus the US.

One of these days I'll do my own model based on my own captures from the cameras I want it to work on. I just need to find the time, and motivation, to research the "how to" and then do it.
 

Cameraguy

Known around here
Joined
Feb 15, 2017
Messages
1,486
Reaction score
1,132
Yea seems like a fun project but I need a tutorial haha
 

TBurt

Getting the hang of it
Joined
Aug 14, 2021
Messages
68
Reaction score
97
Location
Houston
It seems that when it detects an object like the ones it was trained with, it does it fast and accurately. People, for example... But if the object is a little different, then it can fail totally and not see anything. I noticed this mostly with cars; autos come in all shapes and sizes, from pickups to buses. I think it needs more images to train on to fix that. Or maybe combining the images used for combined with all the IP camera samples he used would make it strong all around. I think that would be the fastest way to shore up its weaknesses in some areas.
 

sebastiantombs

Known around here
Joined
Dec 28, 2019
Messages
11,511
Reaction score
27,695
Location
New Jersey
That's the conundrum: getting enough different samples without ending up with a database that's too big to scan quickly. It's a balancing act, so it may take many attempts before you get it just right. That's why I keep thinking about my own custom model. That would eliminate contrast problems and headlight bloom at night, I think.
 

Cameraguy

Known around here
Joined
Feb 15, 2017
Messages
1,486
Reaction score
1,132
Yeah, headlights give mine fits, so I assume you could put a picture of a vehicle with glaring headlights into the training set so DS still triggers on it, right?
 

sebastiantombs

Known around here
Joined
Dec 28, 2019
Messages
11,511
Reaction score
27,695
Location
New Jersey
That's what I'm hoping, but I would limit it to shots with minimal headlight bloom: shots that show enough to identify that it's a car, truck, motorcycle or whatever.
 
Joined
Oct 1, 2020
Messages
18
Reaction score
2
Location
Texas
I was using this for a couple of days but I had to stop because of constant false alerts. It was being triggered both by long shadows from tree branches and by my mud boots, so it seems this model flags shoes as people instead of requiring a complete person. I can't just mask the area with the false alerts either, as it is right at my front door, which is exactly where I want it to trigger.

Just goes to show you, high prediction percentage isn't everything. I mostly went with this because I wanted better infrared detection.
 