If you only want humans/vehicles,
there are already cameras that have built-in AI for just that.
You won't need to run a separate server to process the stream, as the camera will do it for you.
Also, Hikvision supports HEOP 2.0 and, more recently, AIOP, where you can train the camera to detect anything. % depends on...