CodeProject.AI Version 2.0

Well, the inference time sounds right. Not sure what it’s doing the rest of the time. Resizing a 4K image to the input tensor size takes roughly 10 ms. (Should be even faster in the latest version.)
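For reference, here is a rough, pure-Python sketch of what that resize step does. This is a toy nearest-neighbour downsample on a small synthetic frame, not what CPAI actually runs; a real pipeline would use an optimized library such as OpenCV, and a true 4K frame is far larger than this example:

```python
import time

def resize_nearest(img, out_w, out_h):
    """Naive nearest-neighbour resize of a 2D list-of-lists 'image'."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# Synthetic grayscale frame, kept small so this runs quickly.
src = [[(x + y) % 256 for x in range(384)] for y in range(216)]

t0 = time.perf_counter()
small = resize_nearest(src, 64, 64)  # shrink toward a model input size
elapsed_ms = (time.perf_counter() - t0) * 1000
print(len(small), len(small[0]))
```

The point is just that the resize is a fixed, cheap preprocessing cost; it can't account for hundreds of milliseconds of round-trip time.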
I just gave it a go on my CPU-only CPAI version 2.6. The same image got this:
Processed by: ObjectDetectionYOLOv8
Processed on: localhost
Analysis round trip: 833 ms
Processing: 456 ms
Inference: 455 ms
Timestamp (UTC): Tue, 02 Apr 2024 03:33:36 GMT
 
I abandoned Coral for CPU processing some time ago. It just sits in my machine, although I may revert later if it proves worthy.

I very much get the impression from the varied results that this is something to do with the installation rather than the device. Some users have reported less than 10 ms processing times, whereas on mine (albeit on very old BI/CPAI versions now) I was seeing about 180-200 ms for processing. However, multiple triggers were causing queueing and sometimes timeouts. As I only have 2 cameras, this shouldn't really be overloading the Coral. From what I could tell before, it seems that those who installed Coral the full old way got better times than those who used the built-in installation in CPAI, so maybe this is where the issue lies. I was using YOLO 4 or 6 from memory (not on the BI machine currently to check).
 
You seem particularly well positioned to re-try this setup, given your past experiences and knowledge of the results. Any chance you might be willing to "lead the way" for the rest of us? Or was the whole exercise so distasteful that you don't really want to give it another go?
 
Just got my M.2 Coral device. I believe everything is installed correctly in Unraid (enabled in settings), and the CodeProject.AI Docker container says "Started Multi-TPU (TF-Lite)" in its status.
In Blue Iris System Setup, AI Custom Models says MobileNet SSD. Does this mean I don't use IPCam-Combined anymore, and do I need to change each camera's custom model to match this?

Here are the last few lines of the logs:
22:51:23:Started Object Detection (Coral) module
22:51:26:objectdetection_coral_adapter.py: TPU detected
22:51:26:objectdetection_coral_adapter.py: Attempting multi-TPU initialisation
22:51:26:objectdetection_coral_adapter.py: Supporting multiple Edge TPUs
22:53:26:objectdetection_coral_adapter.py: WARNING:root:No work in 120.0 seconds, watchdog shutting down TPUs.
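That last warning suggests CPAI idles the TPUs after 120 seconds with no requests, which is normal when nothing is triggering. As a rough illustration of the pattern (this is a hypothetical sketch, not CPAI's actual watchdog code, with the timeout shortened so it runs quickly):

```python
import threading
import time

class IdleWatchdog:
    """Shut a resource down after `timeout` seconds with no work (sketch)."""
    def __init__(self, timeout, on_idle):
        self.timeout = timeout
        self.on_idle = on_idle
        self._last_work = time.monotonic()
        self._stop = threading.Event()
        threading.Thread(target=self._run, daemon=True).start()

    def kick(self):
        # Call this every time a request is processed.
        self._last_work = time.monotonic()

    def _run(self):
        # Poll at a fraction of the timeout; fire on_idle once and exit.
        while not self._stop.wait(self.timeout / 10):
            if time.monotonic() - self._last_work > self.timeout:
                self.on_idle()
                return

    def stop(self):
        self._stop.set()

events = []
wd = IdleWatchdog(timeout=0.2, on_idle=lambda: events.append("shutdown"))
time.sleep(0.5)  # no kick() calls, so the watchdog should fire
print(events)
```

The TPUs spin back up on the next request, so this message on its own isn't an error.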
 
Sorry, I won't touch it atm until I get some backup software for my BI PC. Every time I've upgraded BI or CPAI and there's been an issue (which has been quite a few times), I've been forced to reinstall the entire PC, including Windows. Too much of a pain and hassle to risk having to do that. It means wasting half a day at least struggling with reconfiguring everything and rediscovering the cameras. That's the reason why I'm stuck on the old versions.
 
I've had 200 ms on Coral. In fact my CPU matches those times and doesn't time out, hence why I'm on CPU atm.
 
The MobileNet SSD models run in roughly 10 ms; the total time may be larger for other reasons. Large models like YOLOv5-large may take over 1000 ms, and only a fraction of the model fits on the TPU, so timing will vary.
 
All of the non-custom models use the COCO labels.

There are many irrelevant labels in there, which is why the custom models prune them.
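A minimal sketch of that pruning idea. The label subset below is illustrative only, not CPAI's actual ipcam-* list, and the detection dicts are a made-up shape rather than CPAI's real response format:

```python
# Illustrative subset of the 80 COCO labels; the custom models keep a
# similar security-relevant set (the exact list may differ).
COCO_KEEP = {"person", "car", "truck", "bus", "motorcycle", "bicycle",
             "dog", "cat", "bird"}

def prune_detections(detections, keep=COCO_KEEP, min_conf=0.4):
    """Drop detections whose label is irrelevant or below a confidence floor."""
    return [d for d in detections
            if d["label"] in keep and d["confidence"] >= min_conf]

raw = [
    {"label": "person",     "confidence": 0.91},
    {"label": "toothbrush", "confidence": 0.88},  # valid COCO label, useless for CCTV
    {"label": "car",        "confidence": 0.35},  # relevant but too low-confidence
]
print(prune_detections(raw))
```

Pruning at the model level goes further than this post-filtering, since the model itself stops spending capacity on the irrelevant classes.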
 
Like CCTVCam said, every time I touch/upgrade BI/CPAI I end up having some type of issue and spend a few hours fixing and testing everything so I try not to touch it unless I have to. Been researching Proxmox/VM and thinking of that for at least snapshots and/or cloning the VM before upgrading (to have a backup). I don't want to hijack the thread but glad I'm not alone. Also don't want to sound like a complainer. When it works it usually works well (4 out of 5 stars).

Anyway, I have an M.2 Coral TPU (single, not dual) with BI v5.8.4.5 and CPAI v2.5.1, both on the same dedicated PC (Dell Optiplex 7060 SFF). I found the Medium model size to be the most accurate for my needs (the Small model missed some objects). With the Medium model size, inference is usually 60-120 ms. One thing I have a problem with, which I just saw some previous posters mention, is the long "Analysis round trip". That is usually around 100-150 ms, but there are times when it is well over 1,000 ms and nothing is going on. To clarify: that is 60-120 ms of inference but over 1,000 ms of round trip. Usually the round trip is about 20-40 ms more than the inference value. There is no motion in BI to cause a queue. I cannot figure out what the issue is.

I've used the CPAI benchmark tool and usually get about 9-10 ops per second (not great, but OK for the low-power PC I need). However, there were times when it was as low as 0.2 ops per second, again when nothing was going on. I always used the same picture for all my testing (so I have the same baseline). When it was 0.2 ops/sec I opened Task Manager and CPU was fine at around 8-12%, with memory around 27% (4.3 GB of 16 GB available). I can't figure it out. I opened BI and made sure nothing was triggering either.
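For anyone interpreting those benchmark numbers: ops/sec is just the reciprocal of per-request latency, so 9-10 ops/sec implies roughly 100-110 ms per detection, while 0.2 ops/sec implies about 5 seconds each. A toy sketch of that kind of measurement (the lambda is a stand-in workload, not a real detection call):

```python
import time

def ops_per_second(op, duration=0.25):
    """Run `op` repeatedly for ~`duration` seconds and report the rate."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration:
        op()
        count += 1
    return count / (time.perf_counter() - start)

# Stand-in workload; in the real benchmark each op is one detection request.
rate = ops_per_second(lambda: sum(range(1000)))

# Converting a rate back to per-op latency:
latency_ms = round(1 / 9.5 * 1000)  # 9.5 ops/sec -> ~105 ms per op
print(rate > 0, latency_ms)
```

That gap between 105 ms and 5,000 ms per op is why the 0.2 ops/sec readings point to stalls somewhere in the pipeline rather than the TPU itself being slow.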

Hope that helps with people on the fence with Coral TPU or looking for some feedback.

One other thing I noticed is that just opening CPAI Explorer usually takes over 20 seconds. It's using localhost:32168/explorer.html (which I think is the default URL). I also tried 127.0.0.1:32168/explorer.html with the same results. It's using Edge, and it doesn't matter if I open Edge from scratch or just go to the URL with Edge already open.

Debating an additional 1-2 TPUs to help with queuing, but not sure if that will add more headaches.
 
I’d personally recommend at least two or three TPUs with a YOLO medium model size. That way you can fit more of the 24 MB model into multiple onboard 8 MB TPU caches. See my post here measuring performance:
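A back-of-the-envelope sketch of that cache argument, using the ~24 MB model and 8 MB per-TPU cache figures above (the real segmentation done by the Edge TPU compiler is more involved than a straight division):

```python
def cache_coverage(model_mb, tpus, cache_mb=8.0):
    """Fraction of the model that fits in the combined on-chip TPU caches."""
    return min(1.0, tpus * cache_mb / model_mb)

# A ~24 MB YOLO-medium-class model (sizes are approximate):
for n in (1, 2, 3):
    print(n, round(cache_coverage(24.0, n), 2))
```

With one TPU only about a third of the weights stay on-chip and the rest stream over PCIe each inference; with three TPUs, essentially the whole model can live in fast on-chip memory.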
 
Interesting. Thank you for the detailed explanation as well.

They make a Dual TPU card but it needs an adapter. Any experience with the dual TPU card and the adapter?

Since I'll need more adapters in addition to the TPUs, I'm wondering if I should get the Dual TPU, even though I like the M.2 version because I can add it right to the motherboard without an adapter. Also, there is only one company (AFAIK) making the adapter for the Dual TPU, so if they stop making them you are SOL (and they were out of stock at the time I bought the M.2 TPU). With the M.2 there are many companies that make an adapter (to PCIe). I need to see if I can find one that accepts 2 x M.2 in a single PCIe slot.

As far as multiple TPUs, besides changing the config for "Multi-TPU" from false to true is there any other configurations I need to make?

Just curious, how is your setup? How many TPUs, and what types? If you don't mind me asking.

I also see many people running CPAI in containers/Docker. Any benefit to that rather than CPAI in Windows alongside BI? The only thing I can think of is that upgrading might be easier, since you don't have to uninstall and then reinstall (not always clean). You just wipe out the old container and create the new one. Starting fresh every time, I guess.

Thanks
 
I actually have a total of eight TPUs in my machine: two M.2 cards and three dual-TPU setups (with three adapters). The intent was to split them between a recorder at my off-grid cabin and one at home, but I seem to have gotten sidetracked playing with TPU development.

So I’m not running BI/CPAI right now because I’m booted into the particular version of Ubuntu that runs Google’s profiling compiler, and then experimenting with a faster multithreaded setup on the CLI for use by CPAI.

If you have the open M.2 slots, I’d fill them first with single TPU cards since they’re cheap, fast, and relatively less trouble.
 
If you’re upgrading to a dual, you may want to consider a heat sink on your Dual + Adapter. I overheated it pretty fast and eventually ended up with one of these:

And you’d need one of these too, cut to fit the chips:

And drilled out some holes in the heat sink and mashed in some M2.2x8 screws. Don’t tighten too hard or the boards will bend.

As far as I know, this is the only adapter that works with the Dual TPU. The problem is that the Dual TPU isn't a 2x PCIe card; it's two 1x interfaces over a single connection, which is something less widely supported. This adapter is more expensive, but it's more compatible because it converts that to a single 1x PCIe connection.

Maybe someday the guy will finish his 2x card / 4x PCIe lane adapter:

For thermal details, see the spec:
 
The Makerfabs adapter was the only one I've ever seen as well. They actually make 2 adapters: the one you linked and also one for an M.2 slot. Would love to have that 2-card adapter from GitHub... maybe someday. Looks like the Dual TPU is on backorder again anyway, unless I want to pay twice the price.

As far as the heatsink, I read the spec and want to confirm: the pads (2nd link) would be cut to fit and placed on the 4 chips (2 TPUs and 2 ICs), and then you place the heatsink (1st link) on top of that? The pads are to fill the gap (so to speak), since there are other components that are taller and would interfere with the heatsink making contact.

As far as drilling the holes into the heatsink, I'm assuming you did 3 holes to align with the 3 holes on the board, and then screwed through the holes and just tapped into the adapter card with the screws?

Did you have any problems with overheating on the single TPU boards?

Thanks again for the help and info.