Sounds like you're trying to do too much with one camera in this case.
This is correct, and it is a common issue with folks starting out with IP cams. Overview cams are fine, but they cannot give good facial pics of anything further away. Basically, if you want to see what is going on across a wide area, you use a wide angle lens. If you want to see what is going on far away, you use a telephoto lens. This is no different from using a Nikon DSLR camera.
If you compare an 8MP cam to a 4MP cam with the same FOV, the 8MP cam puts twice as many pixels on the scene, so you can digitally zoom in further before the image falls apart. That is why most folks think they are getting a better view from the 8MP cam. That is fine for daytime picturesque views that make nice marketing shots. But in low light, the 8MP cam (assuming both have the same size sensor) will not perform as well as the 4MP cam if there is motion. You have mashed twice the pixels onto the same size sensor as the 4MP cam, so each pixel is half the size and gets about half the light. For similar performance in low light, an 8MP cam needs roughly TWICE the light of the 4MP cam.
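The half-the-light point is just arithmetic, and a quick sketch makes it concrete. The sensor dimensions below are made-up illustrative numbers, not the specs of any particular cam, and the function name is my own. It also assumes light gathered per pixel scales with pixel area, which ignores differences in sensor generation and processing.

```python
def pixel_area_um2(sensor_width_mm, sensor_height_mm, megapixels):
    """Approximate area of one pixel in square micrometers."""
    sensor_area_um2 = (sensor_width_mm * 1000.0) * (sensor_height_mm * 1000.0)
    return sensor_area_um2 / (megapixels * 1_000_000)

# Hypothetical sensor size, identical for both cams (assumption for illustration)
SENSOR_W_MM = 5.4
SENSOR_H_MM = 3.0

area_4mp = pixel_area_um2(SENSOR_W_MM, SENSOR_H_MM, 4)
area_8mp = pixel_area_um2(SENSOR_W_MM, SENSOR_H_MM, 8)

# Fraction of the 4MP pixel's light that each 8MP pixel collects
light_ratio = area_8mp / area_4mp

print(f"4MP pixel area: {area_4mp:.2f} um^2")
print(f"8MP pixel area: {area_8mp:.2f} um^2")
print(f"Light per pixel, 8MP relative to 4MP: {light_ratio:.2f}")
```

Whatever sensor size you plug in, the ratio comes out the same as long as both cams share that sensor, which is why sensor size matters more than the MP number on the box.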
Do not chase MP. Look for good performance based on sensor size.
For detail, choose the proper lens for the job. I have a few wide angle overview cams, and I love the view I get. When I hear a noise outside, I can quickly see where it might have come from. Then I switch to one of the cams aimed at a choke point with a zoomed-in FOV so that I can get more info, like faces, markings on clothing, logos, damage on vehicles, and plates. Each cam I have has a specific job to do. None of my cams do more than one job. Well, except for my Intersection cam that monitors the 'T' intersection. That cam supplements my LPR cams by getting the make, model, color, damage, etc. of the vehicles that the LPR cams pick up. But sometimes I use that Intersection cam for reading license plates.
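To put rough numbers on why one cam cannot do both jobs, here is a minimal sketch comparing a wide lens and a zoomed-in lens on the same sensor. It uses a simple pinhole model that ignores lens distortion, and the sensor width, resolution, focal lengths, and distance are all assumed values for illustration, not measurements from my cams.

```python
import math

def horizontal_fov_deg(sensor_width_mm, focal_length_mm):
    """Horizontal field of view in degrees (pinhole model, no distortion)."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

def pixels_per_foot(h_pixels, sensor_width_mm, focal_length_mm, distance_ft):
    """Horizontal pixels covering each foot of scene width at a given distance."""
    scene_width_ft = distance_ft * sensor_width_mm / focal_length_mm
    return h_pixels / scene_width_ft

# Assumed values for illustration only
SENSOR_W_MM = 5.4      # hypothetical sensor width
H_PIXELS = 2688        # horizontal resolution assumed for a 4MP cam
DISTANCE_FT = 40       # distance to the choke point

for focal_mm in (2.8, 12.0):
    fov = horizontal_fov_deg(SENSOR_W_MM, focal_mm)
    ppf = pixels_per_foot(H_PIXELS, SENSOR_W_MM, focal_mm, DISTANCE_FT)
    print(f"{focal_mm} mm lens: {fov:.0f} deg FOV, {ppf:.0f} pixels per foot at {DISTANCE_FT} ft")
```

In this model the pixels on target scale directly with focal length, so the longer lens puts several times more pixels on a face or plate at the same distance, at the cost of a much narrower view.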