The model used is the one below; it runs on both an Nvidia GPU and the CPU (the CPU will be slow). To run on a GPU you need more than 6.87 GB of VRAM.
https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF
# For loading via llama_cpp's Llama.from_pretrained
"CPAI_MODULE_LLAMA_MODEL_REPO"...