Your computer can run it, but you have to split the model's layers across CPU and GPU memory. Your bottleneck will be PCIe speed, which probably won't be a huge issue for a 4080 on smaller quants.
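Rough sketch of what that split looks like with llama-cpp-python, assuming you've got a GGUF quant of the model (the filename and layer count below are placeholders, tune n_gpu_layers to whatever fits in the 4080's 16 GB of VRAM; everything that doesn't fit stays in system RAM):

```python
# Partial GPU offload with llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-q2_k.gguf",  # hypothetical filename for a small quant
    n_gpu_layers=20,  # layers pushed to the GPU; 0 = CPU only, -1 = offload everything
    n_ctx=4096,       # context window; larger values cost more memory
)

out = llm("Explain mixture-of-experts in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```

Tokens per second will mostly be dictated by how many layers end up on the CPU side and how fast your RAM is, since every forward pass has to shuttle activations across PCIe.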
Check out the exo labs blog: https://blog.exolabs.net/day-2/
Short answer: You can't run it locally. It's 670B parameters.
Long(er) answer: check the Reddit thread on r/LocalLLaMA
Ollama doesn't have this model yet.