However, if you're running llama.cpp in the cloud, you'll definitely want to lock down your firewall first: its built-in server exposes an HTTP API that, unless you configure an API key, accepts requests without authentication. llama.cpp works with most models distributed in the quantized GGUF format. These models can be found ...
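One handy property of GGUF files is that they are easy to identify programmatically: every GGUF file begins with the four magic bytes `GGUF` followed by a little-endian version number. Here is a minimal sketch of a validity check; the helper name `is_gguf` and the temp-file path are illustrative, not part of llama.cpp itself:

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file

def is_gguf(path):
    """Return True if the file at `path` starts with a plausible GGUF header."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return False
    version = struct.unpack("<I", header[4:8])[0]  # little-endian uint32
    return version >= 1

# Quick demo with a fake minimal header (version 3 is current at time of writing).
with open("/tmp/fake.gguf", "wb") as f:
    f.write(GGUF_MAGIC + struct.pack("<I", 3))

print(is_gguf("/tmp/fake.gguf"))  # → True
```

A check like this is useful when scripting model downloads, since it catches truncated or mislabeled files before you hand them to the server.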