Add llama-cpp-python to a Kubernetes cluster
(1) Use the container image from https://github.com/abetlen/llama-cpp-python/pkgs/container/llama-cpp-python
(2) Mount k8s storage at /models and set the MODEL environment variable to point to the right llama-model.gguf
(3) Expose port 8000 to the outside through a LoadBalancer
(4) Browse to ip:8000/docs for the API…
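
Steps (1) to (3) could be captured as a Deployment plus a LoadBalancer Service. The sketch below builds both as plain Python dicts and prints them as JSON, which kubectl accepts as well as YAML. The image tag `latest`, the PersistentVolumeClaim name `llama-models`, and the resource name `llama-cpp-python` are assumptions for illustration, not taken from the note; the claim is assumed to exist already and to contain llama-model.gguf directly under its root.

```python
import json

# Assumptions (not from the original note): image tag, PVC name, resource name.
IMAGE = "ghcr.io/abetlen/llama-cpp-python:latest"  # assumed tag
PVC_NAME = "llama-models"                          # hypothetical existing claim
MODEL_FILE = "llama-model.gguf"                    # file name from step (2)
PORT = 8000                                        # port from step (3)


def build_manifests(name="llama-cpp-python"):
    """Return a v1 List holding a Deployment and a LoadBalancer Service."""
    labels = {"app": name}
    container = {
        "name": name,
        "image": IMAGE,
        # Step (2): MODEL points at the .gguf file inside the mounted volume.
        "env": [{"name": "MODEL", "value": f"/models/{MODEL_FILE}"}],
        "ports": [{"containerPort": PORT}],
        "volumeMounts": [{"name": "models", "mountPath": "/models"}],
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [container],
                    "volumes": [
                        {
                            "name": "models",
                            "persistentVolumeClaim": {"claimName": PVC_NAME},
                        }
                    ],
                },
            },
        },
    }
    # Step (3): expose port 8000 outside the cluster.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            "selector": labels,
            "ports": [{"port": PORT, "targetPort": PORT}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    print(json.dumps(build_manifests(), indent=2))
```

If this is saved under a hypothetical name such as make_manifests.py, one way to use it would be `python make_manifests.py > llama.json` followed by `kubectl apply -f llama.json`. Once the Service has an external IP (`kubectl get service llama-cpp-python`), step (4) applies.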