feature: Can it support SSE deployment?

The current implementation works well, but it doesn't match real-world usage. To be production-ready, the application needs to support standalone deployment and to serve multiple Kubernetes clusters from a single instance. To make that possible, the kubeconfig should be passed as an input parameter on each LLM API call, rather than being fixed when the server starts.
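To illustrate the request, here is a minimal sketch of what a per-call kubeconfig could look like on the server side. It is not based on this project's code: the names `kubeconfig_file`, `build_get_pods_command`, and `handle_list_pods`, and the `kubeconfig` / `namespace` argument keys, are all hypothetical, and shelling out to `kubectl` is only an assumption about how the tool talks to the cluster. The idea is that each call writes its kubeconfig to a private temporary file, uses it for that one request, and removes it afterwards.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Dict, Iterator, List


@contextmanager
def kubeconfig_file(kubeconfig_text: str) -> Iterator[str]:
    """Write per-call kubeconfig content to a temp file and yield its path.

    The file is deleted when the context exits, so credentials from one
    call are not left behind for the next one.
    """
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(prefix="kubeconfig-", suffix=".yaml")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(kubeconfig_text)
        yield path
    finally:
        os.remove(path)


def build_get_pods_command(kubeconfig_path: str, namespace: str) -> List[str]:
    """Build a kubectl command that targets the cluster in the given kubeconfig."""
    return [
        "kubectl",
        "--kubeconfig", kubeconfig_path,
        "get", "pods",
        "--namespace", namespace,
        "--output", "json",
    ]


def handle_list_pods(arguments: Dict[str, str]) -> List[str]:
    """Hypothetical tool handler: every call carries its own kubeconfig."""
    kubeconfig_text = arguments["kubeconfig"]
    namespace = arguments.get("namespace", "default")
    with kubeconfig_file(kubeconfig_text) as path:
        command = build_get_pods_command(path, namespace)
        # A real handler would execute the command here, while the temp file
        # still exists (for example with subprocess.run). This sketch only
        # returns the command so it can run without a cluster.
        return command
```

With this shape, two consecutive calls can target two different clusters simply by sending different `kubeconfig` values, and the server itself holds no cluster-specific state between calls.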