Deploy an application on Managed GPU Cluster

Ollama is an open-source tool for running, managing, and customizing large language models (LLMs) on a personal computer or server. It supports many models, including Llama, DeepSeek, and Mistral. Open-WebUI is an open-source web interface designed to work with Ollama, providing a user-friendly way to manage and use LLMs.

This guide walks through the steps to deploy the DeepSeek-R1 model on FPT Managed GPU Cluster using Ollama and Open-WebUI so that users can interact with the model in a simple and intuitive way.

Step 1: Clone the Open-WebUI source code and scripts.

git clone https://github.com/open-webui/open-webui
cd open-webui/kubernetes

Step 2: Apply the manifests to deploy Ollama and Open-WebUI. The manifest directory already includes every file needed for deployment: the namespace, the Ollama StatefulSet, the Ollama service, the Open-WebUI deployment, and the Open-WebUI service.

kubectl apply -f ./manifest
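
To confirm that the manifests were applied, it can help to list the pods and services they created. The sketch below assumes the resources live in a namespace named open-webui. That name is an assumption, so set NAMESPACE to whatever the namespace manifest actually defines. The helper name check_deployment is only illustrative.

```shell
# Assumed namespace; adjust to match the namespace manifest.
NAMESPACE="open-webui"

check_deployment() {
  # Pods should reach the Running state once images are pulled.
  kubectl get pods -n "$NAMESPACE"
  # Services show the ports that Ollama and Open-WebUI expose.
  kubectl get svc -n "$NAMESPACE"
}

# Example call:
# check_deployment
```

If the pods stay in Pending, kubectl describe pod on the affected pod usually shows whether a GPU or storage request could not be satisfied.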

Step 3: Open Open-WebUI in a browser on the forwarded port, for example http://localhost:52433. The first time you use Open-WebUI, you will need to set up an account with a name, email, and password.
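
If no port is being forwarded yet, one way to reach the interface from a workstation is kubectl port-forward. This is a minimal sketch, assuming the service is named open-webui-service, listens on port 8080, and lives in the open-webui namespace. All three values are assumptions and should be checked with kubectl get svc. The local port 52433 simply matches the example URL.

```shell
# Assumed names and ports; verify them against your cluster.
NAMESPACE="open-webui"
SERVICE="open-webui-service"
LOCAL_PORT=52433
SERVICE_PORT=8080

forward_webui() {
  # Forwards localhost:LOCAL_PORT to the service port; runs until interrupted.
  kubectl port-forward -n "$NAMESPACE" "svc/$SERVICE" "${LOCAL_PORT}:${SERVICE_PORT}"
}

# Example call (stop with Ctrl+C):
# forward_webui
```

While the forward is running, the interface should be reachable at http://localhost:52433.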

Step 4: After setup is complete, select a model to use. In this example, we install DeepSeek-R1 in its 1.5b size (the deepseek-r1:1.5b tag in the Ollama library).
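
The model can also be pulled from the command line instead of the web interface. The sketch below assumes the Ollama pod is named ollama-0 (the first replica of a StatefulSet named ollama) and runs in the open-webui namespace. Both names are assumptions, so check them with kubectl get pods first. The helper name pull_model is illustrative.

```shell
# Assumed namespace and pod name; verify with `kubectl get pods -A`.
NAMESPACE="open-webui"
OLLAMA_POD="ollama-0"
MODEL="deepseek-r1:1.5b"

pull_model() {
  # Download the model inside the Ollama pod.
  kubectl exec -n "$NAMESPACE" "$OLLAMA_POD" -- ollama pull "$MODEL"
  # List installed models to confirm the download finished.
  kubectl exec -n "$NAMESPACE" "$OLLAMA_POD" -- ollama list
}

# Example call:
# pull_model
```

Once the pull completes, the model should appear in the model selector in Open-WebUI.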

Step 5: Once the model has been downloaded and is running, you can chat with it through the Open-WebUI interface.
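
Besides the web interface, the model can be queried through the Ollama REST API, which may be handy for scripting. This sketch assumes the Ollama service has been made reachable at localhost:11434 (the default Ollama port), for example with kubectl port-forward. The URL is an assumption about your setup, and ask_model is an illustrative helper name.

```shell
# Assumed endpoint; requires the Ollama service to be reachable locally.
OLLAMA_URL="http://localhost:11434"
MODEL="deepseek-r1:1.5b"

ask_model() {
  # $1 is the prompt. Keep it free of double quotes and backslashes,
  # since this sketch does no JSON escaping.
  curl -s "$OLLAMA_URL/api/generate" \
    -d "{\"model\": \"$MODEL\", \"prompt\": \"$1\", \"stream\": false}"
}

# Example call:
# ask_model "Why is the sky blue?"
```

With "stream" set to false, the API returns a single JSON object whose response field holds the generated text.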