FPT Kubernetes Engine with GPU
FPT Cloud provides Kubernetes with NVIDIA GPU support, featuring the following key capabilities:
- Flexible GPU configuration with multiple GPU types and selectable GPU memory sizes, applied per Worker Group.
- Automatic GPU resource management and allocation in Kubernetes using the NVIDIA GPU Operator.
- GPU visualization and monitoring with NVIDIA DCGM.
- Automatic container and node scaling with the Autoscaler as applications increase or decrease their GPU resource usage.
- GPU sharing with NVIDIA Multi-Instance GPU (MIG) to optimize GPU resource utilization and cost.
FPT Cloud uses the NVIDIA GPU Operator, which automatically manages all the software components required to use GPUs on Kubernetes. With the GPU Operator, users can request GPU resources just like CPU resources in a Kubernetes cluster. Operator components include:
- NVIDIA Drivers (CUDA, MIG, …)
- NVIDIA Device Plugin
- NVIDIA Container Toolkit
- NVIDIA GPU Feature Discovery
- NVIDIA Data Center GPU Manager (Monitoring)
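To illustrate what requesting a GPU "just like CPU" can look like, here is a minimal sketch that builds a Pod manifest as JSON (which `kubectl apply -f` accepts). It assumes the NVIDIA Device Plugin's standard extended resource name `nvidia.com/gpu`; the helper `gpu_pod_manifest`, the Pod name, and the container image are illustrative assumptions, not FPT Cloud specifics.

```python
import json


def gpu_pod_manifest(name, image, gpu_count=1, resource_name="nvidia.com/gpu"):
    """Build a minimal Pod manifest that requests GPUs as an extended resource.

    `resource_name` defaults to the name the NVIDIA Device Plugin normally
    advertises; confirm the value reported by your own nodes.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # nvidia-smi lists the GPU devices visible in the container.
                    "command": ["nvidia-smi"],
                    # Extended resources are requested under limits,
                    # the same place a CPU or memory limit would go.
                    "resources": {"limits": {resource_name: gpu_count}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # The image name below is an assumption; use any CUDA-capable image.
    manifest = gpu_pod_manifest("gpu-smoke-test", "nvidia/cuda:12.4.1-base-ubuntu22.04")
    print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with `kubectl apply -f` should schedule the Pod onto a GPU worker, provided the cluster advertises that resource name.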
In the Hanoi and Saigon regions, FPT Cloud currently supports Kubernetes with NVIDIA A30 GPUs with the following MIG profiles:
| No. | A30 MIG profile | Strategy | Number of instances | Instance resource |
|---|---|---|---|---|
| 1 | all-1g.6gb | single | 4 | 1g.6gb |
| 2 | all-2g.12gb | single | 2 | 2g.12gb |
| 3 | all-balanced | mixed | 2 / 1 | 1g.6gb / 2g.12gb |
| 4 | none (no label) | none | 0 | 0 (Entire) |
In the Hanoi 2 and Japan regions, FPT Cloud currently supports Kubernetes with NVIDIA H100 and NVIDIA H200 GPUs with the following MIG profiles:
| No. | H100 SXM5 MIG profile | Strategy | Number of instances | Instance resource |
|---|---|---|---|---|
| 1 | all-1g.10gb | single | 7 | 1g.10gb |
| 2 | all-1g.20gb | single | 4 | 1g.20gb |
| 3 | all-2g.20gb | single | 3 | 2g.20gb |
| 4 | all-3g.40gb | single | 2 | 3g.40gb |
| 5 | all-4g.40gb | single | 1 | 4g.40gb |
| 6 | all-7g.80gb | single | 1 | 7g.80gb |
| 7 | all-balanced | mixed | 2 / 1 / 1 | 1g.10gb / 2g.20gb / 3g.40gb |
| 8 | none (no label) | none | 0 | 0 (Entire) |
| No. | H200 SXM5 MIG profile | Strategy | Number of instances | Instance resource |
|---|---|---|---|---|
| 1 | all-1g.18gb | single | 7 | 1g.18gb |
| 2 | all-1g.35gb | single | 4 | 1g.35gb |
| 3 | all-2g.35gb | single | 3 | 2g.35gb |
| 4 | all-3g.71gb | single | 2 | 3g.71gb |
| 5 | all-4g.71gb | single | 1 | 4g.71gb |
| 6 | all-7g.141gb | single | 1 | 7g.141gb |
| 7 | all-balanced | mixed | 2 / 1 / 1 | 1g.18gb / 2g.35gb / 3g.71gb |
| 8 | none (no label) | none | 0 | 0 (Entire) |
Example: If you select the all-1g.6gb profile (strategy single), each A30 GPU on the worker is divided into 4 MIG devices, each with 1/4 of the physical GPU's compute resources and 6 GB of GPU memory. If you select the all-1g.10gb profile (strategy single), each H100 GPU on the worker is divided into 7 MIG devices, each with 1/7 of the physical GPU's compute resources and 10 GB of GPU memory.
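How a workload asks for one of these MIG devices depends on the strategy. The sketch below assumes the NVIDIA Device Plugin's standard naming, where MIG devices appear as `nvidia.com/gpu` under the single strategy and as `nvidia.com/mig-<profile>` under the mixed strategy. The helper names `parse_mig_profile` and `mig_resource_name` are hypothetical; check the resource names on your own nodes (for example with `kubectl describe node`) before relying on them.

```python
import re
from typing import Tuple


def parse_mig_profile(profile: str) -> Tuple[int, int]:
    """Split an instance resource such as '1g.6gb' into (compute slices, memory in GB)."""
    match = re.fullmatch(r"(\d+)g\.(\d+)gb", profile)
    if match is None:
        raise ValueError(f"not a MIG instance resource: {profile!r}")
    return int(match.group(1)), int(match.group(2))


def mig_resource_name(strategy: str, profile: str) -> str:
    """Return the Kubernetes resource name a Pod would request.

    Assumes NVIDIA Device Plugin defaults: 'single' and 'none' expose
    devices as nvidia.com/gpu, 'mixed' exposes one resource per profile.
    """
    if strategy == "mixed":
        parse_mig_profile(profile)  # validate the profile string
        return f"nvidia.com/mig-{profile}"
    if strategy in ("single", "none"):
        return "nvidia.com/gpu"
    raise ValueError(f"unknown MIG strategy: {strategy!r}")


if __name__ == "__main__":
    print(parse_mig_profile("1g.6gb"))
    print(mig_resource_name("single", "1g.6gb"))
    print(mig_resource_name("mixed", "2g.12gb"))
```

Under the single strategy every MIG device on a worker has the same size, so one generic resource name is enough; under mixed, the profile is part of the name so a Pod can pick the size it needs.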
Note: The MIG configuration applies to every GPU card attached to the worker. All worker groups in the same cluster must use the same MIG strategy type (single, mixed, or none).
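The same-strategy rule can be checked before creating worker groups. The following is a minimal sketch that derives the strategy type from the profile names in the tables above (`all-balanced` is mixed, other `all-*` profiles are single, no profile is none). The function names and the worker-group dictionary layout are assumptions for illustration, not an FPT Cloud API.

```python
from typing import Dict, List, Optional


def strategy_of(profile: Optional[str]) -> str:
    """Map a MIG profile name from the tables above to its strategy type."""
    if profile is None or profile == "none":
        return "none"
    if profile == "all-balanced":
        return "mixed"
    if profile.startswith("all-"):
        return "single"
    raise ValueError(f"unknown MIG profile: {profile!r}")


def cluster_strategy(worker_groups: Dict[str, Optional[str]]) -> str:
    """Return the single strategy type shared by all worker groups.

    `worker_groups` maps a (hypothetical) worker group name to its MIG
    profile, or None when no MIG profile is set. Raises ValueError when
    the groups do not all share one strategy type.
    """
    if not worker_groups:
        raise ValueError("no worker groups given")
    groups_by_strategy: Dict[str, List[str]] = {}
    for group, profile in worker_groups.items():
        groups_by_strategy.setdefault(strategy_of(profile), []).append(group)
    if len(groups_by_strategy) > 1:
        raise ValueError(f"conflicting MIG strategy types: {groups_by_strategy}")
    return next(iter(groups_by_strategy))


if __name__ == "__main__":
    # Two different profiles, both of type 'single': allowed.
    print(cluster_strategy({"wg-a": "all-1g.6gb", "wg-b": "all-2g.12gb"}))
```

Mixing, for example, `all-1g.6gb` (single) with `all-balanced` (mixed) in one cluster makes `cluster_strategy` raise a `ValueError` that lists the conflicting groups.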