FAQ

1. What is Model Fine-tuning?

Model fine-tuning is the process of further training a pretrained base language model on a specialized dataset so that it performs better in a specific domain or on a target use case.

2. Which model should I choose for fine-tuning?

  • Small models (<=1B parameters) → for testing or light workloads
  • Medium models (7B-13B) → balance of performance and cost
  • Large models (30B+) → for complex tasks, usually requiring multi-node configurations
  • Instruction-tuned models are preferred if your task is prompt-response based

3. How long does fine-tuning take?

The duration depends on:

  • Model size (a few hours for small models, several days for very large ones)
  • Dataset size
  • Your training setup (hyperparameters and infrastructure)

Typically, the time ranges from a few hours to several days.
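
As a rough illustration of how these factors combine, the sketch below uses the common rule of thumb that full training costs about 6 FLOPs per parameter per token. The function name, the 40% utilization figure, and the 312 TFLOPS peak (roughly an A100-class GPU in BF16) are assumptions made for this example, not values from this platform, and real runs will differ.

```python
def estimate_training_hours(n_params, n_tokens, n_gpus,
                            peak_flops_per_gpu, utilization=0.4):
    """Rough wall-clock estimate for full fine-tuning.

    Uses the ~6 * parameters * tokens approximation for total training FLOPs.
    `utilization` is the assumed fraction of peak FLOPs actually achieved.
    """
    total_flops = 6 * n_params * n_tokens
    effective_flops_per_second = n_gpus * peak_flops_per_gpu * utilization
    return total_flops / effective_flops_per_second / 3600


# Hypothetical run: 7B model, 1B training tokens, 4 GPUs at 312 TFLOPS peak.
hours = estimate_training_hours(
    n_params=7e9,
    n_tokens=1e9,
    n_gpus=4,
    peak_flops_per_gpu=312e12,
    utilization=0.4,
)
print(f"Estimated training time: {hours:.1f} hours")
```

Doubling the dataset (tokens) or the model size doubles the estimate, while doubling the GPU count roughly halves it, assuming utilization stays constant.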

4. What do you need to prepare before fine-tuning a model?

You need:

  • Clean, diverse, and deduplicated data.
  • Clear objectives for the fine-tuning (e.g., technical support, customer service, content writing, etc.).
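
Deduplication can be sketched in a few lines. The example below is a minimal illustration, assuming the dataset is a list of text strings and that two records count as duplicates when they match after lowercasing and collapsing whitespace. The helper names are hypothetical, and near-duplicate detection would need a fuzzier method than this exact-match approach.

```python
import hashlib


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def deduplicate(records):
    """Keep the first occurrence of each normalized record, preserving order."""
    seen = set()
    unique = []
    for text in records:
        key = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


samples = [
    "How do I reset my password?",
    "how do I  reset my password?",  # same text, different case and spacing
    "Where is my invoice?",
]
cleaned = deduplicate(samples)
print(len(samples), "->", len(cleaned), "records")
```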

5. How many GPUs are needed to fine-tune a model?

The number depends on the model size:

  • <1B parameters: 1 GPU (24 GB VRAM) is enough
  • 7B models: 2-4 GPUs (40 GB VRAM each)
  • 13B models: 4-8 GPUs recommended
  • 30B+ models: require 8+ GPUs and multi-node configuration
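
The figures above are broadly consistent with a common back-of-the-envelope estimate. The sketch below assumes full fine-tuning with the Adam optimizer in mixed precision, which needs roughly 16 bytes per parameter for weights, gradients, and optimizer state. It ignores activation memory and framework overhead, so treat the result as a lower bound. The function name and the 16-byte constant are assumptions for this example.

```python
import math

# Assumed: fp16 weights (2) + fp16 gradients (2) + fp32 master weights,
# Adam momentum, and Adam variance (4 + 4 + 4) = 16 bytes per parameter.
BYTES_PER_PARAM_FULL_FT = 16


def min_gpus_for_full_finetune(n_params, vram_gb_per_gpu):
    """Lower bound on GPU count needed to hold model and optimizer state."""
    state_gb = n_params * BYTES_PER_PARAM_FULL_FT / 1e9
    return math.ceil(state_gb / vram_gb_per_gpu)


for label, n_params, vram in [("1B", 1e9, 24), ("7B", 7e9, 40),
                              ("13B", 13e9, 40), ("30B", 30e9, 40)]:
    gpus = min_gpus_for_full_finetune(n_params, vram)
    print(f"{label}: at least {gpus} GPU(s) with {vram} GB VRAM each")
```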

6. Do I need multiple nodes or just one?

  • For small to medium models (up to 13B), a single node with multiple GPUs is sufficient.
  • For larger models (30B+), a multi-node configuration is recommended so that the model fits in the combined GPU memory and trains at an acceptable speed.

7. What is the minimum GPU memory requirement?

  • At least 24 GB VRAM per GPU for standard fine-tuning
  • With parameter-efficient methods such as LoRA or QLoRA, you can fine-tune on GPUs with 8-16 GB VRAM
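
To see why LoRA lowers the memory requirement, recall that it freezes each adapted weight matrix and trains two small low-rank matrices in its place. The sketch below counts trainable parameters for a single layer. The 4096 x 4096 layer size and rank 8 are illustrative assumptions, not settings from this platform.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds for one d_in x d_out weight matrix.

    LoRA learns A (d_in x rank) and B (rank x d_out) instead of updating
    the full matrix, so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


full = 4096 * 4096  # parameters updated by full fine-tuning of this layer
lora = lora_trainable_params(4096, 4096, rank=8)
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.2%}")
```

Because gradients and optimizer state are only kept for the trainable parameters, this reduction accounts for most of the memory savings. QLoRA additionally stores the frozen base weights in 4-bit precision.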

8. Does the training dataset size affect hardware requirements?

Yes. Larger datasets require more VRAM, RAM, and storage capacity.

  • Dataset < 20 GB → can use a Managed volume
  • Dataset > 20 GB → requires a Dedicated network volume
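
The storage rule above can be expressed as a small helper. This is only a sketch: the function name is hypothetical, and because the rule does not say what happens at exactly 20 GB, the example assumes the boundary case goes to a Dedicated network volume.

```python
def choose_volume(dataset_gb):
    """Pick a volume type from the dataset size in GB, per the rule above."""
    if dataset_gb < 20:
        return "Managed volume"
    # Assumption: a dataset of exactly 20 GB is treated like a larger one.
    return "Dedicated network volume"


for size in (5, 20, 150):
    print(f"{size} GB -> {choose_volume(size)}")
```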