FAQ
1. What is Model Fine-tuning?
Model fine-tuning is the process of further training a pre-trained base language model on a specialized dataset so that it performs better in a specific domain or on a target use case.
2. Which model should I choose for fine-tuning?
- Small models (<=1B parameters) → for testing or light workloads
- Medium models (7B-13B) → balance of performance and cost
- Large models (30B+) → for complex tasks, usually requiring multi-node configurations
- Instruction-tuned models are preferred if your task is prompt-response based
3. How long does fine-tuning take?
The duration depends on:
- Model size (a few hours for small models, several days for very large ones)
- Dataset size
- Your training configuration (hyperparameters and infrastructure)
Typically, the time ranges from a few hours to several days.
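As a rough illustration of how these factors combine, the sketch below uses the common rule of thumb that training costs about 6 floating-point operations per parameter per token. The function name, the 40% utilization default, the 312 TFLOP/s peak figure, and the example sizes are assumptions for illustration, not measurements.

```python
def estimate_training_hours(
    n_params: float,
    n_tokens: float,
    n_gpus: int,
    peak_flops_per_gpu: float,
    utilization: float = 0.4,
) -> float:
    """Rough wall-clock estimate for full fine-tuning, in hours.

    Uses the ~6 * parameters * tokens rule of thumb for total training
    FLOPs (forward + backward). `utilization` is the assumed fraction of
    peak throughput actually achieved; real values vary widely.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_flops = 6.0 * n_params * n_tokens
    effective_flops_per_second = n_gpus * peak_flops_per_gpu * utilization
    return total_flops / effective_flops_per_second / 3600.0


# Hypothetical example: 7B parameters, 100M training tokens (all epochs),
# 4 GPUs with an assumed 312 TFLOP/s peak each.
hours = estimate_training_hours(7e9, 1e8, n_gpus=4, peak_flops_per_gpu=312e12)
print(f"Estimated training time: {hours:.1f} hours")
```

The estimate scales linearly with model size and token count and inversely with GPU count, which is why the same job can take hours on one setup and days on another.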
4. What do you need to prepare before fine-tuning a model?
You need:
- Clean, diverse, and deduplicated data
- Clear objectives for the fine-tuning (e.g., technical support, customer service, content writing)
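The "clean and deduplicated" requirement can be sketched as a small preprocessing pass. This is a minimal example that assumes each training record is a dict with "prompt" and "response" keys; that schema and the helper names are hypothetical. It removes only exact duplicates after normalization, not near-duplicates.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(records: list[dict]) -> list[dict]:
    """Drop incomplete records and exact duplicates (after normalization).

    Each record is assumed to be a dict with "prompt" and "response" keys;
    that schema is hypothetical and should match your own dataset.
    """
    seen: set[str] = set()
    unique: list[dict] = []
    for record in records:
        prompt = normalize(record.get("prompt", ""))
        response = normalize(record.get("response", ""))
        if not prompt or not response:
            continue  # skip examples with a missing prompt or response
        key = hashlib.sha256(f"{prompt}\x1f{response}".encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(record)  # keep the first occurrence as written
    return unique


samples = [
    {"prompt": "How do I reset my password?", "response": "Open Settings."},
    {"prompt": "how do I  reset my password?", "response": "open settings."},
    {"prompt": "What is LoRA?", "response": ""},
]
print(len(deduplicate(samples)))  # prints 1
```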
5. How many GPUs are needed to fine-tune a model?
The number depends on the model size:
- <=1B parameters: 1 GPU (24 GB VRAM) is enough
- 7B models: 2-4 GPUs (40 GB VRAM each)
- 13B models: 4-8 GPUs recommended
- 30B+ models: 8+ GPUs, typically in a multi-node configuration
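These GPU counts follow roughly from memory arithmetic. The sketch below assumes full fine-tuning with the Adam optimizer in mixed precision, where weights, gradients, and optimizer state together take about 16 bytes per parameter. It excludes activations and framework overhead, so real requirements are higher. The function name and the 40 GB example are illustrative assumptions.

```python
import math

# Assumed per-parameter cost for mixed-precision Adam:
# 2 bytes fp16 weights + 2 bytes fp16 gradients
# + 12 bytes fp32 master weights, momentum, and variance.
BYTES_PER_PARAM = 16


def min_gpus_for_model_state(n_params: float, vram_gb_per_gpu: float) -> int:
    """Lower bound on GPUs needed just to hold the model state.

    Assumes the state is sharded evenly across GPUs (ZeRO/FSDP-style
    sharding) and ignores activations, so treat the result as a floor.
    """
    state_gb = n_params * BYTES_PER_PARAM / 1e9
    return max(1, math.ceil(state_gb / vram_gb_per_gpu))


for billions in (1, 7, 13):
    gpus = min_gpus_for_model_state(billions * 1e9, vram_gb_per_gpu=40)
    print(f"{billions}B parameters: at least {gpus} x 40 GB GPU(s)")
```

Under these assumptions a 7B model needs about 112 GB for model state alone, which is why a single 40 GB GPU is not enough for full fine-tuning.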
6. Do I need multiple nodes or just one?
- For small to medium models (up to 13B), a single node with multiple GPUs is sufficient.
- For larger models (30B+), a multi-node configuration is recommended to provide enough combined GPU memory and better training throughput.
7. What is the minimum GPU memory requirement?
- At least 24 GB VRAM per GPU for standard fine-tuning
- With LoRA/QLoRA methods, you can fine-tune on GPUs with 8-16 GB VRAM
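To see why LoRA lowers the memory requirement, the sketch below counts trainable parameters. LoRA freezes the base weights and trains two low-rank matrices per adapted weight matrix, adding rank * (d_in + d_out) parameters each. The dimensions used here (32 layers, hidden size 4096, two adapted square projections per layer, rank 8, a 7B base model) are illustrative assumptions, not values from any specific platform.

```python
def lora_trainable_params(
    n_layers: int,
    d_in: int,
    d_out: int,
    rank: int,
    adapted_matrices_per_layer: int,
) -> int:
    """Count parameters added by LoRA adapters.

    Each adapted d_out x d_in weight gets factors A (rank x d_in) and
    B (d_out x rank), so it contributes rank * (d_in + d_out) parameters.
    """
    per_matrix = rank * (d_in + d_out)
    return n_layers * adapted_matrices_per_layer * per_matrix


trainable = lora_trainable_params(
    n_layers=32, d_in=4096, d_out=4096, rank=8, adapted_matrices_per_layer=2
)
base_params = 7_000_000_000  # assumed size of the frozen base model
print(f"Trainable LoRA parameters: {trainable:,}")
print(f"Share of base model: {trainable / base_params:.4%}")
```

Gradients and optimizer state are kept only for the adapter parameters, so most of the remaining memory goes to the frozen base weights. QLoRA reduces that further by storing the base weights in 4-bit precision.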
8. Does the training dataset size affect hardware requirements?
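Yes. Larger datasets mainly require more storage capacity and host RAM, and they lengthen training; VRAM usage depends mostly on model size, batch size, and sequence length.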
- Dataset < 20 GB → can use Managed volume
- Dataset > 20 GB → requires Dedicated network volume
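The storage guideline can be turned into a quick pre-flight check. In this minimal sketch the function names are hypothetical, 1 GB is assumed to mean 10^9 bytes, and a dataset of exactly 20 GB (a case the guideline does not cover) is assumed to need the larger option.

```python
from pathlib import Path

GB = 10**9          # assumption: decimal gigabytes
THRESHOLD_GB = 20   # from the guideline above


def dataset_size_gb(root: str) -> float:
    """Total size of all regular files under `root`, in gigabytes."""
    total = sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())
    return total / GB


def recommend_volume(size_gb: float) -> str:
    """Map dataset size to a volume type.

    The guideline does not say what to do at exactly 20 GB; this sketch
    assumes the larger option in that case.
    """
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    if size_gb < THRESHOLD_GB:
        return "Managed volume"
    return "Dedicated network volume"


print(recommend_volume(5.0))
print(recommend_volume(35.0))
```

For a dataset on disk, `recommend_volume(dataset_size_gb("path/to/dataset"))` combines both steps; the path shown is a placeholder.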