Hyperparameters
Configuration settings that control the training process, such as learning rate, batch size, and number of epochs. Set before training begins.
In-Depth Explanation
Hyperparameters are configuration values that control how a machine learning model learns, as opposed to parameters, which are learned from data during training. They must be set before training begins and can significantly affect model performance.
Common hyperparameters (illustrated in the sketch after this list):
- Learning rate: The step size for each weight update
- Batch size: Samples processed before each weight update
- Epochs: Complete passes through the training data
- Hidden layers/units: The depth and width of the network
- Dropout rate: Regularisation strength
- Optimiser: Algorithm for updating weights (e.g. Adam, SGD)
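To make these concrete, here is a minimal sketch of where each setting appears in a typical training loop (PyTorch assumed; the values and the toy dataset are illustrative, not recommendations):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters: fixed before training, not learned from data.
LEARNING_RATE = 1e-3   # step size for each weight update
BATCH_SIZE = 32        # samples processed before each update
EPOCHS = 10            # complete passes through the training data
HIDDEN_UNITS = 128     # network architecture choice
DROPOUT_RATE = 0.2     # regularisation strength

# Toy dataset so the sketch is self-contained.
dataset = TensorDataset(torch.randn(512, 20), torch.randint(0, 2, (512,)))
loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

model = nn.Sequential(
    nn.Linear(20, HIDDEN_UNITS),
    nn.ReLU(),
    nn.Dropout(DROPOUT_RATE),
    nn.Linear(HIDDEN_UNITS, 2),
)

# The choice of optimiser (Adam vs SGD) is itself a hyperparameter.
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```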
Common settings for LLM fine-tuning (see the configuration sketch after this list):
- Learning rate (often 1e-5 to 5e-5)
- Batch size (often small, 4-32)
- Epochs (often 1-5 for fine-tuning)
- LoRA rank (adapter size in parameter-efficient fine-tuning)
- Warmup steps (gradual learning-rate ramp-up at the start of training)
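As a rough illustration, these settings might be expressed with the Hugging Face transformers and peft libraries as follows (the values and output directory are placeholders, not recommendations):

```python
from transformers import TrainingArguments
from peft import LoraConfig

# Illustrative fine-tuning hyperparameters.
training_args = TrainingArguments(
    output_dir="finetune-out",       # placeholder path
    learning_rate=2e-5,              # within the common 1e-5 to 5e-5 range
    per_device_train_batch_size=8,   # small batches, often 4-32
    num_train_epochs=3,              # often 1-5 for fine-tuning
    warmup_steps=100,                # ramp the learning rate up gradually
)

# LoRA trains small low-rank adapters instead of all model weights.
lora_config = LoraConfig(
    r=8,               # LoRA rank: adapter capacity vs. memory trade-off
    lora_alpha=16,     # scaling applied to the adapter output
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```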
Hyperparameter tuning approaches (the first two are compared in the sketch after this list):
- Grid search: Try all combinations
- Random search: Sample randomly
- Bayesian optimisation: Use past trial results to pick promising configurations
- Manual tuning: Based on experience
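A minimal sketch contrasting grid and random search; evaluate() here is a hypothetical stand-in for a full training-and-validation run:

```python
import itertools
import random

# Hypothetical search space over three hyperparameters.
space = {
    "learning_rate": [1e-5, 2e-5, 5e-5],
    "batch_size": [4, 8, 16, 32],
    "epochs": [1, 2, 3],
}

def evaluate(config):
    # Stand-in for training a model and returning a validation score.
    return random.random()

# Grid search: evaluate every combination (3 x 4 x 3 = 36 runs here).
grid = [dict(zip(space, combo)) for combo in itertools.product(*space.values())]
best_grid = max(grid, key=evaluate)

# Random search: sample a fixed budget of configurations instead.
samples = [{k: random.choice(v) for k, v in space.items()} for _ in range(10)]
best_random = max(samples, key=evaluate)
```

Random search often matches grid search at a fraction of the cost when only a few hyperparameters really matter, which is why it is a common default when the training budget is limited.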
Business Context
Hyperparameter tuning can significantly impact model performance and training costs. It's a key part of fine-tuning optimisation.
How Clever Ops Uses This
Example Use Case
"Adjusting learning rate to balance training speed and model quality - too high causes instability, too low wastes time and resources."
Related Terms
Related Resources
Training
The process of teaching an AI model by exposing it to data and adjusting its par...
Fine-Tuning
Adapting a pre-trained model to a specific task or domain by training it further...
Gradient Descent
The optimisation algorithm used to train neural networks by iteratively adjustin...
Learning Centre
Guides, articles, and resources on AI and automation.
AI & Automation Services
Explore our full AI automation service offering.
AI Readiness Assessment
Check if your business is ready for AI automation.
