Hyperparameters

Configuration settings that control the training process, such as learning rate, batch size, and number of epochs. Set before training begins.

In-Depth Explanation

Hyperparameters are configuration values that control how a machine learning model learns, as opposed to parameters, which are learned during training. They must be set before training begins and significantly impact model performance.

Common hyperparameters (a code sketch follows this list):

  • Learning rate: How much to adjust weights per update
  • Batch size: Samples processed before updating weights
  • Epochs: Complete passes through training data
  • Hidden layers/units: Network architecture
  • Dropout rate: Regularisation strength
  • Optimiser: Algorithm for updating weights (Adam, SGD)
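
As a minimal sketch of how these settings are fixed in code before any training step runs (assuming PyTorch; the layer sizes and values are illustrative, not recommendations):

    import torch
    from torch import nn

    # Hyperparameters: chosen before training, not learned from data.
    LEARNING_RATE = 1e-3   # step size for each weight update
    BATCH_SIZE = 32        # samples per gradient update (would drive the DataLoader)
    EPOCHS = 10            # full passes over the training set (would drive the loop)
    HIDDEN_UNITS = 128     # network architecture choice
    DROPOUT_RATE = 0.2     # regularisation strength

    # The architecture and regularisation come straight from the values above.
    model = nn.Sequential(
        nn.Linear(20, HIDDEN_UNITS),
        nn.ReLU(),
        nn.Dropout(DROPOUT_RATE),
        nn.Linear(HIDDEN_UNITS, 2),
    )

    # The optimiser itself is a hyperparameter choice (Adam vs SGD),
    # and the learning rate is handed to it up front.
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)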

For LLM fine-tuning (example configuration after the list):

  • Learning rate (often 1e-5 to 5e-5)
  • Batch size (often small, 4-32)
  • Epochs (often 1-5 for fine-tuning)
  • LoRA rank (for efficient fine-tuning)
  • Warmup steps
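
A sketch of what these settings can look like in code, assuming the Hugging Face transformers and peft libraries; the values are illustrative starting points rather than recommendations:

    from transformers import TrainingArguments
    from peft import LoraConfig

    training_args = TrainingArguments(
        output_dir="./checkpoints",
        learning_rate=2e-5,              # within the common 1e-5 to 5e-5 range
        per_device_train_batch_size=8,   # small batches are typical for fine-tuning
        num_train_epochs=3,              # 1-5 passes is usually enough
        warmup_steps=100,                # ramp the learning rate up gradually
    )

    # LoRA hyperparameters for parameter-efficient fine-tuning.
    lora_config = LoraConfig(
        r=8,                # LoRA rank: size of the low-rank update matrices
        lora_alpha=16,      # scaling applied to the LoRA update
        lora_dropout=0.05,  # regularisation within the adapter layers
    )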

Hyperparameter tuning approaches (the first two are compared in the sketch after this list):

  • Grid search: Try all combinations
  • Random search: Sample randomly
  • Bayesian optimisation: Use past results to pick promising configurations
  • Manual tuning: Based on experience
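
Grid and random search are easy to compare in plain Python; evaluate() below is a hypothetical placeholder for a real training-and-validation run:

    import itertools
    import random

    # Search space over two hyperparameters (values illustrative).
    space = {
        "learning_rate": [1e-5, 2e-5, 5e-5],
        "batch_size": [4, 8, 16, 32],
    }

    def evaluate(config):
        """Hypothetical stand-in: train with `config`, return a validation score."""
        return random.random()

    # Grid search: every combination (3 x 4 = 12 runs here); cost grows fast.
    grid = [dict(zip(space, vals)) for vals in itertools.product(*space.values())]
    best_grid = max(grid, key=evaluate)

    # Random search: a fixed budget of sampled configurations instead.
    samples = [{k: random.choice(v) for k, v in space.items()} for _ in range(5)]
    best_random = max(samples, key=evaluate)

Random search often finds good configurations with far fewer runs than grid search when only a few hyperparameters really matter.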

Business Context

Hyperparameter tuning directly affects both model performance and training costs, making it a key part of fine-tuning optimisation.

How Clever Ops Uses This

We handle hyperparameter optimisation for Australian businesses doing custom model training, finding configurations that balance performance and training cost.

Example Use Case

"Adjusting learning rate to balance training speed and model quality - too high causes instability, too low wastes time and resources."

Category

business

Need Expert Help?

Understanding is the first step. Let our experts help you implement AI solutions for your business.
