Gradient Descent

The optimisation algorithm used to train neural networks by iteratively adjusting weights to minimise the loss function.

In-Depth Explanation

Gradient descent is the core optimisation algorithm that trains neural networks. It iteratively adjusts model weights in the direction that reduces error, like finding the lowest point in a landscape by always walking downhill.

How gradient descent works (a short Python sketch follows these steps):

  1. Calculate loss (error) for current weights
  2. Compute gradients (slope) via backpropagation
  3. Update weights: new = old - learning_rate × gradient
  4. Repeat until convergence
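
To make the four steps concrete, here is a minimal sketch in pure Python. The data, learning rate, and iteration count are illustrative only, not a recommendation:

  # Fit y ≈ w * x by gradient descent on a mean-squared-error loss.
  xs = [1.0, 2.0, 3.0, 4.0]
  ys = [2.0, 4.0, 6.0, 8.0]          # underlying relationship: y = 2x

  w = 0.0                            # starting weight
  learning_rate = 0.01

  for step in range(100):
      # 1. Calculate loss (error) for the current weight
      loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
      # 2. Compute the gradient (slope) of the loss with respect to w
      grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
      # 3. Update the weight: new = old - learning_rate * gradient
      w = w - learning_rate * grad
      # 4. Repeat until the weight stops changing (convergence)

  print(w)                           # approaches 2.0, the true slope

In a real neural network, w is millions or billions of weights and the gradient comes from backpropagation, but the update rule is the same.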

Variants of gradient descent (a sketch combining two of them follows this list):

  • Batch: Use all training data per update (slow, stable)
  • Stochastic (SGD): Use single example per update (noisy, fast)
  • Mini-batch: Use small batches (balance of both)
  • Momentum: Accumulate past gradients as a velocity to smooth updates and escape shallow local minima
  • Adam: Adaptive learning rates per parameter
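
As a rough illustration of how two of these variants change the basic loop, the sketch below combines mini-batch sampling with a momentum term on the same toy problem (all values are made up for illustration):

  import random

  # Mini-batch SGD with momentum (illustrative values only).
  data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
  w, velocity = 0.0, 0.0
  learning_rate, momentum, batch_size = 0.01, 0.9, 2

  for epoch in range(50):
      random.shuffle(data)                       # stochastic: visit examples in random order
      for i in range(0, len(data), batch_size):
          batch = data[i:i + batch_size]         # mini-batch: a few examples per update
          grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
          velocity = momentum * velocity - learning_rate * grad   # momentum: carry over past updates
          w = w + velocity                       # update with the accumulated velocity

Adam extends this idea by also rescaling each parameter's step using a running estimate of its gradient's magnitude.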

Key hyperparameters (illustrated in the configuration sketch after this list):

  • Learning rate: Step size (critical to tune)
  • Batch size: Examples per update
  • Momentum: Weight of previous updates
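
If a framework such as PyTorch is used (shown here purely as an illustration; this entry does not prescribe a specific toolkit), these hyperparameters appear directly as configuration:

  import torch

  model = torch.nn.Linear(10, 1)        # stand-in for any model with trainable weights

  optimizer = torch.optim.SGD(
      model.parameters(),
      lr=0.01,                          # learning rate: step size (critical to tune)
      momentum=0.9,                     # momentum: weight given to previous updates
  )

  batch_size = 32                       # batch size: examples per update, applied when batching the data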

Challenges (the sketch after this list shows a learning rate that is too large):

  • Local minima and saddle points
  • Learning rate selection
  • Training instability
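
The learning-rate challenge is easy to see on the toy problem from the first sketch: a step size that is too large makes training diverge instead of converge (values are illustrative):

  xs = [1.0, 2.0, 3.0, 4.0]
  ys = [2.0, 4.0, 6.0, 8.0]

  for learning_rate in (0.01, 0.2):     # a reasonable step size vs. one that is too large
      w = 0.0
      for step in range(20):
          grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
          w = w - learning_rate * grad
      loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
      print(learning_rate, loss)        # small step converges; large step blows up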

Business Context

Understanding gradient descent helps explain why AI training requires significant compute (many small weight updates over large datasets) and why a poorly chosen learning rate wastes that compute or degrades model quality.

How Clever Ops Uses This

We tune optimisation settings such as learning rate, batch size and optimiser choice for fine-tuning projects, balancing training speed against model quality for Australian business clients.

Example Use Case

"The algorithm adjusts model weights step by step to reduce prediction errors - like finding the bottom of a valley by always walking downhill."

