An efficient fine-tuning technique that trains only a small number of additional parameters, dramatically reducing compute and storage requirements.
LoRA (Low-Rank Adaptation) is a fine-tuning technique that dramatically reduces the resources needed to customise large language models. Instead of updating all model weights, LoRA freezes the base model and trains small low-rank adapter matrices whose product is added to the frozen weights, modifying model behaviour at a fraction of the cost.
How LoRA works:
- The pretrained weights are frozen; none of the original parameters are updated.
- For each adapted weight matrix W, LoRA learns two much smaller matrices A and B whose product BA has the same shape as W, with a rank r far below W's dimensions.
- The effective weight becomes W + (alpha / r) · BA, so only A and B are trained.
- After training, BA can be merged back into W, so the adapted model runs with no extra inference latency.
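The mechanism above can be sketched in a few lines of NumPy. This is a minimal illustration, not a training implementation: the dimensions, rank, and scaling values are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out = 64, 64   # toy dimensions; real models use thousands
r, alpha = 8, 16       # LoRA rank and scaling factor (illustrative values)

# Frozen pretrained weight: never updated during LoRA training.
W = rng.standard_normal((d_in, d_out))

# Trainable low-rank adapters. A is initialised with small random values
# and B with zeros, so the adapter starts as a no-op (B @ A == 0).
A = rng.standard_normal((r, d_out)) * 0.01
B = np.zeros((d_in, r))

def lora_forward(x):
    """Adapted layer: x @ W plus the scaled low-rank update x @ (B @ A)."""
    return x @ W + (alpha / r) * (x @ B @ A)

x = rng.standard_normal((2, d_in))

# Before any training step the adapted layer matches the frozen layer.
assert np.allclose(lora_forward(x), x @ W)

# After training, the adapter can be folded into W for zero-latency inference.
W_merged = W + (alpha / r) * (B @ A)
```

Only A and B would receive gradient updates during training; W stays untouched, which is why LoRA checkpoints are so small.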
Benefits of LoRA:
- Orders of magnitude fewer trainable parameters than full fine-tuning
- Much lower GPU memory requirements, often fitting on a single GPU
- Adapter checkpoints measured in megabytes rather than gigabytes, so many task-specific adapters can share one base model
- No additional inference latency once the adapter is merged into the base weights
- The frozen base model is left untouched, reducing the risk of catastrophic forgetting
Typical LoRA parameters:
- Rank (r): commonly 4 to 64; higher ranks add capacity at the cost of more trainable parameters
- Alpha: a scaling factor, often set to r or 2r, applied as alpha / r to the adapter's output
- Dropout: a small value such as 0.05 to 0.1 applied to the adapter's input
- Target modules: usually the attention projection matrices, such as the query and value projections
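A quick back-of-the-envelope count shows where the savings come from. The hidden size below is an assumption chosen to resemble a Llama-class model, not a measurement from any specific project:

```python
# Parameter count for one square weight matrix, comparing full fine-tuning
# against LoRA. Hidden size 4096 and rank 8 are illustrative assumptions.
d, r = 4096, 8

full_params = d * d          # full fine-tuning updates every entry of W
lora_params = r * d + d * r  # LoRA trains only A (r x d) and B (d x r)

print(full_params)                # -> 16777216
print(lora_params)                # -> 65536
print(full_params // lora_params) # -> 256
```

For this single matrix, LoRA trains 256x fewer parameters; across a whole model the reported reductions are far larger, since most weights are not adapted at all.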
LoRA makes fine-tuning dramatically cheaper and faster (the original paper reports up to 10,000x fewer trainable parameters and roughly a 3x reduction in GPU memory for GPT-3), making custom model training accessible for most businesses.
We use LoRA extensively for Australian business fine-tuning projects. It enables custom models without the massive compute costs of full fine-tuning.
"Fine-tuning Llama with LoRA on a single GPU instead of an expensive multi-GPU cluster, creating a custom model for your specific use case."