The initial training phase in which models learn general patterns from large datasets. Pre-trained models can then be fine-tuned for specific tasks with far less data.
Pre-training is the first phase of modern AI model development, where models learn general representations from massive datasets before task-specific adaptation.
Pre-training approaches:
- Self-supervised next-token prediction: the model learns to predict each token from the tokens before it (GPT-style autoregressive language modelling; sketched below)
- Masked language modelling: the model predicts deliberately hidden tokens from surrounding context (BERT-style)
- Contrastive learning: the model learns to match paired data, such as images with their captions (CLIP-style)
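To make the most common approach concrete, here is a minimal sketch of the next-token-prediction objective in PyTorch. The tiny LSTM model, vocabulary size, and random token batch are illustrative assumptions for demonstration only, not how production models are built:

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters (assumptions, not from any real model).
vocab_size, embed_dim, seq_len, batch_size = 1000, 64, 32, 8

# A toy autoregressive model: embedding -> LSTM -> vocabulary logits.
class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, embed_dim, batch_first=True)
        self.head = nn.Linear(embed_dim, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)
        x, _ = self.rnn(x)
        return self.head(x)  # logits predicting the next token at each position

model = TinyLM()
optimiser = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# Stand-in for a real corpus: random token IDs.
tokens = torch.randint(0, vocab_size, (batch_size, seq_len))

# Self-supervised objective: predict token t+1 from tokens 0..t.
# No human labels are needed; the text itself supplies the targets.
optimiser.zero_grad()
logits = model(tokens[:, :-1])
targets = tokens[:, 1:]
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
optimiser.step()
print(f"next-token loss: {loss.item():.3f}")
```

The key point is the last few lines: the training signal comes entirely from the data itself, which is what lets pre-training scale to enormous unlabelled corpora.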
Pre-training characteristics:
- Requires massive datasets, often trillions of tokens, and substantial compute
- Uses self-supervised objectives on unlabelled data, so no manual annotation is needed
- Performed once, with the resulting weights reused across many downstream tasks
What pre-trained models learn:
- Language structure: grammar, syntax, and how words relate to one another
- World knowledge encoded in the training text
- General-purpose representations that transfer to new tasks with little additional data
The pre-training + fine-tuning paradigm:
- Pre-train once on broad data to build general capability (expensive, done by the model provider)
- Fine-tune on a small, task-specific labelled dataset to specialise the model (cheap, done by you)
- The result is strong task performance at a small fraction of the cost of training from scratch, as the sketch below illustrates
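As a sketch of the fine-tuning side of the paradigm, assuming the Hugging Face transformers library: load a pre-trained checkpoint, then train briefly on a small labelled dataset. The distilbert-base-uncased checkpoint is just one example of a public pre-trained model, and the two-example sentiment dataset is hypothetical:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Example public checkpoint; any compatible pre-trained model would work.
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# A tiny, hypothetical labelled dataset (1 = positive, 0 = negative).
texts = ["Great service, highly recommend.", "Terrible experience, avoid."]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimiser = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()

# One gradient step: the general language knowledge comes from pre-training;
# fine-tuning only adapts the model to the labelled task.
optimiser.zero_grad()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimiser.step()
print(f"fine-tuning loss: {outputs.loss.item():.3f}")
```

In practice you would fine-tune over many batches for a few epochs, but the structure is the same: the expensive pre-training step is already done when you call from_pretrained.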
Pre-training is why you don't need Google-scale resources for AI. Foundation models like GPT-4 and Claude are pre-trained once; you fine-tune or prompt them for your needs.
We leverage pre-trained foundation models for Australian businesses, applying fine-tuning or retrieval-augmented generation (RAG) when customisation is needed, without incurring pre-training costs.
"GPT-4 pre-trained on trillions of tokens of internet text, learning language patterns that enable it to assist with almost any text task."