Bidirectional Encoder Representations from Transformers - a landmark language model from Google that reads text in both directions to understand context.
BERT, released by Google in 2018, was a breakthrough in natural language understanding that fundamentally changed how machines process language. Unlike previous models that read text left-to-right or right-to-left, BERT reads entire sequences bidirectionally, understanding context from both directions simultaneously.
BERT is an encoder-only transformer model, meaning it excels at understanding text rather than generating it. It was trained with two objectives: masked language modelling (randomly hiding words and training the model to predict them from the surrounding context) and next sentence prediction (deciding whether one sentence actually follows another).
Key innovations of BERT include:
- Bidirectional self-attention, so every word is interpreted using context from both its left and its right at once.
- Masked language modelling as a pre-training objective, which lets the model learn from large amounts of unlabelled text.
- Next sentence prediction, which teaches the model relationships between pairs of sentences.
- A pre-train-then-fine-tune workflow, where one pre-trained model is adapted to many downstream tasks with minimal task-specific changes.
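To make the masked language modelling objective concrete, here is a minimal sketch using the Hugging Face transformers library and the original bert-base-uncased checkpoint; the example sentence is purely illustrative.

```python
# Minimal masked language modelling demo with a pre-trained BERT model.
# Requires: pip install transformers torch
from transformers import pipeline

# Load a fill-mask pipeline backed by the original BERT checkpoint.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the hidden token using context from both sides of [MASK].
for prediction in fill_mask("The invoice is due at the end of the [MASK]."):
    print(f"{prediction['token_str']:>10}  score={prediction['score']:.3f}")
```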
BERT remains highly relevant for classification, sentiment analysis, named entity recognition, and generating embeddings for search applications. While GPT models dominate generation tasks, BERT-family models often outperform them on understanding tasks.
In business applications where understanding matters more than generation, BERT-based models are the usual choice for classification, sentiment analysis, and search relevance ranking.
For our Australian business clients, we use BERT-based models extensively for embedding generation in RAG systems and for classification tasks such as email routing and document categorisation.
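As a hedged illustration of embedding generation for search or RAG-style retrieval, the sketch below uses the sentence-transformers library; the model name, documents, and query are illustrative examples, not a description of any specific client system.

```python
# Embedding-based retrieval sketch with a BERT-family encoder.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small BERT-family embedding model

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Invoices are issued on the first business day of each month.",
    "Support is available Monday to Friday, 9am to 5pm AEST.",
]
query = "When do I get billed?"

# Encode the query and documents into dense vectors, then rank by cosine similarity.
doc_embeddings = model.encode(documents, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_embedding, doc_embeddings)[0]

for doc, score in sorted(zip(documents, scores.tolist()), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {doc}")
```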
"Using BERT to classify customer support tickets by urgency and topic, automatically routing them to the right team."