The component of a transformer that processes input text into internal representations. BERT-style models are "encoder-only" and excel at understanding tasks.
The encoder in the transformer architecture processes input sequences into rich contextual representations. Unlike a decoder, an encoder can look at the entire input simultaneously using bidirectional attention: each token can attend to every other token, regardless of position.
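To make the contrast concrete, here is a minimal PyTorch sketch of the two mask shapes (illustrative only; real implementations also handle padding and batching):

```python
# Illustrative sketch: the attention masks that distinguish an encoder
# from a decoder (True means "may attend to").
import torch

seq_len = 4
# Encoder: bidirectional, every position attends to every other.
encoder_mask = torch.ones(seq_len, seq_len, dtype=torch.bool)
# Decoder: causal, position i attends only to positions 0..i.
decoder_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

print(encoder_mask.int())  # all ones
print(decoder_mask.int())  # lower-triangular
```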
Encoder models like BERT work by:

- tokenizing the input and mapping each token to an embedding plus positional information
- passing the sequence through a stack of bidirectional self-attention and feed-forward layers
- producing one contextual vector per input token, which task-specific heads then consume (see the sketch below)
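A short sketch of that flow, assuming the Hugging Face `transformers` library and the public `bert-base-uncased` checkpoint: the same surface word receives a different vector in each sentence because its representation is built from the full context.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["I deposited cash at the bank.",
             "We sat on the grassy river bank."]

with torch.no_grad():
    for text in sentences:
        inputs = tokenizer(text, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
        # Locate the token "bank" and print part of its vector: the same
        # word gets a different representation in each context.
        idx = inputs.input_ids[0].tolist().index(
            tokenizer.convert_tokens_to_ids("bank"))
        print(text, hidden[idx][:4])
```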
Key characteristics of encoder models:

- bidirectional attention: every token sees both left and right context
- pretrained with masked language modelling (predicting randomly masked tokens), as demonstrated after this list
- output one representation per input token rather than a next-token prediction
- strong at understanding tasks, unsuited to open-ended text generation
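The masked-language-modelling objective is easy to demonstrate with the `transformers` fill-mask pipeline (a sketch, assuming that library is installed); the model predicts the hidden token from context on both sides:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
# BERT fills in [MASK] using both left and right context.
for pred in fill("The encoder reads the whole [MASK] at once."):
    print(pred["token_str"], round(pred["score"], 3))
```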
While decoder-only models (GPT) dominate headlines, encoder models remain crucial for:

- classification and sentiment analysis
- generating embeddings for semantic search and retrieval
- any application where understanding the input matters more than generating text
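Sentiment analysis, for example, takes a few lines with the `transformers` pipeline API; this sketch pins a DistilBERT encoder fine-tuned on SST-2:

```python
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("The new release is a big improvement."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```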
We use encoder models in our RAG pipelines for embedding generation. They're also essential for classification systems that route customer enquiries and categorise documents.
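One hedged sketch of that routing idea, assuming the `sentence-transformers` library and the `all-MiniLM-L6-v2` checkpoint; the route names and descriptions here are invented for illustration:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
# Hypothetical routes: each is described by a short reference text.
routes = {
    "billing": "questions about invoices, payments and refunds",
    "support": "technical problems, bugs and outages",
}
names = list(routes)
route_vecs = model.encode(list(routes.values()), convert_to_tensor=True)

def route(enquiry: str) -> str:
    """Return the route whose description is nearest in embedding space."""
    vec = model.encode(enquiry, convert_to_tensor=True)
    return names[int(util.cos_sim(vec, route_vecs).argmax())]

print(route("I was charged twice this month"))  # expected: "billing"
```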
"Using an encoder model to generate embeddings for semantic search, converting documents and queries into comparable vector representations."