Dimensionality

The number of features or dimensions in an embedding vector. Higher dimensionality can capture more nuance but requires more storage and compute.

In-Depth Explanation

Dimensionality in AI refers to the number of elements in a vector representation, particularly for embeddings. The dimensions jointly encode aspects of meaning, and higher dimensionality allows for more nuanced representations.

Understanding dimensionality:

  • Low-dimensional (e.g., 2-3D): Easy to visualise, limited capacity
  • Medium-dimensional (e.g., 384-768): Good for many tasks, efficient (see the encoding example below)
  • High-dimensional (e.g., 1536-3072): Maximum nuance, more resources
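
To make this concrete, here is a minimal sketch that encodes one sentence and inspects the resulting vector's length. It assumes the sentence-transformers package is installed and uses the BAAI/bge-small-en-v1.5 model, a 384-dimensional embedder mentioned later in this entry:

```python
# Minimal sketch: assumes the sentence-transformers package is installed
# and the BAAI/bge-small-en-v1.5 model (a 384-dimensional embedder).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# encode() returns a numpy array; its length is the embedding dimensionality.
vector = model.encode("Dimensionality is the length of an embedding vector.")
print(vector.shape)  # (384,)
```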

Trade-offs:

  • Higher dimensions: More semantic nuance captured
  • Lower dimensions: Faster search and less storage (see the storage estimate below)
  • Optimal choice: Depends on task complexity and available resources
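
A quick back-of-the-envelope calculation shows how dimensionality drives storage. The sketch below assumes float32 vectors (4 bytes per value) and ignores index overhead:

```python
# Storage cost of a vector store: num_vectors * dims * 4 bytes for float32.
def storage_gb(num_vectors: int, dims: int, bytes_per_value: int = 4) -> float:
    return num_vectors * dims * bytes_per_value / 1024**3

# One million documents at three common dimensionalities.
for dims in (384, 1536, 3072):
    print(f"{dims:>4} dims -> {storage_gb(1_000_000, dims):.2f} GB")
# 384 dims -> 1.43 GB; 1536 -> 5.72 GB; 3072 -> 11.44 GB
```

At a million vectors, moving from 384 to 3072 dimensions multiplies raw storage by eight, before any index structures are added.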

Common embedding dimensions:

  • OpenAI text-embedding-3-small: 1536
  • OpenAI text-embedding-3-large: 3072 (truncatable at request time; see below)
  • Cohere embed-english-v3: 1024
  • BGE-small: 384
  • BGE-large: 1024
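
Some providers let you shorten these native sizes at request time. OpenAI's text-embedding-3 models accept a dimensions parameter that truncates the output. A minimal sketch, assuming the official openai Python SDK and an OPENAI_API_KEY environment variable:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request a 1024-dimensional vector from a model whose native size is 3072.
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Dimensionality trade-offs in embeddings",
    dimensions=1024,
)
print(len(response.data[0].embedding))  # 1024
```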

Dimensionality reduction:

  • PCA or UMAP for visualisation
  • Matryoshka embeddings (truncatable dimensions; sketched below)
  • Trade-off between information preservation and efficiency
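
Matryoshka-style truncation is straightforward to apply: keep the leading dimensions and re-normalise to unit length so cosine similarity stays meaningful. The numpy sketch below uses a random vector as a stand-in for a real embedding; in practice, truncation preserves quality only if the model was trained with Matryoshka representation learning:

```python
import numpy as np

def truncate_embedding(vector: np.ndarray, k: int) -> np.ndarray:
    """Keep the first k dimensions and re-normalise to unit length."""
    truncated = vector[:k]
    return truncated / np.linalg.norm(truncated)

full = np.random.rand(1024).astype(np.float32)  # stand-in for a real embedding
short = truncate_embedding(full, 256)
print(short.shape)  # (256,)
```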

Business Context

Choosing the right embedding dimensionality (typically 384-3072) balances retrieval accuracy against storage costs and search speed.

How Clever Ops Uses This

We help Australian businesses choose appropriate dimensionality for their use cases, balancing quality against infrastructure costs.

Example Use Case

"OpenAI ada-002 embeddings use 1536 dimensions; smaller models may use 384, trading some quality for speed and cost."
