Data Labelling
The process of adding annotations or tags to data to create training datasets for supervised learning. Labels tell the model what output to predict for each input.
In-Depth Explanation
Data labelling (annotation) adds ground truth labels to data, creating supervised learning datasets. It's often the most time-consuming and expensive part of ML projects.
Labelling types:
- Classification: Assigning categories
- Bounding boxes: Drawing rectangles around objects
- Segmentation: Pixel-level annotation
- Named entity recognition: Tagging text spans with entity types
- Sentiment: Rating emotional tone
- Relationships: Connecting entities
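To make the labelling types above concrete, here is a minimal sketch of how a few of them might be stored as records. The class and field names (`BoundingBox`, `EntitySpan`, `Relation`) and the sample sentence are hypothetical illustrations, not the export format of any particular labelling platform.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """A rectangle drawn around one object in an image (hypothetical schema)."""
    label: str
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height


@dataclass
class EntitySpan:
    """A tagged span of text, stored as character offsets (hypothetical schema)."""
    label: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset

    def text(self, source: str) -> str:
        return source[self.start:self.end]


@dataclass
class Relation:
    """A relationship label connecting two tagged entities."""
    kind: str
    head: EntitySpan
    tail: EntitySpan


# Illustrative sentence; offsets are computed rather than hard-coded.
sentence = "Acme Pty Ltd opened an office in Sydney."
org_start = sentence.index("Acme Pty Ltd")
city_start = sentence.index("Sydney")

org = EntitySpan("ORG", org_start, org_start + len("Acme Pty Ltd"))
city = EntitySpan("LOCATION", city_start, city_start + len("Sydney"))
located_in = Relation("has_office_in", head=org, tail=city)
box = BoundingBox("forklift", x=10, y=20, width=30, height=40)
```

Storing offsets rather than copied text keeps each span tied to its exact position in the source, which matters when the same word appears more than once.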
Labelling approaches:
- Manual: Human annotators
- Crowdsourcing: Distributed workers
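- Automated: A model suggests labels for humans to confirm or correct
- Weak supervision: Programmatic rules that produce approximate, noisy labels
- Active learning: The model selects the most informative samples for humans to label next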
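As an illustration of the weak supervision approach listed above, the sketch below combines a few programmatic rules by majority vote. The rule names, keywords, and the `complaint`/`praise` labels are assumptions made up for this example; real projects typically use many more rules and a more careful way of resolving disagreements.

```python
from collections import Counter

ABSTAIN = None  # a rule returns this when it has no opinion


def lf_mentions_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN


def lf_mentions_broken(text):
    return "complaint" if "broken" in text.lower() else ABSTAIN


def lf_says_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN


LABELLING_FUNCTIONS = [lf_mentions_refund, lf_mentions_broken, lf_says_thanks]


def weak_label(text, labelling_functions=LABELLING_FUNCTIONS):
    """Return the majority label across all rules, or ABSTAIN if none fire or they tie."""
    votes = []
    for rule in labelling_functions:
        vote = rule(text)
        if vote is not ABSTAIN:
            votes.append(vote)
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    # A tie between the top two labels is treated as "no label" and left for a human.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]
```

Items where the rules abstain or disagree are natural candidates for manual annotation, so the two approaches tend to be used together rather than as alternatives.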
Labelling platforms:
- Scale AI, Labelbox, Appen
- Amazon SageMaker Ground Truth
- Open source: Label Studio, CVAT
Business Context
Labelling quality sets a ceiling on model quality: a model trained on inconsistent labels learns those inconsistencies. Budget for labelling as a significant project cost, since it is often 50% or more of data preparation effort.
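One common way to put a number on labelling quality is to have two annotators label the same items and measure their agreement. The sketch below computes Cohen's kappa, which corrects raw agreement for agreement expected by chance. The function name and the sample labels are illustrative assumptions, not part of any specific workflow described here.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items")
    n = len(labels_a)

    # Fraction of items where the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected if each annotator labelled at random
    # according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


annotator_1 = ["cat", "cat", "dog", "dog"]
annotator_2 = ["cat", "dog", "dog", "dog"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

A value near 1 indicates strong agreement and a value near 0 indicates agreement no better than chance; low values usually point to unclear labelling guidelines rather than careless annotators.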
How Clever Ops Uses This
We design efficient labelling workflows for Australian businesses, balancing cost, quality, and speed for AI training data creation.
Example Use Case
"Setting up a labelling workflow where domain experts label 100 examples, then model suggestions accelerate labelling of the remaining 5,000."
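A workflow like this can be sketched as a triage step: a model trained on the expert-labelled examples scores each remaining item, confident predictions become pre-filled suggestions for quick confirmation, and uncertain items go to the experts first. The `triage` function, the 0.9 threshold, and the toy `toy_predict_proba` stand-in below are all assumptions for illustration; in practice the probabilities would come from a real trained model.

```python
def triage(items, predict_proba, confidence_threshold=0.9):
    """Split unlabelled items into confident suggestions and an expert review queue.

    predict_proba is any callable that maps an item to a dict of label -> probability.
    """
    suggestions, review_queue = [], []
    for item in items:
        probabilities = predict_proba(item)
        label, confidence = max(probabilities.items(), key=lambda pair: pair[1])
        record = {"item": item, "suggested_label": label, "confidence": confidence}
        if confidence >= confidence_threshold:
            suggestions.append(record)
        else:
            review_queue.append(record)
    # Least confident items first, so expert time goes where the model is weakest.
    review_queue.sort(key=lambda record: record["confidence"])
    return suggestions, review_queue


def toy_predict_proba(text):
    """Stand-in for a trained model (hypothetical keyword rules, not a real classifier)."""
    lowered = text.lower()
    if "invoice" in lowered:
        return {"invoice": 0.95, "other": 0.05}
    if "receipt" in lowered:
        return {"invoice": 0.05, "other": 0.95}
    return {"invoice": 0.55, "other": 0.45}


documents = ["Invoice #123", "Receipt for coffee", "Misc scan"]
suggestions, review_queue = triage(documents, toy_predict_proba)
```

Suggestions are still confirmed by a person in this sketch; the saving comes from confirming a pre-filled label being faster than labelling from scratch.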
Related Terms
Training Data
The dataset used to train machine learning models. Training data teaches the mod...
Ground Truth
The accurate, verified labels or outcomes used to train and evaluate machine lea...
