Comprehensive AI and ML terminology dictionary. Clear definitions, business context, and practical examples for 65+ essential AI terms and concepts.
The proportion of correct predictions among total predictions. A basic classification metric that can be misleading for imbalanced datasets.
The proportion of correct predictions among total predictions, a basic metric for classification model evaluation.
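Accuracy is easy to compute but, as noted above, misleading when classes are imbalanced. A minimal sketch (the fraud-detection labels are made up for illustration):

```python
# Accuracy can look excellent on an imbalanced dataset even when the
# model is useless. Here a "model" that always predicts the majority
# class (0 = legitimate) scores 99% accuracy while catching zero fraud.

def accuracy(y_true, y_pred):
    """Proportion of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# 99 legitimate transactions, 1 fraudulent one
y_true = [0] * 99 + [1]
y_pred = [0] * 100          # always predict "legitimate"

print(accuracy(y_true, y_pred))  # → 0.99
```

This is why metrics like precision, recall, and F1 are preferred for imbalanced problems.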
A step that an automated workflow performs, such as sending an email, updating a database, calling an API, or creating a document.
Autonomous AI systems that can perceive their environment, make decisions, and take actions to achieve specific goals. Unlike simple chatbots, agents can use tools, access external data, and execute multi-step tasks independently.
The challenge of ensuring AI systems behave according to human intentions and values. Critical for making powerful AI systems safe, helpful, and beneficial.
The principles and practices ensuring AI systems are developed and used in ways that are fair, transparent, and beneficial.
An organisation's preparedness to successfully adopt and benefit from AI, including data, technology, talent, and cultural factors.
A comprehensive plan defining how an organisation will adopt and leverage AI to achieve business objectives.
A cloud-based platform combining spreadsheet simplicity with database power, enabling flexible data organisation and automation.
AWS's fully managed service for accessing foundation models from multiple providers including Anthropic, Meta, and Amazon through a unified API.
The AI safety company behind the Claude family of AI assistants, known for emphasising responsible AI development and constitutional AI.
A set of protocols and tools that allows different software applications to communicate with each other. In AI, APIs enable applications to access AI model capabilities without running the models locally.
Technical documentation describing how to effectively use and integrate with an API.
A server that acts as an entry point for APIs, handling request routing, authentication, rate limiting, and other cross-cutting concerns.
Practices and technologies for protecting APIs from threats including unauthorised access, data breaches, and abuse.
Testing APIs to ensure they function correctly, handle errors gracefully, perform adequately, and are secure.
Strategies for managing changes to APIs while maintaining backwards compatibility for existing clients.
Automation that runs on a user's workstation and works alongside humans, triggered and supervised by the user.
A technique in neural networks that allows models to focus on relevant parts of the input when producing output. It's the core innovation behind transformer models and modern LLMs.
The process of enhancing AI model capabilities by connecting them to external data sources, tools, or knowledge bases. RAG (Retrieval Augmented Generation) is a prime example.
The process of verifying the identity of a user, device, or system attempting to access a resource.
The process of determining what actions or resources an authenticated user or system is permitted to access.
Microsoft's framework for building multi-agent AI systems where multiple AI agents collaborate to solve complex tasks.
Tracking automation performance, health, and outcomes through dashboards, alerts, and analytics to ensure reliable operation.
The return on investment from automation initiatives, measuring the value delivered relative to the costs of implementation and operation.
Validating that automated processes work correctly through systematic testing before and after deployment.
Amazon's managed service for accessing foundation models from multiple providers through a unified API.
Microsoft's hosting of OpenAI models on Azure, providing enterprise-grade security, compliance, and integration with Azure services.
Microsoft Azure's enterprise service for accessing OpenAI models with Azure security, compliance, and regional data residency.
The primary algorithm used to train neural networks by calculating gradients and adjusting weights to minimise errors. It propagates error signals backward through the network.
Processing multiple items or transactions together as a group, typically scheduled for off-peak times.
Processing data in groups or batches at scheduled intervals rather than immediately as it arrives.
Processing multiple requests or data points together in a single operation rather than one at a time. This improves throughput and efficiency in AI systems.
Standardised tests used to evaluate and compare AI model performance across specific tasks or capabilities.
Bidirectional Encoder Representations from Transformers - a landmark language model from Google that reads text in both directions to understand context.
Systematic errors in AI predictions caused by assumptions in the training data or algorithm. Can lead to unfair or inaccurate outputs for certain groups or scenarios.
Data synchronisation where changes in either connected system are reflected in the other, maintaining consistency in both directions.
Overseeing and coordinating software bots in an automation environment, including scheduling, monitoring, and maintenance.
A visual programming platform for building full-featured web applications without writing code.
A documented justification for an AI investment, including expected benefits, costs, risks, and strategic alignment.
Technologies and practices for collecting, integrating, analysing, and presenting business data. Enables data-driven decision making through dashboards, reports, and analytics.
Technologies and practices for collecting, integrating, analysing, and presenting business data to support better decision-making.
Using technology to automate repeatable business processes, reducing manual effort and increasing consistency. Broader than RPA, encompassing workflow systems and integrations.
A discipline for designing, executing, monitoring, and improving business processes to achieve organisational goals.
A dedicated team or function that provides governance, standards, expertise, and support for enterprise automation initiatives.
Storing copies of data or computed results in faster storage to reduce latency and load on the original data source.
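A minimal caching sketch using Python's built-in `functools.lru_cache`; the call counter and `expensive_lookup` name are illustrative stand-ins for a slow database or API call:

```python
# Caching an expensive computation with functools.lru_cache: the
# first call does the work, repeat calls with the same argument
# return the stored result instantly. The call counter shows the
# function body only ran once per distinct input.
from functools import lru_cache

calls = {"count": 0}

@lru_cache(maxsize=128)
def expensive_lookup(key: str) -> str:
    calls["count"] += 1          # stands in for a slow DB or API call
    return key.upper()

print(expensive_lookup("alpha"))  # → ALPHA (computed)
print(expensive_lookup("alpha"))  # → ALPHA (served from cache)
print(calls["count"])             # → 1
```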
A centralised team or framework that provides leadership, best practices, and governance for automation initiatives across an organisation.
A prompting technique that encourages AI models to show their reasoning step-by-step, leading to more accurate results on complex problems.
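A sketch of what a chain-of-thought prompt might look like in code; the exact wording is illustrative, not a prescribed format:

```python
# A minimal chain-of-thought prompt template. The instruction
# "think step by step" nudges the model to show its reasoning
# before committing to a final answer.

def cot_prompt(question: str) -> str:
    return (
        f"Question: {question}\n"
        "Let's think step by step, then state the final answer "
        "on its own line prefixed with 'Answer:'."
    )

prompt = cot_prompt("A shop sells pens at 3 for 2.40. What do 7 pens cost?")
print(prompt)
```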
The structured approach to transitioning individuals, teams, and organisations to a desired future state. Critical for AI adoption where people's roles and processes change.
The structured approach to transitioning individuals, teams, and organisations from current state to a desired future state.
A lightweight, open-source embedding database designed for AI applications, popular for its simplicity and ease of use.
An open-source embedding database designed to be simple to use, particularly popular for development and prototyping AI applications.
Breaking large documents or texts into smaller, manageable pieces for processing. Critical for RAG systems where documents must fit within context windows.
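A toy fixed-size chunker with overlap, a common starting point for RAG pipelines. Real systems often split on sentence or paragraph boundaries instead, and chunk sizes here are characters rather than tokens:

```python
# Split text into overlapping chunks of at most chunk_size characters.
# Overlap helps preserve context that would otherwise be cut at
# chunk boundaries.

def chunk_text(text: str, chunk_size: int = 20, overlap: int = 5):
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

chunks = chunk_text("The quick brown fox jumps over the lazy dog", 20, 5)
print(chunks)  # 3 chunks, each sharing 5 characters with its neighbour
```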
Using AI and analytics to identify customers likely to stop using a product or service, enabling proactive retention efforts.
Business users who create applications or automations using no-code/low-code platforms without traditional programming skills.
Enabling non-IT employees to build applications and automations using low-code/no-code tools with appropriate governance.
Anthropic's family of AI assistants, known for being helpful, harmless, and honest. Claude models excel at analysis, writing, coding, and following nuanced instructions.
An enterprise-focused AI company providing language models, embeddings, and search capabilities through a developer-friendly API.
An enterprise AI company providing LLMs optimised for business applications, with strong focus on retrieval, classification, and embeddings.
The output text generated by a language model in response to a prompt. Also refers to the API endpoint type for generating text continuations.
AI that enables computers to interpret and understand visual information from images and videos. Powers image recognition, object detection, and visual inspection.
The field of AI that enables computers to interpret and understand visual information from images and videos.
Changes in the relationship between input features and target variables over time, causing model predictions to become less accurate.
A table showing predicted vs actual classifications, revealing true positives, false positives, true negatives, and false negatives. Essential for understanding model error patterns.
A table showing the counts of correct and incorrect predictions for each class, revealing detailed model performance.
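The four cells of a binary confusion matrix can be computed directly from paired labels; the example labels are made up for illustration:

```python
# Count the confusion-matrix cells for a binary classifier.
# Convention: 1 = positive class, 0 = negative class.

def confusion_counts(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)  # → 2 1 2 1
```

Precision, recall, and F1 are all derived from these four counts.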
A pre-built integration component that enables applications to communicate with external systems, handling authentication and data formatting.
The maximum amount of text (measured in tokens) that an LLM can process in a single request. This includes both the input prompt and the generated output.
Neural network architecture designed for processing grid-like data such as images. Uses convolution operations to automatically learn spatial features and patterns.
A Python framework for orchestrating role-playing AI agents that work together as a crew to accomplish complex tasks.
The overall perception customers have of their interactions with a company. AI and automation can dramatically improve CX through personalisation, speed, and consistency.
The complete experience a customer has with a company, from initial awareness through purchase and beyond.
The predicted total revenue a customer will generate throughout their entire relationship with a business.
The practice of dividing customers into groups based on common characteristics to enable targeted marketing and personalised experiences.
Automating customer support through chatbots, self-service portals, ticket routing, and AI-assisted agent tools.
A visual display of key metrics and data points that provides at-a-glance understanding of business performance.
Techniques for artificially increasing training data by creating modified versions of existing data.
A centralised inventory of data assets with metadata, enabling discovery, understanding, and governance of organisational data.
The process of detecting and correcting errors, inconsistencies, and inaccuracies in datasets to improve data quality.
Changes in the statistical properties of input data over time that can degrade machine learning model performance.
The framework of policies, processes, and standards for managing data assets. Ensures data is accurate, secure, compliant, and used appropriately across the organisation.
The process of annotating data with labels or tags that machine learning models can learn from.
The process of adding annotations or tags to data to create training datasets for supervised learning. Labels tell the model what output to predict for each input.
A storage repository holding vast amounts of raw data in native format until needed. Unlike warehouses, lakes store unstructured and semi-structured data without predefined schemas.
A centralised repository that stores raw data in its native format, enabling flexible analysis and machine learning workloads.
An architecture combining data lake flexibility with data warehouse reliability and performance.
Tracking the origin, movement, and transformation of data throughout its lifecycle in an organisation.
The ability to track data from its origin through transformations to its final use, showing the complete data journey.
An automated sequence of data processing steps that moves and transforms data from sources to destinations. Essential for keeping AI systems and analytics fed with fresh data.
Protecting personal and sensitive information from unauthorised access, use, and disclosure.
The measure of data fitness for its intended purpose. High-quality data is accurate, complete, consistent, timely, and valid. Critical for AI systems that learn from data.
The degree to which data is accurate, complete, consistent, timely, and fit for its intended use.
Protecting data from unauthorised access, corruption, or theft through technical and administrative controls.
The process of maintaining consistency between data stored in different systems or locations.
Keeping data consistent across multiple systems by automatically copying and updating data when changes occur. Essential for multi-system architectures.
Converting data from one format, structure, or value system to another. Essential for integration when systems use different data models or formats.
The process of converting data from one format or structure to another to meet the requirements of target systems or analysis.
The process of checking data against rules and constraints to ensure accuracy, completeness, and consistency.
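A minimal rule-based validator: each rule is a named predicate, and a record passes only if every rule holds. The field names and rules are illustrative:

```python
# Return the names of all rules a record violates; an empty list
# means the record is valid.

def validate(record, rules):
    return [name for name, check in rules.items() if not check(record)]

rules = {
    "email_present": lambda r: bool(r.get("email")),
    "age_in_range": lambda r: 0 <= r.get("age", -1) <= 120,
}

good = {"email": "a@example.com", "age": 34}
bad = {"email": "", "age": 200}
print(validate(good, rules))  # → []
print(validate(bad, rules))   # → ['email_present', 'age_in_range']
```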
A centralised repository that stores integrated data from multiple sources for reporting and analysis. Optimised for query performance rather than transaction processing.
A centralised repository optimised for analytical queries, storing structured data from multiple sources for business intelligence and reporting.
A unified analytics platform combining data engineering, data science, and machine learning on a lakehouse architecture.
A transformation tool that enables analytics engineers to transform data using SQL, with software engineering best practices.
The component of a transformer model that generates output sequences. GPT-style models are "decoder-only" architectures optimised for text generation.
Machine learning using neural networks with multiple layers. Enables automatic feature learning and powers modern AI breakthroughs in vision, language, and more.
AI models that generate images by gradually removing noise from random patterns. Powers tools like DALL-E, Midjourney, and Stable Diffusion.
Integrating digital technology into all areas of business, fundamentally changing how organisations operate and deliver value. AI and automation are key enablers.
The integration of digital technology into all areas of business, fundamentally changing how organisations operate and deliver value.
The number of features or dimensions in an embedding vector. Higher dimensionality can capture more nuance but requires more storage and compute.
A platform for containerising applications, essential for deploying AI models consistently across different environments.
Using AI and software to automatically create, process, extract data from, and manage documents without manual intervention.
Automatically creating, processing, routing, and managing documents using templates, data extraction, and workflow rules.
Automated sending, sorting, and responding to emails based on triggers, schedules, or content analysis. Includes marketing sequences, transactional emails, and inbox management.
Automatically sending, sorting, responding to, or processing emails based on triggers, rules, or AI-driven understanding.
Numerical vector representations of text, images, or other data that capture semantic meaning. Similar items have similar embeddings, enabling semantic search.
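"Similar items have similar embeddings" is usually measured with cosine similarity. The tiny 3-dimensional vectors below are toy values; real embeddings typically have hundreds or thousands of dimensions:

```python
# Cosine similarity between two vectors: 1.0 means they point the
# same way (semantically similar), values near 0 mean unrelated.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.15]
banana = [0.1, 0.2, 0.95]

print(cosine_similarity(king, queen))   # close to 1
print(cosine_similarity(king, banana))  # much lower
```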
The component of a transformer that processes input text into internal representations. BERT-style models are "encoder-only" and excel at understanding tasks.
A data integration process that extracts data from sources, transforms it to fit operational needs, and loads it into a destination system like a data warehouse.
A data integration process that extracts data from sources, transforms it for analysis, and loads it into a destination system.
Quantitative measures used to assess AI model performance, such as accuracy, precision, recall, F1 score, and perplexity.
Automation triggered by real-time events rather than schedules. Responds immediately when something happens - a form submission, database change, or API call.
Automations triggered in response to specific events or changes in systems, enabling real-time reactive workflows.
Managing situations where automated processes encounter unexpected conditions or errors that prevent normal completion.
Managing cases where automated processes encounter unexpected conditions, errors, or situations outside normal parameters.
The harmonic mean of precision and recall, providing a single metric that balances both. Useful when you need good performance on both false positives and false negatives.
The harmonic mean of precision and recall, providing a single metric that balances both concerns.
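The harmonic mean punishes imbalance between the two metrics, which is the point of F1:

```python
# F1 as the harmonic mean of precision and recall. A model with
# precision 1.0 but recall 0.1 gets F1 ≈ 0.18, not the arithmetic
# mean of 0.55.

def f1_score(precision: float, recall: float) -> float:
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(1.0, 0.1))  # → ~0.18
print(f1_score(0.8, 0.8))  # → 0.8
```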
Facebook AI Similarity Search, a library for efficient similarity search and clustering of dense vectors at scale.
The process of creating and selecting input variables (features) for machine learning models. Good features capture relevant patterns and improve model performance.
A technique where models learn to perform tasks from just a few examples provided in the prompt, without additional training.
Adapting a pre-trained model to a specific task or domain by training it further on specialised data. Creates a new model variant.
An open-source visual tool for building LLM flows and AI agents using a drag-and-drop interface built on LangChain.
Large AI models trained on broad data that can be adapted to many downstream tasks. GPT-4, Claude, and BERT are examples that serve as the foundation for specific applications.
Large AI models trained on broad data that can be adapted to a wide range of downstream tasks.
An LLM capability to output structured requests to external functions or APIs, enabling AI to take actions like searching databases or executing code.
The framework of policies, processes, and accountability structures that guide responsible AI development and use.
AI systems that create new content - text, images, code, audio, or video. Includes LLMs like GPT and Claude, and image generators like DALL-E and Midjourney.
AI systems that can create new content including text, images, audio, video, and code.
Google Cloud's unified ML platform providing access to Google's AI models and tools for building AI applications.
Google Cloud's unified AI platform for building, deploying, and scaling ML models, including access to Gemini and PaLM models.
OpenAI's family of language models that generate human-like text. GPT-4 is currently the most capable version, excelling at reasoning, coding, and analysis.
The optimisation algorithm used to train neural networks by iteratively adjusting weights to minimise the loss function.
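A one-variable sketch of the idea: repeatedly step against the gradient of a loss function until you reach its minimum. The learning rate 0.1 is a hyperparameter chosen for this toy problem:

```python
# Gradient descent minimising f(x) = (x - 3)^2, whose gradient is
# 2 * (x - 3). Starting from x = 0, repeated small steps against
# the gradient converge toward the minimum at x = 3.

def gradient(x):
    return 2 * (x - 3)

x = 0.0
learning_rate = 0.1
for _ in range(100):
    x -= learning_rate * gradient(x)

print(round(x, 4))  # → 3.0
```

Training a neural network is this same loop, with millions or billions of parameters updated at once via backpropagation.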
A Python library for quickly creating web interfaces for machine learning models, particularly popular for AI demos.
A Python library for quickly creating web interfaces for machine learning models with automatic UI generation.
A query language for APIs that lets clients request exactly the data they need. Alternative to REST that reduces over-fetching and enables flexible data retrieval.
A query language for APIs that allows clients to request exactly the data they need, developed by Facebook.
The accurate, verified labels or outcomes used to train and evaluate machine learning models.
Connecting AI model outputs to factual, verified information sources to reduce hallucinations and improve accuracy.
Safety mechanisms and constraints implemented to prevent AI systems from producing harmful, inappropriate, or off-topic outputs.
When an AI model generates plausible-sounding but factually incorrect or fabricated information. A major challenge in deploying AI for business.
The leading platform for sharing and deploying machine learning models, datasets, and applications. Known for the Transformers library.
The leading open-source AI platform providing model hosting, datasets, and the Transformers library for machine learning development.
Automation design that includes human review, approval, or intervention points for handling exceptions or validating AI decisions.
An organisation-wide strategy to automate as many business and IT processes as possible using multiple technologies including AI, RPA, process mining, and low-code platforms.
A business-driven approach to rapidly identify, vet, and automate as many business processes as possible using multiple technologies.
Configuration settings that control the training process, such as learning rate, batch size, and number of epochs. Set before training begins.
AI capability to identify and classify objects, people, text, and other content within images. Powers applications from photo organisation to quality inspection.
An external company that helps organisations plan, build, and deploy AI solutions.
The ability of LLMs to learn and adapt their behaviour based on examples and instructions provided in the prompt, without model updates.
Using a trained model to make predictions or generate outputs on new data. This is the "runtime" phase of AI, as opposed to training.
Fine-tuning a model on examples of following instructions to improve its ability to understand and execute user requests.
Combining RPA with AI capabilities like machine learning, NLP, and computer vision to automate processes that require judgment, learning, or understanding unstructured data.
The combination of RPA with AI capabilities like machine learning and NLP to automate complex processes requiring judgement.
Cloud-based platforms that connect applications, data, and processes across cloud and on-premises environments. Examples include Zapier, Make, and Workato.
Integration Platform as a Service, cloud-based platforms for connecting applications and data across cloud and on-premise systems.
A lightweight data format for storing and exchanging data. Human-readable and easy to parse, JSON is the standard format for API requests and responses.
JavaScript Object Notation, a lightweight data interchange format that is easy for humans to read and machines to parse.
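Round-tripping an API-style payload with Python's built-in `json` module (the payload fields are illustrative):

```python
# dumps() serialises a dict to a JSON string; loads() parses it
# back. Note Python's False becomes JSON's lowercase false.
import json

payload = {"model": "example-model", "temperature": 0.7, "stream": False}

text = json.dumps(payload)
print(text)  # {"model": "example-model", "temperature": 0.7, "stream": false}

parsed = json.loads(text)
print(parsed["temperature"])  # → 0.7
```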
A compact, URL-safe token format for securely transmitting claims between parties. Used for authentication and authorisation in web applications and APIs.
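A JWT is three base64url-encoded segments separated by dots: `header.payload.signature`. The sketch below hand-builds an unsigned token purely to show the structure; in production the signature must always be verified before trusting the payload:

```python
# Decode the payload segment of a JWT (without verification).
import base64
import json

def b64url_encode(data: dict) -> str:
    raw = json.dumps(data, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def decode_payload(token: str) -> dict:
    payload_segment = token.split(".")[1]
    # Restore the '=' padding that base64url tokens strip off.
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

header = b64url_encode({"alg": "none", "typ": "JWT"})
payload = b64url_encode({"sub": "user-123", "role": "admin"})
token = f"{header}.{payload}."  # empty signature, illustration only

print(decode_payload(token))  # → {'sub': 'user-123', 'role': 'admin'}
```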
A structured repository of information that AI systems can query. In RAG systems, this typically contains company documents, FAQs, and domain-specific content.
A network of entities (people, places, concepts) and their relationships, enabling AI to reason about connections and context.
Quantifiable measures used to evaluate success in meeting objectives. For AI projects, KPIs track both technical performance and business outcomes.
Measurable values that demonstrate how effectively an organisation is achieving key business objectives.
An open-source container orchestration platform for automating deployment, scaling, and management of containerised applications.
A popular open-source framework for building LLM applications. Provides abstractions for chains, agents, memory, and integrations with various AI services.
A visual IDE for building and deploying LangChain applications through a flow-based interface with Python extensibility.
A platform from LangChain for debugging, testing, evaluating, and monitoring LLM applications.
The time delay between sending a request and receiving a response from an AI system. Critical for real-time applications.
A methodology for ranking prospects based on their perceived value and likelihood to convert, often enhanced by AI.
Meta's family of open-weight large language models. Llama 2 and 3 offer strong performance and can be self-hosted without per-token API costs.
A data framework for building LLM applications, specialising in connecting custom data to language models. Excellent for RAG applications.
AI models trained on vast amounts of text that can understand and generate human language. GPT-4, Claude, and Llama are leading examples.
An efficient fine-tuning technique that trains only a small number of additional parameters, dramatically reducing compute and storage requirements.
Development platforms that minimise hand-coding through visual interfaces while still allowing code customisation when needed. Bridges no-code and traditional development.
A visual automation platform for connecting apps and designing complex workflows, formerly known as Integromat.
Software and strategies that automate marketing tasks including email campaigns, social media, lead nurturing, and customer journey orchestration.
Software and strategies that automate marketing tasks including email campaigns, lead nurturing, social media, and campaign analytics.
Systems that allow AI to retain and recall information across conversations. Can be short-term (within session) or long-term (across sessions).
A component that stores messages sent between applications, enabling asynchronous communication and decoupling between services.
Data that describes other data, providing context about structure, meaning, origin, and usage.
An architectural style where applications are composed of small, independent services that communicate over network protocols.
Software that connects different applications or systems, handling communication, data transformation, and integration logic between components.
Software that sits between applications, providing common services like messaging, authentication, and data transformation.
A French AI company known for efficient, high-performance open-weight language models that compete with much larger models.
A French AI company known for efficient, high-performance open-weight models that compete with larger proprietary models.
Neural network architecture using multiple specialised "expert" subnetworks with a gating mechanism that routes inputs to the most relevant experts, enabling larger models with efficient compute.
An open-source platform for managing the machine learning lifecycle, including experimentation, reproducibility, and deployment.
A trained AI system that can make predictions or generate outputs. Models encode learned patterns from training data in their parameters.
The infrastructure and processes for deploying trained models to make predictions in production environments.
Architectures where multiple AI agents collaborate, each with specialised roles, to accomplish complex tasks through coordination.
AI models that can process and generate multiple types of data - text, images, audio, and video. GPT-4V and Gemini are multi-modal.
AI systems that can process and generate multiple types of data such as text, images, audio, and video.
An open-source, self-hostable workflow automation platform with both visual builder and code capabilities.
NLP technique that identifies and classifies named entities in text - people, organisations, locations, dates, monetary values, and other specific information.
The branch of AI focused on enabling computers to understand, interpret, and generate human language. Powers chatbots, translation, sentiment analysis, and more.
The field of AI focused on enabling computers to understand, interpret, and generate human language.
A metric measuring customer loyalty and satisfaction based on likelihood to recommend a company to others.
A computing system inspired by biological brains, consisting of interconnected nodes (neurons) that process information in layers.
Platforms that allow building applications and automations without writing code, using visual interfaces and pre-built components.
An all-in-one workspace combining notes, docs, wikis, databases, and project management with AI capabilities.
An authorisation framework that lets users grant limited access to their accounts on one service to another service, without sharing passwords.
Computer vision task that identifies and locates objects within images or video, drawing bounding boxes around each detected item and classifying what it is.
A tool for running large language models locally on your own computer, making LLMs accessible without cloud APIs.
The AI research company behind GPT models, ChatGPT, and DALL-E. A leader in large language model development.
The API and developer platform from OpenAI providing access to GPT models, DALL-E, Whisper, and embedding models for building AI applications.
A standard, language-agnostic format for describing REST APIs, enabling documentation, code generation, and testing.
The ratio of output to input in business operations, often improved through AI automation and optimisation.
Coordinating multiple AI components, models, or agents to work together in a workflow. Managing data flow, error handling, and sequencing.
When a model learns training data too well, including noise and outliers, leading to poor performance on new data.
The learned values (weights and biases) in a neural network that determine its behaviour. LLMs have billions of parameters.
A PostgreSQL extension that adds vector similarity search capabilities, enabling AI applications on existing PostgreSQL infrastructure.
A small-scale preliminary project used to evaluate feasibility, test approaches, and learn before committing to full implementation.
A popular managed vector database optimised for AI applications. Known for ease of use and scalability.
A sequence of data processing or AI steps connected together, where each step's output feeds into the next.
A popular API development platform for designing, testing, documenting, and monitoring APIs.
Microsoft's cloud-based automation platform for creating workflows across Microsoft 365 and third-party applications.
Initial training phase where models learn general patterns from large datasets. Pre-trained models can then be fine-tuned for specific tasks with much less data.
Of all positive predictions, the proportion that were actually positive. High precision means few false positives: when the model says "yes," it's usually right.
The proportion of true positive predictions among all positive predictions, measuring how reliable positive predictions are.
Evaluating business processes to determine their suitability and priority for automation.
Analysing event logs from IT systems to discover, monitor, and improve business processes. Reveals how processes actually work versus how they're supposed to work.
Data-driven analysis of business processes using event logs to discover, monitor, and improve actual process execution.
The input text or instructions given to an AI model to elicit a response. Quality prompts dramatically improve output quality.
The practice of designing and optimising prompts to get better results from AI models. Combines art and science.
A messaging pattern where senders (publishers) send messages to topics without knowledge of receivers (subscribers), who receive messages by subscribing to topics.
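The pattern can be sketched with an in-memory broker; real brokers (RabbitMQ, Kafka, SNS, and similar) add durability, ordering, and delivery guarantees on top of this idea:

```python
# A minimal publish/subscribe bus: publishers send to a topic name,
# and every callback subscribed to that topic receives the message.
# Publisher and subscribers never reference each other directly.
from collections import defaultdict

class PubSub:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)

bus = PubSub()
received = []
bus.subscribe("orders", received.append)
bus.subscribe("orders", lambda m: print("order event:", m))

bus.publish("orders", {"id": 1, "status": "created"})
print(received)  # → [{'id': 1, 'status': 'created'}]
```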
An even more efficient fine-tuning technique that combines quantisation with LoRA, enabling fine-tuning of large models on consumer hardware.
Reducing the precision of model weights (e.g., from 32-bit to 4-bit) to decrease memory usage and increase speed with minimal quality loss.
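A sketch of symmetric 8-bit quantisation on a handful of made-up weights: map floats into integers in [-127, 127] using one scale factor, then dequantise back. The reconstruction is close but not exact; that small error is the quality loss traded for roughly 4x less memory than 32-bit floats:

```python
# Symmetric int8 quantisation: scale so the largest-magnitude
# weight maps to ±127, round to integers, and invert on the way back.

def quantize(weights):
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

weights = [0.52, -1.3, 0.07, 0.9]
q, scale = quantize(weights)
restored = dequantize(q, scale)

print(q)          # small integers instead of floats
print(restored)   # close to the original weights
```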
A request sent to an AI system or database to retrieve information or generate a response. In RAG, queries trigger retrieval from the knowledge base.
A technique that enhances LLM responses by first retrieving relevant information from a knowledge base, then using it to generate accurate, grounded answers.
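A toy version of the retrieval step: score each document by word overlap with the query, pick the best match, and prepend it to the prompt so the model answers from retrieved context. Real systems use embeddings and a vector database instead of word overlap, and the documents here are invented examples:

```python
# Naive retrieval: rank documents by how many query words they share.

def retrieve(query: str, documents: list[str]) -> str:
    query_words = set(query.lower().split())
    return max(documents, key=lambda d: len(query_words & set(d.lower().split())))

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is open Monday to Friday.",
    "Support tickets are answered within 24 hours.",
]

context = retrieve("how long do refunds take", docs)
prompt = (
    f"Answer using only this context:\n{context}\n\n"
    "Question: how long do refunds take"
)
print(context)  # the refunds document wins
```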
Controlling the number of requests a client can make to an API within a specified time period to prevent abuse and ensure fair usage.
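One common implementation is the token bucket: the bucket holds up to `capacity` tokens and refills at `rate` tokens per second, each request spends one token, and requests are rejected when the bucket is empty. A sketch with time injected as a parameter so the logic is easy to follow:

```python
class TokenBucket:
    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=2, rate=1.0)  # burst of 2, then 1 request/second
print([bucket.allow(0.0) for _ in range(3)])  # → [True, True, False]
print(bucket.allow(1.0))                      # → True (refilled after 1s)
```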
Data that is delivered and processed immediately or with minimal delay as it is generated.
Processing data or transactions immediately as they occur, enabling instant responses and up-to-date information.
Of all actual positive cases, the proportion the model identified. High recall means few false negatives: the model finds most of the positive cases.
The proportion of actual positive cases that were correctly identified, measuring how completely positives are found.
A neural network designed to process sequential data by maintaining an internal state. Used for time series, text, and other sequential tasks before transformers became dominant.
A machine learning paradigm where agents learn by interacting with an environment, receiving rewards or penalties for actions. Used in robotics, games, and optimisation.
Machine learning where an agent learns to make decisions by taking actions and receiving rewards or penalties.
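The core learning step in many RL algorithms is a temporal-difference update: nudge the value of the action taken toward the reward received plus the discounted value of the best next action. A minimal Q-learning sketch on a made-up two-state problem:

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning update: move Q(s, a) toward
    reward + gamma * max_a' Q(s', a')."""
    best_next = max(q[next_state].values())
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])

# Two states, two actions, all values initialised to zero.
q = {"s0": {"left": 0.0, "right": 0.0},
     "s1": {"left": 0.0, "right": 0.0}}
q_update(q, "s0", "right", reward=1.0, next_state="s1")  # Q(s0, right) -> 0.5
```

Repeated over many interactions, these updates let the agent estimate which actions lead to the most long-term reward.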
A platform for running machine learning models in the cloud via API, making it easy to deploy open-source models without managing infrastructure.
An architectural style for web APIs using standard HTTP methods. The most common way to build and consume web services, enabling system-to-system communication.
An architectural style for web APIs using HTTP methods to perform operations on resources, the most common approach for modern web services.
A low-code platform for building internal tools and admin panels by connecting to databases and APIs.
The process of finding and fetching relevant information from a database or knowledge base in response to a query.
A technique to fine-tune AI models using human preferences, making outputs more helpful, harmless, and aligned with human values.
Software robots that automate repetitive, rule-based tasks by mimicking human interactions with digital systems. Works with existing applications without API integration.
Software robots that automate repetitive, rule-based tasks by mimicking human interactions with digital systems.
A measure of profitability that compares the gain from an investment to its cost. For AI projects, ROI considers cost savings, revenue increases, and implementation costs.
The process of calculating return on investment by comparing the gains from an investment against its costs.
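The calculation itself is simple; the hard part is estimating the gain. A sketch with made-up project figures:

```python
def roi(gain, cost):
    """Return on investment as a percentage: (gain - cost) / cost * 100."""
    return (gain - cost) / cost * 100

# Hypothetical automation project: 60k in annual savings, 40k total cost.
project_roi = roi(gain=60_000, cost=40_000)  # 50.0 (%)
```

For AI projects, `gain` should bundle cost savings and revenue increases, while `cost` should include implementation, training, and ongoing maintenance, not just licence fees.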
Software that executes business rules to automate decisions, separating decision logic from application code.
Automating sales tasks like lead assignment, follow-up sequences, proposal generation, and CRM updates to increase efficiency.
A visual representation of where prospects are in the sales process, from initial contact to closed deal.
The ability of a system or process to handle growing amounts of work or to be enlarged to accommodate growth.
Automation that runs at predetermined times rather than in response to events. Used for batch processing, reports, maintenance tasks, and regular synchronisation.
Automations triggered by time-based schedules rather than events, running at defined intervals or specific times.
Search that understands meaning and intent rather than just matching keywords. Uses embeddings to find conceptually similar content.
An NLP technique that determines the emotional tone of text - positive, negative, or neutral. Used for analysing customer feedback, social media, and reviews.
The use of NLP to identify and extract subjective information, determining whether text expresses positive, negative, or neutral sentiment.
Finding items in a database that are most similar to a query, typically using vector distance calculations on embeddings.
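A minimal sketch using cosine similarity over toy 3-dimensional embeddings (real embeddings have hundreds or thousands of dimensions, and vector databases use approximate indexes rather than a full sort):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, items, k=1):
    """Rank stored embeddings by similarity to the query vector."""
    ranked = sorted(items,
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

items = [("refund policy",   [0.9, 0.1, 0.0]),
         ("opening hours",   [0.0, 0.2, 0.9]),
         ("returns process", [0.8, 0.3, 0.1])]
result = nearest([1.0, 0.2, 0.0], items, k=2)
```

Here the query vector points in nearly the same direction as "refund policy" and "returns process", so those rank ahead of "opening hours".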
A cloud-native data warehouse platform offering scalable storage and compute for analytics and AI workloads.
Sending AI model output incrementally as it's generated rather than waiting for the complete response. Improves perceived latency.
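The mechanics can be sketched with a generator that yields chunks as they become available, so the caller can render partial output immediately instead of waiting for the whole response:

```python
def stream_tokens(text, delay=0.0):
    """Yield a response word by word instead of all at once, so the
    caller can display partial output as it arrives."""
    for word in text.split():
        yield word + " "
        # In a real API client this is where network latency sits.

chunks = list(stream_tokens("The report is ready."))
reply = "".join(chunks)
```

Production APIs deliver the same idea over server-sent events or chunked HTTP responses; the user sees text appearing token by token even though total generation time is unchanged.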
A Python framework for quickly building and sharing web applications for machine learning and data science.
A Python framework for creating web applications and data dashboards quickly, popular for AI/ML demos and tools.
Data organised in a predefined format with clear schema, typically stored in databases with rows and columns.
A machine learning approach where models learn from labelled training data. The algorithm learns to map inputs to known outputs, enabling predictions on new, unseen data.
Machine learning where models learn from labelled training data to predict outcomes for new data.
Artificially generated data that mimics real data characteristics. Used when real data is scarce, sensitive, or expensive to obtain for AI training.
Artificially generated data that mimics real data characteristics while preserving privacy and enabling use cases where real data is scarce or sensitive.
A parameter controlling randomness in AI outputs. Lower temperature (0-0.3) gives consistent, focused responses; higher (0.7-1.0) gives more creative, varied ones.
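Under the hood, temperature divides the model's logits before the softmax that produces token probabilities. A small sketch showing why low temperatures give focused output and high temperatures give varied output:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature before softmax: lower temperatures
    sharpen the distribution, higher ones flatten it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                         # made-up token scores
cold = softmax_with_temperature(logits, 0.2)     # near-deterministic
warm = softmax_with_temperature(logits, 1.0)     # more varied sampling
```

At temperature 0.2 almost all probability mass lands on the top token; at 1.0 the alternatives keep meaningful probability, so sampling produces more varied text.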
The NLP task of assigning predefined categories to text. Used for spam detection, sentiment analysis, topic categorisation, and intent recognition.
The time between starting an investment and realising measurable benefits. Critical for AI projects where stakeholders expect results within reasonable timeframes.
The duration between starting an initiative and realising measurable business value from it.
The process of breaking text into smaller units (tokens) that AI models can process. Tokens might be words, subwords, or characters depending on the tokenizer.
The process of breaking text into smaller units (tokens) that AI models can process and understand.
The basic units of text that LLMs process. Roughly 1 token = 4 characters or 0.75 words in English. Both input and output are measured in tokens.
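The 4-characters-per-token rule of thumb gives a quick cost estimate before calling an API; a real tokenizer is needed for exact counts. A hedged sketch:

```python
def estimate_tokens(text):
    """Rough English token estimate from the ~4 characters-per-token
    rule of thumb; only a real tokenizer gives exact counts."""
    return max(1, round(len(text) / 4))

prompt = "Summarise the attached quarterly report in three bullet points."
tokens = estimate_tokens(prompt)  # 63 characters -> roughly 16 tokens
```

Because both input and output are billed in tokens, estimates like this are useful for budgeting long prompts and setting sensible output limits.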
The complete cost of acquiring, deploying, and operating an AI system over its lifetime. Includes obvious costs like software and hidden costs like training and maintenance.
The complete cost of acquiring, operating, and maintaining a system over its entire lifecycle, including hidden costs.
The process of teaching an AI model by exposing it to data and adjusting its parameters to minimise errors.
The dataset used to train machine learning models. Training data teaches the model patterns and relationships it will apply to new, unseen data.
Applying knowledge learned from one task to a different but related task. This allows models to achieve good performance with less training data and compute.
A technique where a model trained on one task is adapted for a different but related task.
The neural network architecture behind modern LLMs. Uses attention mechanisms to process sequences in parallel, enabling training on massive datasets.
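The attention core of the architecture fits in a few lines of NumPy. This is a sketch of single-head scaled dot-product attention on made-up data; real transformers stack many heads and layers:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query scores every key; a softmax over the scaled scores
    weights a sum over the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over each row of scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 tokens, embedding dimension 4
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(Q, K, V)
```

Because every token attends to every other token in one matrix multiply, the whole sequence is processed in parallel, which is what makes training on massive datasets practical.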
An event that initiates an automated workflow or action. Common triggers include form submissions, schedule times, data changes, emails, and webhooks.
Automation that runs independently on servers without human intervention, typically triggered by schedules or events.
Data without a predefined format or schema, such as text documents, images, audio, and video.
Machine learning where models find patterns in data without labelled examples. The algorithm discovers hidden structures, clusters, or relationships autonomously.
Machine learning where models find patterns in data without labelled examples or predefined outcomes.
A specific scenario describing how AI will be used to solve a business problem or enable a capability.
A list of numbers representing data in multi-dimensional space. In AI, vectors (embeddings) encode semantic meaning of text or other data.
A specialised database optimised for storing and searching vector embeddings. Essential for RAG and semantic search applications.
The process of evaluating and choosing AI technology providers, platforms, or implementation partners.
An open-source vector database that combines vector search with traditional filtering, designed for AI applications.
An open-source vector database designed for AI applications, featuring built-in vectorisation, semantic search, and hybrid search capabilities.
An HTTP callback that sends real-time data when events occur. Instead of polling for changes, systems push notifications to your endpoint automatically.
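Because webhook endpoints are publicly reachable, receivers commonly verify an HMAC signature sent alongside each delivery. A sketch of that check, with a hypothetical shared secret and payload:

```python
import hashlib
import hmac

def verify_signature(payload: bytes, secret: bytes, signature: str) -> bool:
    """Check an HMAC-SHA256 webhook signature so the receiver can confirm
    the delivery really came from the sender."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison prevents timing attacks.
    return hmac.compare_digest(expected, signature)

secret = b"shared-secret"                    # hypothetical shared secret
payload = b'{"event": "invoice.paid"}'       # hypothetical event body
signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
ok = verify_signature(payload, secret, signature)
```

Providers differ in header names and signature formats, but the pattern of signing the raw body with a shared secret is near-universal.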
The numerical values in neural networks that are learned during training. They determine how strongly inputs influence outputs.
An ML platform for experiment tracking, model management, and collaboration in machine learning projects.
A popular MLOps platform for experiment tracking, model visualisation, and team collaboration in machine learning projects.
A defined sequence of automated tasks that accomplish a business process. AI workflows combine multiple AI and non-AI steps.
Software that executes and manages business workflows, routing tasks, handling approvals, and tracking progress through multi-step processes.
A reusable, pre-configured workflow pattern that can be customised for specific use cases, accelerating automation development.