AI Glossary: Every AI Term Explained in Plain English

Last updated April 7, 2026 | 5 min read

Every AI term you need to know, explained in plain English. Bookmark this page and come back whenever you hit a term you do not recognise.

A

AGI (Artificial General Intelligence) – A hypothetical AI system that can perform any intellectual task a human can. Does not exist yet.

AI Agent – An AI system that can take actions autonomously, such as browsing the web, writing code, or completing multi-step tasks without human intervention at each step.

Alignment – The challenge of ensuring AI systems behave in ways that match human values and intentions. A major focus of AI safety research.

Attention Mechanism – A technique that lets an AI model weigh every part of the input and focus on the most relevant parts when generating output. The foundation of the transformer architecture.
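As a rough illustration of "focusing on the most relevant parts", here is a minimal sketch of scaled dot-product attention for a single query, written in plain Python. The function names and the toy vectors are made up for this example; real models use learned vectors and optimised matrix libraries.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # Relevance score of each input position: dot(query, key) / sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data (illustrative only): three input positions, two dimensions each.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attend([4.0, 0.0], keys, values)
print(weights)  # the second position gets the smallest weight
print(output)
```

The weights always sum to 1, so the output is a blend of the inputs in which the positions most similar to the query count the most.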

B

Benchmark – A standardised test used to measure and compare AI model performance across specific tasks.

Bias – Systematic errors in AI outputs caused by imbalanced or unrepresentative training data. Can lead to unfair or discriminatory results.

C

Chain of Thought – A prompting technique where you ask the AI to show its reasoning step by step, which often improves accuracy on complex problems.

ChatGPT – A conversational AI product by OpenAI, powered by their GPT series of models. It launched in November 2022 and is widely credited with sparking the current AI boom.

Claude – An AI assistant built by Anthropic, known for strong performance in coding, analysis, and following complex instructions.

Context Window – The maximum amount of text an AI model can take into account at one time, covering both the input (including earlier turns of the conversation) and the response. Measured in tokens. Larger windows allow the model to work with more information at once.

D

Deep Learning – A subset of machine learning that uses multi-layered neural networks. Powers most modern AI including image recognition, language models, and speech synthesis.

Diffusion Model – The technology behind AI image generators like Midjourney and DALL-E. Works by learning to gradually remove noise from random static until a coherent image emerges.

E

Embedding – A numerical representation of text, images, or other data that captures its meaning. Used for search, recommendations, and similarity matching.
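To show what "similarity matching" means in practice, the sketch below compares embeddings with cosine similarity. The three-number vectors are invented for illustration; real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # 1.0 means the vectors point the same way; values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-written embeddings, not output from any real model.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.05, 0.95],
}

cat_kitten = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
cat_invoice = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])
print(round(cat_kitten, 3), round(cat_invoice, 3))
```

In this toy setup "cat" scores much closer to "kitten" than to "invoice", which is the property that search and recommendation systems rely on.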

Emergent Behaviour – Capabilities that appear in large AI models that were not explicitly programmed or expected. Larger models sometimes develop abilities that smaller ones lack.

F

Few-Shot Learning – Giving an AI model a few examples of what you want, directly in the prompt, before asking it to perform the task. Improves output quality without retraining.
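A few-shot prompt is just text with worked examples placed before the new input. The sketch below assembles one; the task wording, the example reviews, and the `build_few_shot_prompt` helper are all hypothetical, and the resulting string would be sent to whichever model you use.

```python
def build_few_shot_prompt(examples, new_input):
    # Start with the instruction, then list each worked example.
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # End with the new input and leave the answer blank for the model.
    lines.append(f"Review: {new_input}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("Arrived quickly and works perfectly.", "positive"),
    ("Broke after two days.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Better than I expected.")
print(prompt)
```

Passing an empty list of examples to the same helper produces a zero-shot prompt: the instruction and the new input only.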

Fine-Tuning – Taking a pre-trained AI model and training it further on specific data to improve performance for a particular task or domain.

Foundation Model – A large AI model trained on broad data that can be adapted for many different tasks. GPT-4, Claude, and Gemini are all foundation models.

G

Generative AI – AI systems that create new content such as text, images, music, code, or video. The category that includes ChatGPT, Midjourney, and similar tools.

GPT (Generative Pre-trained Transformer) – The family of large language models developed by OpenAI. GPT-4 and GPT-4o are widely known members of the family.

GPU (Graphics Processing Unit) – The hardware that powers AI training and inference. Originally designed for gaming graphics, GPUs turned out to be ideal for the parallel calculations AI requires. NVIDIA dominates this market.

H

Hallucination – When an AI model generates confident but factually incorrect information. One of the biggest challenges with current language models.

I

Inference – The process of running a trained AI model to generate outputs. When you send a message to ChatGPT and get a response, that is inference.

L

Large Language Model (LLM) – An AI model trained on massive amounts of text data that can understand and generate human language. ChatGPT, Claude, and Gemini are all LLMs.

LoRA (Low-Rank Adaptation) – A technique for fine-tuning large AI models efficiently by only updating a small number of parameters. Makes customisation more accessible and affordable.
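The saving is easiest to see with arithmetic. In the standard LoRA setup, a weight matrix of size d_out × d_in stays frozen and two small matrices of sizes d_out × r and r × d_in are trained instead. The sketch below counts parameters for one such layer; the layer size and rank are illustrative assumptions, not figures for any particular model.

```python
def lora_param_counts(d_out, d_in, rank):
    # Full fine-tuning updates every entry of the weight matrix.
    full = d_out * d_in
    # LoRA trains two thin matrices: (d_out x rank) and (rank x d_in).
    adapter = rank * (d_out + d_in)
    return full, adapter

# Hypothetical layer: 4096 x 4096 weights, LoRA rank 8.
full, adapter = lora_param_counts(4096, 4096, 8)
print(full, adapter, f"{adapter / full:.2%}")  # 16777216 65536 0.39%
```

For this layer, LoRA trains well under 1% of the parameters that full fine-tuning would update.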

M

Machine Learning (ML) – A branch of AI where systems learn from data rather than being explicitly programmed. The broader category that includes deep learning.

Mixture of Experts (MoE) – An architecture where a model contains multiple specialised sub-networks and routes each input to the most relevant ones. Allows larger models to run more efficiently.
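The routing idea can be sketched in a few lines. Here the "experts" are trivial functions and the gate scores are passed in by hand; in a real model both the experts and the router that produces the scores are learned neural networks, so treat every name below as a stand-in.

```python
import math

def top_k_route(gate_scores, k=2):
    # Pick the k experts with the highest gate scores.
    top = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the chosen experts only, so their weights sum to 1.
    m = max(gate_scores[i] for i in top)
    exps = {i: math.exp(gate_scores[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

# Toy experts (hypothetical): each is a simple function of one number.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]

def moe_forward(x, gate_scores, k=2):
    routing = top_k_route(gate_scores, k)
    # Only the selected experts run; the rest are skipped entirely.
    output = sum(weight * experts[i](x) for i, weight in routing.items())
    return output, routing

output, routing = moe_forward(3, [0.1, 2.0, -1.0, 2.0])
print(routing)  # experts 1 and 3, each with weight 0.5
print(output)   # 0.5 * 6 + 0.5 * 9 = 7.5
```

Because only two of the four experts run for this input, the compute cost is roughly half of what running every expert would require, which is where the efficiency gain comes from.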

Multimodal – An AI model that can process and generate multiple types of data such as text, images, audio, and video. GPT-4o and Gemini are multimodal.

N

Natural Language Processing (NLP) – The field of AI focused on enabling computers to understand, interpret, and generate human language.

Neural Network – A computing system inspired by the biological neural networks in the brain. The fundamental building block of modern AI.

O

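Open Source – AI models whose code and weights are publicly available for anyone to use, modify, and deploy. Meta's Llama and Mistral's models are prominent examples with openly available weights, though licence terms vary and some are more accurately described as “open-weight” than fully open source.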

Overfitting – When an AI model performs well on its training data but poorly on new data. Like memorising answers to a test instead of understanding the subject.

P

Parameters – The internal values of an AI model that are adjusted during training. More parameters generally mean more capability but also more computational cost.

Prompt – The text input you provide to an AI model. Prompt quality significantly affects output quality.

Prompt Engineering – The practice of crafting effective prompts to get better results from AI models. Includes techniques like chain-of-thought, few-shot examples, and role-playing.

R

RAG (Retrieval-Augmented Generation) – A technique that combines AI text generation with information retrieval from external sources. Reduces hallucinations by grounding outputs in real data.
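The two steps, retrieve then generate, can be sketched as follows. This toy version ranks documents by shared words, which is a deliberate simplification standing in for the embedding search a real system would typically use. The documents, question, and helper names are invented for the example, and the final prompt would be passed to a language model.

```python
def tokenize(text):
    # Lowercase and strip basic punctuation so words can be compared.
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(question, documents, top_k=1):
    # Rank documents by how many words they share with the question.
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_grounded_prompt(question, documents, top_k=1):
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

# Hypothetical knowledge base.
documents = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available by email on weekdays.",
    "Shipping to Europe takes five to seven business days.",
]
question = "How many days is the refund window?"
prompt = build_grounded_prompt(question, documents)
print(prompt)
```

The model then answers from the retrieved refund policy rather than from memory, which is what "grounding" refers to.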

Reinforcement Learning from Human Feedback (RLHF) – A training method where human reviewers rate AI outputs, and the model learns to produce responses that humans prefer.

S

Scaling Laws – The observed relationship between model size, training data, and performance. Generally, bigger models trained on more data perform better, though this trend may have limits.

Superintelligence – A hypothetical AI system that surpasses human intelligence in virtually every domain. A subject of both research and philosophical debate.

T

Temperature – A setting that controls how random or creative AI outputs are. Low temperature produces more predictable, focused responses. High temperature produces more varied, creative ones.
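Under the hood, temperature is commonly applied by dividing the model's raw scores (logits) by the temperature value before converting them to probabilities. The sketch below uses three made-up scores to show the effect; the function name is illustrative.

```python
import math

def apply_temperature(logits, temperature):
    # Dividing by a small temperature widens the gaps between scores;
    # dividing by a large one narrows them.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
cold = apply_temperature(logits, 0.2)
hot = apply_temperature(logits, 2.0)
print([round(p, 3) for p in cold])  # almost all probability on the top token
print([round(p, 3) for p in hot])   # probability spread more evenly
```

At low temperature the top candidate is chosen nearly every time, while at high temperature the less likely candidates get a realistic chance of being sampled.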

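Token – The basic unit of text that AI models process, which may be a whole word, part of a word, or a punctuation mark. One token is roughly equivalent to 0.75 words in English, so 100 tokens is about 75 words. Models have token limits for both input and output.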
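The 0.75 words-per-token rule of thumb allows a quick back-of-the-envelope estimate. The sketch below is only an approximation: actual counts depend on the model's tokenizer, and the 800-token limit used here is a hypothetical figure.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English text, not an exact figure

def estimate_tokens(text):
    # Count whitespace-separated words, then convert to an approximate token count.
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)

def fits_in_context(text, context_window_tokens):
    return estimate_tokens(text) <= context_window_tokens

sample = " ".join(["word"] * 750)    # a 750-word stand-in document
print(estimate_tokens(sample))       # 1000
print(fits_in_context(sample, 800))  # False
```

For exact counts, use the tokenizer that ships with the specific model you are working with.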

Transformer – The neural network architecture behind virtually all modern large language models. Introduced in the 2017 paper “Attention Is All You Need” by Google researchers.

V

Vector Database – A database optimised for storing and searching embeddings. Used to give AI models access to custom knowledge bases.

Z

Zero-Shot Learning – Asking an AI model to perform a task without providing any examples. Modern LLMs can handle many tasks zero-shot thanks to their broad training data.


Missing a term? Let us know and we will add it. This glossary is updated regularly as new concepts emerge in the AI space.
