Every AI term you need to know, explained in plain English. Bookmark this page and come back whenever you hit a term you do not recognise.
A
AGI (Artificial General Intelligence) – A hypothetical AI system that can perform any intellectual task a human can. Does not exist yet.
AI Agent – An AI system that can take actions autonomously, such as browsing the web, writing code, or completing multi-step tasks without human intervention at each step.
Alignment – The challenge of ensuring AI systems behave in ways that match human values and intentions. A major focus of AI safety research.
Attention Mechanism – A technique that lets an AI model weigh the most relevant parts of the input when generating each piece of output. The foundation of the transformer architecture.
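To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention for a single query, the form used in transformers. The vectors are invented toy numbers; real models learn them and work with far larger dimensions and many queries at once.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # How well does the query match each key?
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is a weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data, assumed for illustration only.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([1.0, 0.0], keys, values)
```

The keys that line up with the query receive larger weights, so their values contribute more to the output. That is the sense in which the model "focuses" on part of the input.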
B
Benchmark – A standardised test used to measure and compare AI model performance across specific tasks.
Bias – Systematic errors in AI outputs caused by imbalanced or unrepresentative training data. Can lead to unfair or discriminatory results.
C
Chain of Thought – A prompting technique where you ask the AI to show its reasoning step by step, which often improves accuracy on complex problems.
ChatGPT – A conversational AI product by OpenAI, powered by its GPT series of models. Launched in November 2022, it sparked the current AI boom.
Claude – An AI assistant built by Anthropic, known for strong performance in coding, analysis, and following complex instructions.
Context Window – The maximum amount of text an AI model can take into account at once, covering both the conversation so far and the response being generated. Measured in tokens. Larger windows allow the model to work with more information at once.
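As a rough illustration of how a context window limits what you can send, the sketch below estimates a token count from a word count and checks it against a window size. It relies on the 0.75-words-per-token rule of thumb from the Token entry. Real tokenisers give different counts, and the window sizes used here are hypothetical.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb for English text, not an exact figure

def estimate_tokens(text):
    """Estimate the token count from the number of whitespace-separated words."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)

def fits_in_context(text, context_window, reserved_for_output=0):
    """Check whether the text plus room for a reply fits in the window."""
    return estimate_tokens(text) + reserved_for_output <= context_window

prompt = "Summarise the following report in three bullet points."
estimated = estimate_tokens(prompt)  # 8 words, so about 11 tokens
```

For an exact count, use the tokeniser that belongs to the model you are calling.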
D
Deep Learning – A subset of machine learning that uses multi-layered neural networks. Powers most modern AI including image recognition, language models, and speech synthesis.
Diffusion Model – The technology behind AI image generators like Midjourney and DALL-E. Works by learning to gradually remove noise from random static until a coherent image emerges.
E
Embedding – A numerical representation of text, images, or other data that captures its meaning. Used for search, recommendations, and similarity matching.
Emergent Behaviour – Capabilities that appear in large AI models that were not explicitly programmed or expected. Larger models sometimes develop abilities that smaller ones lack.
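The Embedding entry above says embeddings support similarity matching. A minimal sketch of how that works is to compare two vectors with cosine similarity. The three-dimensional vectors below are made up for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return a score near 1 for vectors pointing the same way, near 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "invoice": [0.1, 0.2, 0.95],
}

def most_similar(word):
    """Find the other word whose embedding is closest to the given one."""
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))

closest = most_similar("cat")
```

With these toy vectors, "kitten" scores much higher against "cat" than "invoice" does, which is the behaviour search and recommendation systems build on.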
F
Few-Shot Learning – Giving an AI model a few examples of what you want in the prompt before asking it to perform the task. Often improves output quality without retraining.
Fine-Tuning – Taking a pre-trained AI model and training it further on specific data to improve performance for a particular task or domain.
Foundation Model – A large AI model trained on broad data that can be adapted for many different tasks. GPT-4, Claude, and Gemini are all foundation models.
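To show what few-shot learning looks like in practice, the sketch below assembles a prompt from a handful of worked examples. The "Review" and "Sentiment" labels and the sample reviews are assumptions chosen for illustration; any consistent format can work.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Place worked examples before the new input so the model can copy the pattern."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # Leave the final label blank for the model to fill in.
    lines.append(f"Review: {new_input}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("Loved it, would buy again.", "positive"),
    ("Broke after two days.", "negative"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each review.",
    examples,
    "Does exactly what it says.",
)
```

Passing an empty list of examples produces the zero-shot version of the same prompt.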
G
Generative AI – AI systems that create new content such as text, images, music, code, or video. The category that includes ChatGPT, Midjourney, and similar tools.
GPT (Generative Pre-trained Transformer) – The family of large language models developed by OpenAI. GPT-4 and GPT-4o are well-known members of the family.
GPU (Graphics Processing Unit) – The hardware that powers AI training and inference. Originally designed for gaming graphics, GPUs turned out to be ideal for the parallel calculations AI requires. NVIDIA dominates this market.
H
Hallucination – When an AI model generates confident-sounding but factually incorrect information. One of the biggest challenges with current language models.
I
Inference – The process of running a trained AI model to generate outputs. When you send a message to ChatGPT and get a response, that is inference.
L
Large Language Model (LLM) – An AI model trained on massive amounts of text data that can understand and generate human language. The models behind ChatGPT, Claude, and Gemini are all LLMs.
LoRA (Low-Rank Adaptation) – A technique for fine-tuning large AI models efficiently by freezing the original weights and training only a small number of added parameters. Makes customisation more accessible and affordable.
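A quick calculation shows why LoRA is cheap. Instead of updating a full weight matrix, LoRA trains two thin matrices whose product has the same shape. The layer size and rank below are assumed values for illustration, not figures from any particular model.

```python
def full_update_params(d_out, d_in):
    """Parameters trained when the whole weight matrix is updated."""
    return d_out * d_in

def lora_update_params(d_out, d_in, rank):
    """Parameters trained with LoRA: one (d_out x rank) and one (rank x d_in) matrix."""
    return rank * (d_out + d_in)

# Hypothetical layer size and rank.
d_out, d_in, rank = 4096, 4096, 8
full = full_update_params(d_out, d_in)        # 16,777,216
lora = lora_update_params(d_out, d_in, rank)  # 65,536
ratio = lora / full                           # 1/256, under half a percent
```

For this one layer, LoRA trains roughly 0.4% of the parameters that full fine-tuning would.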
M
Machine Learning (ML) – A branch of AI where systems learn from data rather than being explicitly programmed. The broader category that includes deep learning.
Mixture of Experts (MoE) – An architecture where a model contains multiple specialised sub-networks and routes each input to the most relevant ones. Allows larger models to run more efficiently.
Multimodal – Describes an AI model that can process or generate multiple types of data such as text, images, audio, and video. GPT-4o and Gemini are multimodal.
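The routing step in a Mixture of Experts model can be sketched in a few lines: a router scores every expert for the current input, and only the top few are run. The scores below are invented and assumed to be positive (for example, the output of a softmax); real routers are learned neural network layers.

```python
def route(router_scores, top_k=2):
    """Pick the top_k experts and renormalise their scores into mixing weights."""
    ranked = sorted(range(len(router_scores)), key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    total = sum(router_scores[i] for i in chosen)
    return {i: router_scores[i] / total for i in chosen}

# Hypothetical router scores for four experts.
mixing_weights = route([0.10, 0.60, 0.05, 0.25], top_k=2)
```

Here only experts 1 and 3 run for this input, which is how a model can hold many parameters while using just a fraction of them per input.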
N
Natural Language Processing (NLP) – The field of AI focused on enabling computers to understand, interpret, and generate human language.
Neural Network – A computing system inspired by the biological neural networks in the brain. The fundamental building block of modern AI.
O
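Open Source – AI models whose code and weights are publicly available for anyone to use, modify, and deploy, subject to their licence terms. Meta's Llama and Mistral's models are prominent examples, though some such releases are more precisely described as open-weight.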
Overfitting – When an AI model performs well on its training data but poorly on new data. Like memorising answers to a test instead of understanding the subject.
P
Parameters – The internal values of an AI model that are adjusted during training. More parameters generally mean more capability but also more computational cost.
Prompt – The text input you provide to an AI model. Prompt quality significantly affects output quality.
Prompt Engineering – The practice of crafting effective prompts to get better results from AI models. Includes techniques like chain-of-thought, few-shot examples, and role-playing.
R
RAG (Retrieval-Augmented Generation) – A technique that combines AI text generation with information retrieval from external sources. Helps reduce hallucinations by grounding outputs in real data.
Reinforcement Learning from Human Feedback (RLHF) – A training method where human reviewers rate AI outputs, and the model learns to produce responses that humans prefer.
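The RAG entry above describes two steps: retrieve relevant text, then hand it to the model alongside the question. The sketch below shows that flow with a deliberately simple retriever that counts shared words. Production systems usually retrieve with embeddings instead, and the documents and prompt wording here are made up for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Put the retrieved text in front of the question before it goes to the model."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical knowledge base.
documents = [
    "A refund is available within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping to Europe takes five working days.",
]
question = "How long after purchase can I get a refund?"
prompt = build_prompt(question, documents)
```

The resulting prompt contains the refund policy but not the unrelated documents, so the model can answer from supplied text rather than from memory.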
S
Scaling Laws – The observed relationship between model size, training data, compute, and performance. Generally, bigger models trained on more data perform better, though this trend may have limits.
Superintelligence – A hypothetical AI system that surpasses human intelligence in virtually every domain. A subject of both research and philosophical debate.
T
Temperature – A setting that controls how random or creative AI outputs are. Low temperature produces more predictable, focused responses. High temperature produces more varied, creative ones.
Token – The basic unit of text that AI models process. Roughly equivalent to 0.75 words in English. Models have token limits for both input and output.
Transformer – The neural network architecture behind virtually all modern large language models. Introduced in the 2017 paper “Attention Is All You Need” by Google researchers.
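To see what the Temperature setting does under the hood, here is a minimal sketch: the model's raw scores for each candidate token are divided by the temperature before being turned into probabilities. The scores below are invented, and real APIs may handle edge cases such as a temperature of zero differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # low temperature
hot = softmax_with_temperature(logits, 2.0)   # high temperature
```

At low temperature nearly all the probability lands on the top-scoring token, giving predictable output. At high temperature the probabilities even out, so sampling picks less likely tokens more often.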
V
Vector Database – A database optimised for storing and searching embeddings. Used to give AI models access to custom knowledge bases.
Z
Zero-Shot Learning – Asking an AI model to perform a task without providing any examples. Modern LLMs can handle many tasks zero-shot thanks to their broad training data.
Missing a term? Let us know and we will add it. This glossary is updated regularly as new concepts emerge in the AI space.
