AI Glossary Reference Page

  • Writer: Jenny Kay Pollock
  • Oct 29
  • 5 min read

A Glossary of Common AI Terms & Concepts by Tamara Gracon, Joanna Ridgeway, Paula Fontana, Jenny Kay Pollock and Reut Lazo.



Here’s a quick guide to help you stay current in the AI era. From AGI to LLM, we’ve got you covered with clear, simple definitions of the terms you’ll hear most often in today’s conversations about artificial intelligence.



A - C

Agentic AI – AI that can set and work toward goals on its own, making decisions and adapting with little or no human direction.


Algorithm – A set of defined steps used by a model to solve a problem or perform a task.


Artificial General Intelligence (AGI) – A hypothetical form of AI capable of performing any intellectual task a human can do (also called General AI or Strong AI).


Artificial Intelligence (AI) – Technology that can perform tasks requiring thinking, learning, or decision-making, sometimes in ways that go beyond what humans can do.


Backward Chaining – A method that starts with a goal and works backward to find supporting data.


Bias – When an AI system produces results that are unfair or skewed because of problems in the data, design, or training process.


Big Data – Large or complex datasets that traditional data processing tools cannot handle.


Bounding Box – An imaginary rectangle used in image tagging to identify and label objects.


Chatbot – A software tool that simulates conversation with users, typically through text or voice.


Chaining – Connecting AI steps or systems so the output of one becomes the input for the next, creating a longer process or workflow.
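
For example (a toy sketch with made-up steps, not any particular framework), the output of a summarization step can become the input of a translation step:

```python
# Hypothetical chained steps; in practice each could be an LLM call or an API.
def summarize(text):
    return text.split(".")[0] + "."   # keep only the first sentence

def translate_to_spanish(text):
    return f"[ES] {text}"             # stand-in for a real translation step

article = "AI adoption is growing quickly. Teams are hiring prompt engineers."

# Chaining: the output of one step is the input of the next.
summary = summarize(article)
result = translate_to_spanish(summary)
print(result)
```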


Cognitive Computing – A marketing-friendly term for AI, emphasizing human-like reasoning.


Computational Learning Theory – A field focused on analyzing and developing machine learning algorithms.


Corpus – A large collection of text used to train models in natural language tasks.



D - G

Data Design and Training – The process of selecting, organizing, and preparing data to train ML models effectively, ensuring it represents the problem domain and supports accurate learning. It includes tasks like data labeling, feature engineering, and data splitting.


Data Lake – A centralized repository that stores large volumes of structured, semi-structured, and unstructured data in its raw form, enabling flexible access and analysis for AI and analytic applications.


Data Mining – Analyzing large datasets to discover patterns that can inform model behavior.


Data Science – An interdisciplinary field combining statistics, programming, and domain knowledge to interpret data.


Dataset – A structured collection of data used in training, validation, or testing.


Deep Learning (DL) – A type of machine learning that uses many layers of processing to find patterns in data, loosely inspired by how the brain works.


Embeddings – Vector representations of words or data used to capture relationships in models.
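
As a rough illustration (a minimal sketch with made-up three-dimensional vectors; real embeddings usually have hundreds of dimensions), related words end up with vectors that point in similar directions:

```python
import numpy as np

# Hypothetical 3-dimensional embeddings; real models use far more dimensions.
embeddings = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.78, 0.70, 0.12]),
    "apple": np.array([0.10, 0.20, 0.90]),
}

def cosine_similarity(a, b):
    """How closely two vectors point in the same direction (1.0 = identical)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high (related words)
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low (unrelated words)
```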


Entity Annotation – Tagging parts of text (e.g., names, places) to help models understand context.


Entity Extraction – Structuring data by identifying entities such as people, places, or objects.


Forward Chaining – A reasoning method that moves from known data to possible conclusions.
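
As a rough sketch (a hypothetical toy rule set, not a specific library), forward chaining repeatedly applies rules to known facts until no new conclusions appear; backward chaining would instead start from a goal and look for rules that support it.

```python
# Toy forward-chaining example: each rule maps a set of required facts to a conclusion.
rules = [
    ({"has_feathers", "lays_eggs"}, "is_bird"),
    ({"is_bird", "can_fly"}, "can_migrate"),
]

facts = {"has_feathers", "lays_eggs", "can_fly"}

# Keep applying rules until no rule adds a new fact.
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)  # now includes "is_bird" and "can_migrate"
```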


Generative AI – AI that creates new content such as text, images, or music based on training data.


General AI – A hypothetical form of AI capable of performing any intellectual task a human can do (also called Strong AI or Artificial General Intelligence, AGI).



H - L

Hallucinations – Instances where AI generates information that is plausible-sounding but incorrect or fabricated.


Hyperparameter – A value set outside the model that affects how the model learns during training.
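
For example (a minimal sketch using scikit-learn, one common library), max_depth below is a hyperparameter set before training, while the tree's split thresholds are parameters learned from the data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth is a hyperparameter: chosen by us before training starts.
model = DecisionTreeClassifier(max_depth=3)

# The tree's split thresholds are parameters: learned from the data during training.
model.fit(X, y)

print(model.get_depth())  # the learned tree respects the hyperparameter we set
```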


Intent – The underlying goal behind a user’s input, used especially in chatbots and NLP.


Label – The correct output associated with input data, used in supervised learning.


Large Language Model (LLM) – A neural network trained on massive text datasets to generate human-like language.


Linguistic Annotation – Marking up text with information such as sentence structure or parts of speech.



M - N

Machine Intelligence – A general term encompassing all types of AI, machine learning (ML), and deep learning (DL) systems.


Machine Learning (ML) – A subset of AI that allows computers to learn from data and improve over time.


Machine Translation – Automatic translation of text between languages using algorithms.


Model – The output of a training process, used to make predictions or decisions.


Natural Language Generation (NLG) – The process of converting data into human-readable text or speech.


Natural Language Processing (NLP) – A field of AI focused on enabling machines to understand and generate human language.


Natural Language Understanding (NLU) – A branch of NLP focused on interpreting the intent, meaning, and context behind human language inputs.


Neural Network – A model inspired by the human brain, used especially in deep learning applications.



O - R

Overfitting – A model’s tendency to learn the training data too well, resulting in poor performance on new data.


Parameter – Internal variables in a model that are learned during training.


Pattern Recognition – Identifying recurring patterns in data to inform AI decisions.


Predictive Analytics – Using historical data and ML to predict future events or trends.


Prompt Engineering – Crafting effective inputs (prompts) to guide LLMs in producing desired outputs.
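
As a simple illustration (the wording here is only an example, not a recommended template), a more specific prompt usually steers an LLM toward a more useful answer:

```python
# A vague prompt versus a more carefully engineered one for the same task.
vague_prompt = "Tell me about our sales."

engineered_prompt = (
    "You are a financial analyst. Summarize the quarterly sales figures below "
    "in three bullet points, highlight the largest change from the previous quarter, "
    "and keep the summary under 80 words.\n\n"
    "Sales data:\n{sales_data}"
)

# The placeholder is filled in before the prompt is sent to an LLM.
print(engineered_prompt.format(sales_data="Q1: $1.2M, Q2: $1.5M, Q3: $1.1M"))
```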


Python – A widely used programming language in AI development for its simplicity and flexibility.


Reinforcement Learning – A learning method where models learn through trial and error, based on reward signals.



S - T

Semantic Annotation – Tagging data with meaning to improve search or classification tasks.


Sentiment Analysis – Detecting emotions, attitudes, or opinions in a piece of text.


Strong AI – AI with reasoning abilities equivalent to human cognition (also known as general AI or AGI).


Supervised Learning – Training models using labeled data to predict outcomes.
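
A minimal sketch (using scikit-learn as one common option): the model is shown inputs paired with their correct labels, then predicts labels for new inputs:

```python
from sklearn.linear_model import LogisticRegression

# Labeled training data: hours studied (input) and pass/fail (label).
X_train = [[1], [2], [3], [8], [9], [10]]
y_train = [0, 0, 0, 1, 1, 1]  # 0 = fail, 1 = pass

model = LogisticRegression()
model.fit(X_train, y_train)   # learn from the labeled examples

print(model.predict([[7]]))   # predict the label for an unseen input
```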


Test Data – Unseen data used to evaluate a model's performance.


Training Data – Labeled data used to teach a model how to make predictions.


Transfer Learning – Using knowledge gained from one task to improve performance on another.


Turing Test – A test to determine if a machine's behavior is indistinguishable from that of a human.



U - Z

Unsupervised Learning – Training models on data without labels to uncover hidden structures or patterns.


Validation Data – Data used to tune hyperparameters and check for overfitting during training.
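
To see how training, validation, and test data fit together (a minimal sketch using scikit-learn's train_test_split, one common approach):

```python
from sklearn.model_selection import train_test_split

X = list(range(100))       # stand-in for 100 examples
y = [i % 2 for i in X]     # stand-in labels

# First split off a test set, then split the remainder into training and validation sets.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 training, 20 validation, 20 test
```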


Variance – How much a model’s predictions change when the training data changes; high variance can make a model unreliable and is often linked to overfitting.


Variation – Different ways a person might say the same thing, used to help chatbots understand natural conversation.




Keep this glossary handy as you continue exploring how AI is reshaping the way we work, build, and lead.

