Your AI Glossary - Every AI Phrase You Should Know

Let's dive into some key words to help you get up to speed in AI conversations. Cele leverages AI to recommend gifts for you and your loved ones, and if you're going to rely on AI for personalized decisions like gift giving, it helps to understand the key AI terms.

Glossary:

Algorithm: A step-by-step procedure or set of rules followed by a computer to solve a specific problem or accomplish a task. Algorithm is an intimidating word, but here is a simple example: tying your shoe. Because you perform the task as a fixed series of steps, it counts as an algorithm.
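
If you'd like to see an algorithm in code, here's a minimal sketch in Python: a handful of explicit steps for picking the longest name from a list. The function and data are invented purely for illustration.

```python
# A tiny algorithm: find the longest name in a list of gift recipients.
# The explicit, repeatable steps are what make it an algorithm.

def longest_name(names):
    longest = names[0]           # Step 1: start with the first name
    for name in names[1:]:       # Step 2: look at each remaining name
        if len(name) > len(longest):
            longest = name       # Step 3: keep whichever is longer
    return longest               # Step 4: report the result

print(longest_name(["Ana", "Christopher", "Lee"]))  # -> "Christopher"
```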

Artificial Intelligence (AI): The development of computer systems that can perform tasks that typically require human intelligence. AI is often split into three groupings: basic AI (using Ctrl+F to find a certain word in an article is a form of basic AI), Machine Learning (ML), and Deep Learning.

Chatbots and Virtual Assistants: ChatGPT is one version of a chatbot and virtual assistant. It relies on NLP, which gives it the ability to understand and respond to natural language queries, helping users with tasks, answering questions, and providing personalized recommendations or customer support.

ChatGPT: An AI model trained by OpenAI; it is a fine-tuned, next-generation version of their earlier GPT-3 family of foundation models (the GPT-3.5 series). At launch, ChatGPT was the fastest tech product ever to reach 100 million users.

Computer Vision: The field of AI that enables computers to understand and interpret visual information from images or videos, supporting tasks like object recognition and image classification.

Collaborative Filtering: An AI technique that groups similar types of people together to recommend products. The whole idea: birds of a feather flock together. If you like a gift, someone with similar interests may also like the same gift. Collaborative filtering is a powerful tool behind most recommendation systems.
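
For the curious, here's a minimal sketch of the "birds of a feather" idea in Python. The users and gift ratings are invented, and real systems are far more sophisticated, but the logic is the same: find your closest "neighbor" and suggest something they loved that you haven't tried.

```python
# Toy collaborative filtering: recommend a gift that a similar user liked.
# The users and ratings below are made up purely for illustration.

ratings = {
    "alex":  {"drone": 5, "candle": 2, "puzzle": 4},
    "blair": {"drone": 5, "candle": 1, "puzzle": 5, "headphones": 5},
    "casey": {"drone": 1, "candle": 5, "puzzle": 2},
}

def similarity(a, b):
    """Count how often two users agree (within 1 point) on gifts they both rated."""
    shared = set(ratings[a]) & set(ratings[b])
    return sum(1 for gift in shared if abs(ratings[a][gift] - ratings[b][gift]) <= 1)

def recommend(user):
    # Find the most similar "bird of a feather"...
    neighbor = max((u for u in ratings if u != user), key=lambda u: similarity(user, u))
    # ...and suggest something they loved that this user hasn't rated yet.
    return [g for g, r in ratings[neighbor].items() if g not in ratings[user] and r >= 4]

print(recommend("alex"))  # -> ['headphones'] (blair is the user most similar to alex)
```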

Content Filtering: Also called content-based filtering, an AI technique that understands how products relate. You buy an umbrella, and we'll recommend rain boots to go with it. At Cele, we use content filtering to understand the relationships between products and the history you may have with a gift recipient. If you buy them a drone for Christmas, we may recommend some related drone accessories for the following birthday.
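
And here's a minimal sketch of content-based filtering using a hypothetical, hand-tagged catalog: recommend items that share attributes with something already purchased. This is only an illustration, not how Cele's production system is built.

```python
# Toy content-based filtering: suggest products that share attributes with a past purchase.
# The catalog and its tags are invented for illustration only.

catalog = {
    "drone":          {"outdoors", "tech", "hobby"},
    "drone battery":  {"tech", "hobby", "accessory"},
    "camera gimbal":  {"tech", "hobby", "accessory"},
    "scented candle": {"home", "relaxation"},
    "hiking boots":   {"outdoors", "apparel"},
}

def related(purchased, top_n=2):
    """Rank other products by how many tags they share with the purchased item."""
    scores = {
        item: len(tags & catalog[purchased])
        for item, tags in catalog.items()
        if item != purchased
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(related("drone"))  # -> ['drone battery', 'camera gimbal']
```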

Deep Learning: A specific type of machine learning that uses neural networks with multiple layers to process complex data and extract patterns. ChatGPT is one example of deep learning. Oftentimes, Deep Learning refers to AI with millions, even billions, of parameters trained on enormous amounts of data. Similar to ML, it will "predict" the next word in a sentence or the next pixel in an image based on that huge pool of training data. Deep Learning is the most sophisticated form of AI.

Diffusion: A technique in AI that allows models to generate high-quality content (often images) by starting with random noise (think of a screen full of static) and gradually refining it based on learned patterns and semantics. The model slowly transforms that noise into something more structured by making many tiny changes, kind of like gradually developing a blurry photo into a clear image. Each small change brings the noise closer to a realistic output, until it finally looks like a clear photo, sentence, etc. The key is that it happens incrementally through many small diffusion steps, not all at once.
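
To make the "many tiny steps" idea concrete, here's a toy sketch in Python. It is not a real diffusion model (the "clean" target is hard-coded, whereas a real model learns how to denoise), but it shows the incremental refinement from noise to structure.

```python
import random

# Toy illustration of the diffusion idea: start from pure noise and refine it
# in many small steps. The target pattern is hard-coded purely for illustration;
# real diffusion models learn how to denoise from training data.

target = [0.0, 0.5, 1.0, 0.5, 0.0]            # the "clean image" we want
noise = [random.random() for _ in target]      # start: pure random noise

for step in range(50):
    # Each step nudges the noisy values a small fraction toward the target,
    # like slowly developing a blurry photo into a clear one.
    noise = [n + 0.1 * (t - n) for n, t in zip(noise, target)]

print([round(v, 2) for v in noise])  # very close to [0.0, 0.5, 1.0, 0.5, 0.0]
```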

Embedding: Usually refers to representing a data point within a vector database. More often than not, this means words or language that must be transformed into numbers for LLMs to read and understand. Although LLMs like ChatGPT seem to know how to read and write, they're really only doing math. Vectors are a way to place really nuanced data on a graph, using math to measure how far the points are from one another; datapoints that are close have similar meanings. One example is longitude and latitude: 2-dimensional vectors that let simple math show New York is close to Philadelphia, so Philadelphia is likely more similar to New York than Buenos Aires is. In AI, those vectors have dozens or hundreds of dimensions.
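
Here's a minimal sketch of that distance idea using the longitude/latitude example, with rough, illustrative coordinates. Real embeddings work the same way, just with hundreds of dimensions instead of two.

```python
import math

# Toy embeddings: 2-dimensional points (roughly latitude and longitude).
# Real word or product embeddings behave the same way, just in many more dimensions.
points = {
    "New York":     (40.7, -74.0),
    "Philadelphia": (39.9, -75.2),
    "Buenos Aires": (-34.6, -58.4),
}

def distance(a, b):
    """Straight-line distance between two embedded points: smaller = more similar."""
    return math.dist(points[a], points[b])

print(round(distance("New York", "Philadelphia"), 1))  # small -> very similar
print(round(distance("New York", "Buenos Aires"), 1))  # large -> less similar
```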

Explainable AI (XAI): The development of AI systems that can provide transparent explanations or justifications for their decisions and predictions, enabling better understanding and trust from users.

Fine-tuning: The process of taking an existing AI model and training it on new information to adapt it to a new task. It's most commonly used with LLMs, which are trained as generalist models. By fine-tuning, one could take a generalist that understands language really well and focus it on doing one specific task well, while keeping its language understanding.

Foundation Model: A large AI model with general or broad training data. Foundation models are often decent at many tasks, but should be fine-tuned to be strong at specific ones. ChatGPT, Bard, and Claude 2 are all built on foundation models: good for basic questions, but not strong at specialized tasks. For example, if you were to ask ChatGPT for gift ideas, you'd likely get the same general answer as those bland online listicles.

Generative AI: One form of deep learning AI that is designed to build original or novel content.

GPT: Stands for Generative Pretrained Transformer. The approach was developed by OpenAI, although many other firms now use the term GPT for LLMs trained in a similar fashion.

GPU: Stands for Graphics Processing Unit and is a special type of microchip or microprocessor that is highly efficient for AI training. Nvidia is a household name due to their dominance in the GPU market.

Hallucination: Usually refers to output from an LLM that is either completely wrong or complete gibberish. LLMs don't actually know anything; they're just using math to make predictions, and those predictions can be wrong, hence an output that is not based in reality. You should never fully trust any generative AI model if being wrong is a concern. Sometimes, like in creative writing, hallucinations can act as creativity. So not all hallucinations are bad... but a confidently wrong answer certainly is.

Inference: Often refers to the output of an AI model, usually used in the context of an LLM's answer.

Language Translation: NLP powers machine translation systems that can automatically translate text from one language to another. Services like Google Translate and Microsoft Translator utilize NLP techniques to provide real-time translation, bridging language barriers and facilitating global communication.

Large Language Model (LLM): A type of generative AI model that can create language. ChatGPT is one example of an LLM. The key to using and understanding LLMs... they're trained on language, not math, not reasoning.

Machine Learning (ML): A subset of AI that enables computers to learn and make predictions from data without being explicitly programmed. For example, if you watch a video with captions, oftentimes those captions are made with an ML program. The program is built to review existing audio and transcripts and understand the relationships between tone, pitch, volume, etc. and the words that correspond to that audio data. From there, it can "predict" what word should be written next in the transcription.
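
As a toy version of "learning to predict from data", here's a sketch that counts which word tends to follow another in a made-up snippet of text and then predicts the next word. Real ML models are vastly more capable, but the learn-from-examples idea is the same.

```python
from collections import Counter, defaultdict

# Toy machine learning: "learn" which word tends to follow another by counting
# examples, then predict the next word. The training text is made up.

training_text = "happy birthday to you happy birthday dear friend happy holidays to all"

follow_counts = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follow_counts[current][nxt] += 1      # count what follows each word

def predict_next(word):
    """Predict the word most often seen after `word` in the training data."""
    return follow_counts[word].most_common(1)[0][0]

print(predict_next("happy"))     # -> 'birthday' (seen twice vs. 'holidays' once)
print(predict_next("birthday"))  # -> 'to' (or 'dear'; both were seen once)
```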

Multimodal: Also referred to as multimodality; describes a generative AI model that can work with multiple types of content, usually text and images. Multimodal models are likely the next frontier: think of a ChatGPT that not only writes the slogan for your Instagram ad, but designs the ad too!

Named Entity Recognition (NER): NER algorithms can identify and classify named entities, such as names of people, organizations, locations, or dates, within a given text. This application is useful in information extraction, document indexing, and entity-based search systems.

Natural Language Processing (NLP): The field of AI that focuses on enabling computers to understand and interact with human language, such as speech recognition and language translation. NLP sounds complicated, but it's simply a way to turn English (for example) into the 0s and 1s a computer can understand.
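
Here's a minimal sketch of that first "English into numbers" step, using a made-up sentence: each unique word gets an ID, so the sentence becomes a list of numbers a computer can work with.

```python
# Toy illustration of turning English into numbers (a first step in most NLP).
# Each unique word gets an ID, so a sentence becomes a list of integers.

sentence = "the perfect gift for the perfect friend"

vocab = {}
token_ids = []
for word in sentence.split():
    if word not in vocab:
        vocab[word] = len(vocab)      # assign the next unused ID to a new word
    token_ids.append(vocab[word])

print(vocab)       # {'the': 0, 'perfect': 1, 'gift': 2, 'for': 3, 'friend': 4}
print(token_ids)   # [0, 1, 2, 3, 0, 1, 4]
```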

Neural Networks: Computational models inspired by the human brain, composed of interconnected nodes (neurons) used for learning and pattern recognition. As with the human brain, we don't fully understand how neural networks work internally. I like to think of them like the brain of a baby: a baby needs to be "trained" on what to do, and its brain cells rewire themselves to learn. For example, a baby does not know it should not touch a hot stove until it is told or finds out for itself. Neural networks learn in a similar fashion.
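
Here's a minimal sketch of a single artificial "neuron" with invented weights, just to show the arithmetic: multiply inputs by weights, add them up, and squash the result to a value between 0 and 1. Training is the process of adjusting those weights until the outputs become useful.

```python
import math

# One artificial "neuron": multiply each input by a weight, sum, and squash
# the result to a value between 0 and 1. The weights below are invented;
# training adjusts them automatically until the outputs become useful.

def neuron(inputs, weights, bias):
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))   # sigmoid "activation"

# Toy inputs might encode "recipient likes tech" and "recipient prefers low-key gifts".
print(round(neuron([1.0, 0.0], weights=[2.0, -1.0], bias=-0.5), 2))  # ~0.82
print(round(neuron([0.0, 1.0], weights=[2.0, -1.0], bias=-0.5), 2))  # ~0.18
```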

Overfitting: A negative outcome of training AI models, and one that scientists are keenly aware of as they build. It occurs when a model performs well on training data but fails to generalize to new, unseen data, often due to excessive complexity or memorization of noise in the training set. Overfitting is mainly a concern for those building AI, because it leads to a model that makes too many errors on real-world data.

Parameter: A variable within a neural network. A parameter usually refers to a weight: the strength of the connection between one 'neuron' and another.

Question Answering Systems: NLP-powered question answering systems aim to provide accurate answers to user questions based on large amounts of text data. Such systems are utilized in virtual assistants, search engines, and customer support chatbots.

Recommendation Systems: AI algorithms that provide personalized suggestions or recommendations based on user preferences and behavior, commonly used in e-commerce and content platforms. Spotify's next best song recommendation is an example of a recommendation system in action.

Reinforcement Learning: A machine learning approach in which an agent learns by receiving rewards or punishments based on its actions. The idea is that if you give an agent an incentive to maximize its rewards, it'll act the way you intend. ChatGPT, which was refined with reinforcement learning from human feedback (RLHF), is a famous example.
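
Here's a minimal sketch of the reward loop with made-up rewards: an "agent" picks between two actions, receives noisy feedback, and gradually shifts toward the action that pays off most. This is a toy, not how ChatGPT's RLHF actually works.

```python
import random

# Toy reinforcement learning: an agent tries two actions, receives rewards,
# and learns to prefer the one that pays off more. The reward values are made up.

rewards = {"recommend_socks": 0.2, "recommend_drone": 0.9}    # hidden from the agent
estimates = {"recommend_socks": 0.0, "recommend_drone": 0.0}  # what the agent believes

for step in range(200):
    # Mostly pick the action believed to be best, but explore 10% of the time.
    if random.random() < 0.1:
        action = random.choice(list(estimates))
    else:
        action = max(estimates, key=estimates.get)
    reward = rewards[action] + random.uniform(-0.1, 0.1)      # noisy feedback
    # Nudge the estimate for that action toward the observed reward.
    estimates[action] += 0.1 * (reward - estimates[action])

print(max(estimates, key=estimates.get))  # almost always -> 'recommend_drone'
```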

Sentiment Analysis: NLP algorithms can analyze text data, such as customer reviews or social media posts, to determine the sentiment expressed (positive, negative, or neutral). This analysis is valuable for businesses looking to understand public opinion, monitor brand sentiment, and improve customer satisfaction. Through their understanding of human language, NLP systems can often predict how someone feels based on their text, sometimes even with sarcastic comments.
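
A very rough sketch of the idea, using a tiny invented word list: count positive and negative words to guess the overall feeling. Real sentiment systems use far richer language understanding (and still stumble on sarcasm).

```python
# Toy sentiment analysis: score text by counting words from tiny, invented
# positive/negative word lists. Real systems use much richer language models.

positive = {"love", "great", "perfect", "wonderful"}
negative = {"hate", "terrible", "broken", "disappointing"}

def sentiment(text):
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love this perfect gift"))                  # -> positive
print(sentiment("the drone arrived broken and disappointing"))  # -> negative
```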

Supervised Learning: A machine learning approach in which the model is trained using labeled examples, where the input data is paired with the corresponding correct output. Supervised learning is great when we know what data we're working with. For example, if I were a chef and wanted to know the best way to make a fried egg, I'd make it in a nonstick pan and a cast-iron pan, on high heat and on low heat, and ask my customers to rate every variation on a 1-5 scale. Each attempt would have its "variables" and its grade, and a supervised learning system could identify which grouping of variables led to the best grade.
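
Here's a minimal sketch of that fried-egg experiment as data, with invented ratings. Each attempt pairs its "variables" (the inputs) with a grade (the label); this toy "model" just averages the grades per combination, whereas a real supervised model would learn to generalize to new combinations.

```python
from collections import defaultdict

# Toy supervised learning: each fried-egg attempt is labeled with the settings
# used (the inputs) and the customer's rating (the correct output). The ratings
# below are invented. A real model would generalize; here we simply average.

attempts = [
    ({"pan": "nonstick",  "heat": "high"}, 3),
    ({"pan": "nonstick",  "heat": "low"},  4),
    ({"pan": "cast-iron", "heat": "high"}, 2),
    ({"pan": "cast-iron", "heat": "low"},  5),
    ({"pan": "cast-iron", "heat": "low"},  4),
]

totals = defaultdict(lambda: [0, 0])          # (sum of ratings, count) per setting combo
for settings, rating in attempts:
    key = (settings["pan"], settings["heat"])
    totals[key][0] += rating
    totals[key][1] += 1

best = max(totals, key=lambda k: totals[k][0] / totals[k][1])
print(best)   # -> ('cast-iron', 'low'): the combination with the best average grade
```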

Text Classification: NLP algorithms can classify text documents into predefined categories or labels. This application is valuable for tasks such as spam email detection, sentiment classification, news categorization, and content moderation.

Text Summarization: NLP algorithms can automatically generate concise summaries of long texts, such as news articles, research papers, or legal documents. This application is valuable for quickly extracting key information and aiding in information retrieval and decision-making processes.

Text-to-Speech (TTS) Systems: NLP enables the conversion of written text into spoken words, making it possible to create natural-sounding synthesized speech. TTS systems find applications in accessibility, language learning, audiobook production, and voice-overs for digital assistants or navigation systems.

Training Data: A set of examples or data used to train an AI system or machine learning model. It helps the model learn patterns and make accurate predictions.

Transfer Learning: A technique in machine learning where knowledge gained from solving one task is applied to a different but related task, often resulting in improved performance and reduced training time.

Unsupervised learning: A machine learning approach in which the model understands the data based only on the patterns it sees. Because there are no labels to help the machine learn, it often only works with massive amounts of data, which is one reason it pairs naturally with deep learning.
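
Here's a minimal sketch of the pattern-finding idea (in the spirit of k-means clustering) with made-up numbers: the code is never told what the groups are, yet it splits gift budgets into a "low" and a "high" cluster on its own.

```python
# Toy unsupervised learning: split numbers into two groups without any labels.
# The data points (say, gift budgets in dollars) are invented for illustration.

budgets = [12, 15, 9, 14, 180, 220, 11, 205, 195]

# A very simple two-cluster split: start with two guesses and repeatedly
# reassign each point to the nearer guess, then update the guesses.
centers = [min(budgets), max(budgets)]
for _ in range(10):
    low  = [b for b in budgets if abs(b - centers[0]) <= abs(b - centers[1])]
    high = [b for b in budgets if abs(b - centers[0]) >  abs(b - centers[1])]
    centers = [sum(low) / len(low), sum(high) / len(high)]

print(sorted(low))    # -> [9, 11, 12, 14, 15]   (the "everyday gifts" cluster)
print(sorted(high))   # -> [180, 195, 205, 220]  (the "big-ticket gifts" cluster)
```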

AI is a constantly evolving space, and we'll revisit this glossary frequently. At Cele, we use many of these concepts to recommend you the best, most personalized gifts possible.