We take a deep dive into the inner workings of the wildly popular AI chatbot, ChatGPT. If you want to know how its generative AI magic happens, read on.
Back in the day (and by “in the day,” I mean late 2022, before AI chatbots exploded on the scene), tools like Google and Wolfram Alpha interacted with users via a single-line text entry field and provided text results. Google returned search results — a list of web pages and articles that would (hopefully) provide information related to the search queries. Wolfram Alpha generally provided answers that were mathematical and data analysis-related.
ChatGPT, by contrast, provides a response based on the context and intent behind a user’s question. Google, of course, has changed up its response mode. It now provides AI-based responses before search results, and it’s likely to continue to do so. Wolfram Alpha, on the other hand, uses AI behind the scenes to help it with its calculations but does not provide AI-based answers.
Fundamentally, Google’s searching power is its ability to do enormous database lookups and provide a series of matches. Wolfram Alpha’s power is its ability to parse data-related questions and perform calculations.
ChatGPT’s power (and that of almost any other AI chatbot, like Claude, Copilot, Perplexity, and Google Gemini) is the ability to parse queries and produce fully fleshed-out answers and results based on most of the world’s digitally accessible text-based information. Some chatbots are limited by a knowledge cutoff, the date at which their training data stops, but most can now access the live Internet to factor current data into their answers.
In this article, we’ll see how ChatGPT can produce those fully fleshed-out answers using a technology called generative artificial intelligence. We’ll start by looking at the main phases of ChatGPT operation, then cover some core AI architecture components that make it all work.
The two main phases of ChatGPT operation
Let’s use Google Search (as distinguished from Google Gemini AI) as an analogy again. When you ask Google Search to look up something, you probably know that it doesn’t go out and scour the entire web for answers at the moment you ask. Instead, Google searches its database for pages that match that request. Google Search has two main phases: the spidering and data-gathering phase, and the user interaction/lookup phase.
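To make that two-phase idea concrete, here’s a rough Python sketch of a toy keyword index (my own illustration, nothing like Google’s actual implementation): an indexing phase that maps words to documents ahead of time, and a lookup phase that answers queries from that prebuilt index instead of rescanning the web.

```python
# Toy two-phase search: build an index once, answer queries from it later.
# This is an illustrative sketch, not how Google actually works.
from collections import defaultdict

documents = {
    1: "how to reset a password on our website",
    2: "wolfram alpha answers math and data questions",
    3: "chatgpt generates answers from a language model",
}

# Phase 1: spidering/indexing -- map each word to the documents containing it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.split():
        index[word].add(doc_id)

# Phase 2: lookup -- answer a query from the prebuilt index, not by rescanning.
def search(query: str) -> set:
    results = [index[w] for w in query.lower().split() if w in index]
    return set.intersection(*results) if results else set()

print(search("reset password"))   # {1}
```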
Roughly speaking, ChatGPT and the other AI chatbots work the same way. The data-gathering phase is called pre-training, while the phase in which the chatbot responds to users is known as inference. The magic behind generative AI, and the reason it has exploded, is that the way pre-training works has proven to be enormously scalable. That scalability has been made possible by recent innovations in affordable hardware technology and cloud computing.
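Here’s a deliberately tiny sketch of those two phases (again, my own illustration, not OpenAI’s code). “Pre-training” is reduced to counting which word follows which in a small corpus, and “inference” to sampling from those counts to generate text. Real models learn billions of neural-network parameters rather than a count table, but the split between an expensive, one-time training phase and a cheap, per-query inference phase is the same.

```python
# Minimal illustration of the two phases: a word-bigram "language model".
# Pre-training: learn statistics from a corpus (done once, offline).
# Inference: use the learned statistics to generate a response (per query).
import random
from collections import defaultdict, Counter

corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Pre-training phase: count which word follows which.
model = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    model[current_word][next_word] += 1

# Inference phase: generate text by repeatedly sampling a likely next word.
def generate(prompt: str, length: int = 8) -> str:
    word = prompt
    output = [word]
    for _ in range(length):
        followers = model.get(word)
        if not followers:
            break
        word = random.choices(list(followers), weights=followers.values())[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))   # e.g. "the cat sat on the mat . the dog"
```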
How pre-training AI works
Generally speaking (because getting into specifics would take volumes), AIs pre-train using two principal approaches: supervised and unsupervised. Until the current crop of generative AI systems like ChatGPT, most AI projects used the supervised approach.
Supervised pre-training is a process in which a model is trained on a labelled dataset, meaning each input is paired with a corresponding output.
For example, an AI could be trained on a dataset of customer service conversations, where the user’s questions and complaints are labelled with the appropriate responses from the customer service representative. To train the AI, questions like, “How can I reset my password?” would be provided as user input, and answers like, “You can reset your password by visiting the account settings page on our website and following the prompts,” would be provided as output.
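In code, such a labelled dataset is nothing more than a collection of input/output pairs. The entries below are made up for illustration:

```python
# A hypothetical labelled dataset for supervised training:
# each input (a customer question) is paired with the desired output.
training_data = [
    {
        "input": "How can I reset my password?",
        "output": "You can reset your password by visiting the account "
                  "settings page on our website and following the prompts.",
    },
    {
        "input": "Where can I see my past orders?",
        "output": "Your order history is available under 'Orders' in your "
                  "account dashboard.",
    },
]
```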
In a supervised approach, the model is trained to learn a mapping function that can accurately map inputs to outputs. This is the process behind classic supervised learning tasks such as classification, regression, and sequence labelling.
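To show what “learning a mapping function” looks like in practice, here’s a toy classifier written from scratch (a simplified sketch of my own, not anything ChatGPT actually uses). It learns weights that map customer questions to intent labels such as “password” or “billing”, a small classification task of the kind described above.

```python
# Toy supervised classifier: learn a mapping from questions to intent labels.
# A simplified perceptron over bag-of-words features; real systems use far
# larger models, but the idea is the same: fit parameters so that inputs
# map to the right outputs.
from collections import defaultdict

labelled_examples = [
    ("how can i reset my password", "password"),
    ("i forgot my password", "password"),
    ("why was my card charged twice", "billing"),
    ("i need a copy of my invoice", "billing"),
]

labels = ["password", "billing"]
weights = {label: defaultdict(float) for label in labels}

def score(text: str, label: str) -> float:
    return sum(weights[label][word] for word in text.split())

def predict(text: str) -> str:
    return max(labels, key=lambda label: score(text, label))

# Training: nudge weights whenever the model maps an input to the wrong output.
for _ in range(10):                      # a few passes over the data
    for text, gold in labelled_examples:
        guess = predict(text)
        if guess != gold:
            for word in text.split():
                weights[gold][word] += 1.0
                weights[guess][word] -= 1.0

print(predict("please reset my password"))   # expected: "password"
print(predict("question about my invoice"))  # expected: "billing"
```

After a few passes, the learned weights settle down and the model maps unseen questions to the right label, which is all “learning a mapping function” means in this simplified setting.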