How LLMs work: finding out with a 3D graphical representation

The LLMs (Large Language Models) behind a chatbot like ChatGPT, or any similar product, carry out a series of processing steps before returning a response to the user. This computation can be rather difficult to understand, so the authors of the LLM Visualization project have made visible what until recently seemed hard to even picture.

LLM Visualization is an entirely web-based tool that shows in 3D which parameters the model stores internally and which calculations it performs.

Starting from a simple model, Nano-GPT, to understand how LLMs work

Nano-GPT is a scaled-down version of the GPT model (Generative Pre-trained Transformer), widely used by OpenAI and now in its fourth generation (GPT-4), with further advances expected. This more compact model generates text in a similar way to GPT and is suitable for natural language generation tasks, albeit on a small scale.

Clicking the Continue button in the left column of LLM Visualization starts a step-by-step guide to how a model like Nano-GPT works. Pressing the space bar advances from paragraph to paragraph, while the graphic representation in the right panel updates in real time.

Explore the structure of the model, graphically

By placing the pointer on a specific element of the 3D model, you can see which structure the object belongs to, select it, and check its number of rows and columns. You can also inspect the calculation formulas and the corresponding results. More complex structures can be examined more closely by zooming in.

Visual representation of LLM AI models

The first time you open LLM Visualization, the web application shows the structure of the Nano-GPT model which, in the form proposed here, consists of just 85,000 parameters. Try selecting GPT-2 (small) as the model, which is made up of over 124 million parameters, and see what happens. Then try GPT-3, which is based on approximately 175 billion parameters.

The example chosen to understand how LLMs work

A generative model like Nano-GPT helps to better appreciate the “fundamentals” of LLMs. The model proposed by LLM Visualization has the task of putting a sequence of six letters into alphabetical order: CBABBC. Each letter is a token. Tokens are the individual units of the sequence, and the set of distinct tokens makes up the model's vocabulary. In this case, for simplicity, the vocabulary consists only of the tokens A, B and C, with corresponding indices 0, 1 and 2.

The sequence of letters is therefore represented numerically by the indices of the corresponding tokens: 2 1 0 1 1 2. These numbers are what is fed into the model as input.
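The mapping from letters to indices can be sketched in a few lines of Python. This is only an illustration: the names VOCAB, encode and decode are hypothetical and are not taken from the LLM Visualization project.

```python
# Hypothetical vocabulary matching the article: A -> 0, B -> 1, C -> 2.
VOCAB = {"A": 0, "B": 1, "C": 2}


def encode(text):
    """Turn a string of letters into a list of token indices."""
    return [VOCAB[ch] for ch in text]


def decode(ids):
    """Turn a list of token indices back into a string of letters."""
    index_to_letter = {index: letter for letter, index in VOCAB.items()}
    return "".join(index_to_letter[i] for i in ids)


tokens = encode("CBABBC")
print(tokens)          # [2, 1, 0, 1, 1, 2]
print(decode(tokens))  # CBABBC
```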

The model uses an operation called “embedding” to turn each number into a vector of 48 elements. This representation is then passed through a series of layers, called “transformers”, that make up the model.
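At its simplest, an embedding is a table lookup: one row of 48 numbers per vocabulary entry. The sketch below assumes a randomly initialized table, whereas a trained model would have learned these values. It also leaves out the positional information that real models add. The names embedding_table and embed are hypothetical.

```python
import random

VOCAB_SIZE = 3       # tokens A, B and C
EMBEDDING_SIZE = 48  # vector length given in the article

# Assumption: random values stand in for weights a real model would learn.
rng = random.Random(0)
embedding_table = [
    [rng.gauss(0.0, 0.02) for _ in range(EMBEDDING_SIZE)]
    for _ in range(VOCAB_SIZE)
]


def embed(tokens):
    """Look up the 48-element vector for each token index."""
    return [embedding_table[t] for t in tokens]


embedded = embed([2, 1, 0, 1, 1, 2])
print(len(embedded), len(embedded[0]))  # 6 48
```

Identical tokens always map to the same vector, so the second and fourth positions (both B) receive the same 48 numbers.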

The ultimate goal of the model is to predict the next token in the sequence. By placing that prediction back at the top of the model and repeating the process, the model goes on to predict the following letter, extending the sequence by one token at each iteration.
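The feedback loop itself can be sketched as follows. To keep the example self-contained, predict_next is a hypothetical stand-in that hard-codes the sorting rule. It is not a neural network and not how Nano-GPT computes its output. Only the loop in generate reflects the process described above.

```python
PROMPT_LEN = 6  # the six input letters, e.g. CBABBC


def predict_next(tokens):
    """Stand-in for the trained model (assumption: a hard-coded rule).

    Returns the next token of the sorted prompt, based on how many
    tokens have already been generated after the prompt.
    """
    prompt = tokens[:PROMPT_LEN]
    generated_so_far = len(tokens) - PROMPT_LEN
    return sorted(prompt)[generated_so_far]


def generate(prompt, steps):
    """Repeatedly predict one token and append it to the sequence."""
    tokens = list(prompt)
    for _ in range(steps):
        next_token = predict_next(tokens)  # prediction from the full context
        tokens.append(next_token)          # feed the prediction back in
    return tokens


result = generate([2, 1, 0, 1, 1, 2], steps=6)
print(result[PROMPT_LEN:])  # [0, 1, 1, 1, 2, 2], i.e. ABBBCC
```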

Opening image credit: Chandaeng

