With the rise in popularity of applications such as ChatGPT, it has become important to understand what a Large Language Model (LLM) is and, if you’re in business, how to use one as a tool. This article looks at what an LLM is, when a question is “LLMable”, and how an LLM works.
What is an LLM?
What is a Large Language Model (LLM)? This is the ideal question to ask an LLM. I asked ChatGPT (an example of an LLM chatbot) to answer this question for me, and this is the response I received:
“A large language model is an AI system trained on vast text data to understand and generate human language. It’s capable of various tasks like translation, summarisation, and question answering.”
— ChatGPT (2024)
I also turned to an academic article by Chang et al. (2023), who define an LLM as follows:
“Language models (LMs) are computational models that can understand and generate human language. LMs have the transformative ability to predict the likelihood of word sequences or generate new text based on a given input. N-gram models, the most common type of LM, estimate word probabilities based on the preceding context. However, LMs also face challenges, such as the issue of rare or unseen words, the problem of overfitting, and the difficulty in capturing complex linguistic phenomena. […] Large Language Models (LLMs) are advanced language models with massive parameter sizes and exceptional learning capabilities.”
— Chang et al. (2023), A survey on evaluation of large language models
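To make the idea of estimating word probabilities from the preceding context more concrete, here is a minimal sketch of a bigram model (an n-gram model with n = 2) in Python. The toy corpus and the function name `bigram_probabilities` are assumptions made for illustration and do not come from Chang et al.; a real language model would be trained on far more text.

```python
from collections import Counter, defaultdict


def bigram_probabilities(text):
    """Estimate P(next_word | previous_word) from raw counts in `text`."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    # Count how often each word follows each other word.
    for previous, following in zip(words, words[1:]):
        counts[previous][following] += 1
    # Turn the counts into relative frequencies per preceding word.
    probabilities = {}
    for previous, followers in counts.items():
        total = sum(followers.values())
        probabilities[previous] = {w: c / total for w, c in followers.items()}
    return probabilities


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat the cat ate the fish"
model = bigram_probabilities(corpus)

print(model["the"])                    # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(model["cat"].get("slept", 0.0))  # 0.0, because "cat slept" never occurs
```

The last line illustrates the "rare or unseen words" challenge mentioned in the quote: a word pair that never appears in the training text receives a probability of zero, even if it is perfectly reasonable English.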
In summary, here are a few characteristics of an LLM:
- It is an AI model
- It is trained on a vast amount of data
- It can “understand” a natural language prompt
- It can produce a natural language response
- It uses probability to interpret prompts and create responses
It is important to note that an LLM builds its responses from patterns it learned during training. Its output therefore tends to be less reliable on topics or styles that were poorly represented in its training data.
When is a question “LLMable”?
In an attempt to find a simpler definition, I turned to Urban Dictionary. They do not have an entry for LLMs, but they do have an entry for “LLMable”:
“The ability to use Large Language Models (LLMs). Reasoning questions are not LLMable, questions about how many cats can fit in a school bus are LLMable.”
— Urban Dictionary
Sometimes an LLM hallucinates: it produces a response that sounds plausible but is incorrect or made up. One way to reduce hallucinations is to only ask the LLM questions that it is able to answer. Not all questions are “LLMable” and, unfortunately, the model is built to produce a plausible-sounding response rather than to check whether that response is correct. That means that if it is not able to answer, it is likely to hallucinate rather than say that it does not know.
What, then, are “LLMable” questions? LLMs are best suited to text-based tasks. Andrey Kudryavets (2023) summarises this simply. You could use an LLM if you want to:
- Shorten a piece of text or summarise it
- Have the spelling and grammar of a text checked
- Change the tone of a piece of writing
- Find synonyms, antonyms, or the right words to explain something
- Extract information from a piece of text
- Manipulate a piece of text in any way
How does an LLM work exactly?
How does the LLM know how to write? An LLM makes use of probability. Essentially, it predicts the most probable next word given the text so far, adds that word to the text, and repeats the process until the response is complete. Even though, intuitively, as a user, it might feel like the LLM “understands” what you are telling it to do, in reality it does not understand. It just successfully predicts what you need.
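The prediction loop can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how ChatGPT is implemented: the probability table `NEXT_WORD_PROBABILITIES` and the function `complete` are hypothetical and hand-written, and the sketch looks only at the single preceding word. A real LLM uses a neural network that takes the whole preceding text into account, works with word fragments called tokens, and often samples from the probabilities rather than always taking the top choice.

```python
# Hypothetical next-word probabilities, hand-written for illustration only.
NEXT_WORD_PROBABILITIES = {
    "large": {"language": 0.9, "dog": 0.1},
    "language": {"models": 0.7, "barrier": 0.3},
    "models": {"predict": 0.6, "walk": 0.4},
    "predict": {"text": 0.8, "rain": 0.2},
}


def complete(prompt, max_new_words=4):
    """Extend `prompt` by repeatedly appending the most probable next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        options = NEXT_WORD_PROBABILITIES.get(words[-1])
        if not options:
            break  # No known continuation for the last word, so stop.
        # Pick the candidate with the highest probability.
        words.append(max(options, key=options.get))
    return " ".join(words)


print(complete("Large"))  # large language models predict text
```

Nothing in this loop checks whether the output is true or sensible. It only follows the probabilities, which is why a fluent response is not necessarily a correct one.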
Summary
The use of LLMs and LLM chatbots like ChatGPT has gained popularity at a rapid rate. This raises the questions: what is an LLM, how does it work, and what types of questions can it answer? Ultimately, an LLM is an AI model that is trained on vast amounts of data, can be prompted in natural language, generates responses in natural language, and uses prediction to do so. It is well suited to generating and manipulating text and is an ideal tool for activities such as summarising, proofreading, and helping you find the right words. More and more businesses are using LLM chatbots as internal or external tools.
Praelexis is the ideal partner for your business’s LLM journey. Contact us today to discuss how LLMs can be used in your business.
Bibliography
Chang, Y., Wang, X., Wang, J., Wu, Y., Yang, L., Zhu, K., Chen, H., Yi, X., Wang, C., Wang, Y. and Ye, W. (2023). A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology.
Kudryavets, A. (2023). Large Language Models: what are they good and not good for? LinkedIn.