Classifying text using LLMs

Learn how to automatically tag text using Large Language Models, and what the trade-offs between the different methods are!

 

Text classification is one of the most common use cases in Natural Language Processing, with numerous practical applications – now easier to access with Large Language Models. Companies use text classification in multiple scenarios to become more efficient:

  • Tagging large volumes of data: reducing manual labor with better filtering, automatically organizing large volumes of text.
  • Enhancing Search/Recommendation Systems: Search and recommendation can be enhanced by a better understanding of the search queries.
  • Sentiment Analysis: Understanding public opinion and customer feedback by determining the emotion expressed in text.
  • Customer Support: Facilitate ticket prioritization and routing to the correct team by categorizing customer support tickets.

All of these use cases were solvable in the past without LLMs. However, the rise of these models has reduced the amount of training data needed to obtain good results, and has also increased the average performance on these use cases while taking less time to reach it!

In this blog post, we will cover several text classification techniques from before and after the rise of the most recent LLMs (OpenAI, LLaMA, Bing, …).

Most common techniques for Text Classification using Large Language Models

The most common techniques for text classification are:

  • Zero-Shot Classification: asking a model for a label directly, without giving any examples. Although it’s the simplest option and needs no annotated data, performance is quite limited, and you can end up with an output that is not part of your fixed class list (hallucination).
    • Pre-LLMs: Using open-source models such as TARS.
    • Post-LLMs: Directly asking an LLM to generate a label, specifying the expected output structure in the prompt. This approach is slower than the pre-LLM one, although much more accurate.
  • Few-Shot Classification: you pass a few examples per class, so only a small amount of annotated data is required.
    • Pre-LLMs: Using open-source models such as TARS.
    • Post-LLMs: Passing samples of each class in the prompt’s context. This is typically more accurate than zero-shot prompting.
  • Raw embedding feature extraction: we convert the text into a numerical representation (embedding) and train a model on top of it, which returns a probability score that can be used for making decisions. However, this requires a larger amount of annotated data.
    • Pre-LLMs: Using open-source embeddings such as GloVe.
    • Post-LLMs: Using OpenAI embeddings, which are trained on larger amounts of data and typically outperform other embedding methods. This is a paid option, so you need to weigh the trade-offs against an open-source solution.
  • Embeddings of enriched text: before extracting the embeddings, we try to uncover more information about the text, “enriching” it.
    • Pre-LLMs: Not frequently used.
    • Post-LLMs: Ask the LLM for more information about the text. For example, if the text is a Google Search query, the LLM can describe what that search encompasses. It’s slower than pre-LLM approaches, but it’s the technique with the highest scores we’ve seen so far.
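
The zero-shot and few-shot LLM approaches above can be sketched as follows. This is a minimal illustration, not a reference implementation: the label set, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns the model's text) are assumptions made for the example. The part worth noting is `parse_label`, which guards against the model answering with something outside the fixed class list.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical label set for a support-ticket use case (not from the article).
LABELS = ["billing", "technical", "other"]


def build_prompt(
    text: str,
    labels: Sequence[str],
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Build a classification prompt; with no examples it is zero-shot."""
    parts = [
        "Classify the text into exactly one of these labels: "
        + ", ".join(labels)
        + ".",
        "Answer with the label only.",
    ]
    # Few-shot: show annotated samples before the text to classify.
    for sample_text, sample_label in examples:
        parts.append(f"Text: {sample_text}\nLabel: {sample_label}")
    parts.append(f"Text: {text}\nLabel:")
    return "\n\n".join(parts)


def parse_label(raw: str, labels: Sequence[str]) -> Optional[str]:
    """Map the raw completion to a known label, or None if it is off-list."""
    cleaned = raw.strip().strip(".\"'").lower()
    for label in labels:
        if cleaned == label.lower():
            return label
    return None


def classify(
    text: str,
    llm: Callable[[str], str],
    labels: Sequence[str] = LABELS,
    examples: Sequence[Tuple[str, str]] = (),
    fallback: str = "other",
) -> str:
    """Ask the LLM for a label and fall back when the answer is off-list."""
    raw = llm(build_prompt(text, labels, examples))
    return parse_label(raw, labels) or fallback


if __name__ == "__main__":
    # Stub standing in for a real LLM call, so the sketch runs offline.
    def fake_llm(prompt: str) -> str:
        return " Billing.\n"

    few_shot = [("The app crashes on login", "technical")]
    print(classify("I was charged twice this month", fake_llm, examples=few_shot))
```

Passing an empty `examples` list gives the zero-shot variant; adding a few annotated samples per class turns the same prompt into the few-shot one.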

An example of such an enrichment prompt:

“Let’s assume you’re an Encyclopedia, and you have to define the concepts I’m providing. Your explanation must be succinct (couple of paragraphs), like the summary section of a Wikipedia article talking about the concept. (…)”
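
The two embedding-based approaches can be sketched in a few lines. Everything here is a stand-in chosen so the example runs without external services: `toy_embed` (letter counts) replaces a real embedding model such as GloVe or OpenAI embeddings, the nearest-centroid classifier with a softmax replaces whatever model you would actually train on top of the embeddings, and the `enrich` callable is a hypothetical wrapper around an LLM call using an encyclopedia-style prompt. Concatenating the original text with its description before embedding is also an assumption of this sketch.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Embedder = Callable[[str], Vector]


def toy_embed(text: str) -> Vector:
    """Stand-in embedding (letter counts); swap in a real embedding model."""
    lowered = text.lower()
    return [float(lowered.count(c)) for c in "abcdefghijklmnopqrstuvwxyz"]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fit_centroids(
    samples: Sequence[Tuple[str, str]], embed: Embedder
) -> Dict[str, Vector]:
    """Average the embeddings of the annotated samples of each class."""
    grouped: Dict[str, List[Vector]] = {}
    for text, label in samples:
        grouped.setdefault(label, []).append(embed(text))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict_proba(
    text: str,
    centroids: Dict[str, Vector],
    embed: Embedder,
    temperature: float = 0.1,
) -> Dict[str, float]:
    """Softmax over cosine similarities: a probability-like score per class."""
    vector = embed(text)
    scores = {
        label: cosine(vector, centroid) / temperature
        for label, centroid in centroids.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - top) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


def enriched(embed: Embedder, enrich: Callable[[str], str]) -> Embedder:
    """Wrap an embedder so the text is enriched (e.g. by an LLM) first."""
    return lambda text: embed(f"{text}\n{enrich(text)}")


if __name__ == "__main__":
    train = [
        ("refund my invoice", "billing"),
        ("app crashes on login", "technical"),
    ]
    # Raw embeddings.
    centroids = fit_centroids(train, toy_embed)
    print(predict_proba("invoice refund", centroids, toy_embed))

    # Enriched text: a real `describe` would send the text to an LLM.
    describe = lambda text: f"A short description of: {text}"
    rich_embed = enriched(toy_embed, describe)
    rich_centroids = fit_centroids(train, rich_embed)
    print(predict_proba("invoice refund", rich_centroids, rich_embed))
```

Because the output is a score per class, you can accept the top label only when its score exceeds a threshold and route the remaining texts to manual review.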

Below is a comparative chart, summarizing the trade-offs of the methods in terms of required data, speed and accuracy.

 

Conclusion

We showed you several ways of doing text classification using Large Language Models. LLMs allow you to reach acceptable performance in a few hours of work and are pretty good for an initial benchmark. Even so, don’t forget about older methods, which can be a fallback when you want faster outcomes or when paying for LLM requests is not feasible at the scale of your use case.

Want to revolutionize the way you do text classification? Learn more by contacting us!

Do you want to further discuss this idea?

Book a meeting with Paulo Maia
