Introduction to RAG — GenAI Systems for Knowledge

Leon Zucchini

Feb 7, 2024

Retrieval-Augmented Generation (RAG) is reshaping how we use generative AI to work with information — as individuals and as teams. In this guide, we’ll introduce you to how RAG works, and explain its advantages and applications to keep your team informed and efficient.

Read on to learn:

  1. What is RAG? It’s like ChatGPT for your data

  2. How can I use RAG? Working with personal and company knowledge

  3. What advantages does RAG offer? It works better than a pure LLM

  4. How do RAG systems work? They combine retrieval and generation

  5. How can I get RAG for myself/my team? Use Curiosity or build it yourself

Let’s jump right in!

1. What Is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation (RAG) is a technique for building knowledge systems that work like a personal ChatGPT for your company’s data, making it easier to find and use the knowledge you need. It helps you work with a large amount of information quickly and efficiently.

RAG systems work in two steps:

  1. Retrieval: The system digs through your data to find useful pieces of information.

  2. Generation: A generative AI model uses the retrieved information to create clear and accurate answers to your questions.

Bonus: What’s the difference between RAG and LLM?

Large Language Models (LLMs) are models that generate text (e.g. for ChatGPT). RAG combines two systems: Retrieval to get information from a data source, and an LLM to generate a response. LLMs are a component of RAG systems (and of course they’re used for other things as well).

2. How can I use RAG?

For Individuals and Teams

For individuals, RAG acts as a personal AI assistant, efficiently navigating through personal data to answer queries or help craft customized content like emails or summaries.

In a team setting, RAG becomes an invaluable asset for managing shared knowledge: it combines inputs from various data sources to give the team quick insights, provide consistent answers to common questions, and spread knowledge to newcomers.

Common Use-Cases

RAG systems are flexible and can be used in many ways, whether by an individual or an entire team.

  • Question Answering: RAG models shine when asked to answer questions, as they can fetch the necessary details from relevant sources and craft a response that’s clear and informative, citing where the information was found.

  • Document Summarization: Faced with lengthy documents, RAG can identify the main points and distill them into short, digestible summaries.

  • Content Generation: When creating articles or reports, RAG aids by incorporating pertinent information from various sources into a cohesive piece. You can even use it to write emails.

These examples just scratch the surface of what’s possible with RAG. As people continue to experiment and innovate, even more applications are emerging, expanding the ways we can use this technology.

3. What advantages does RAG offer?

Everyone wants a “ChatGPT for their data,” but building reliable systems with only LLMs can be challenging. RAG helps by augmenting the LLMs with additional information. That helps them provide context-sensitive responses that outperform what you’d get from purely generative models.

For knowledge systems, RAG has several advantages over “naked” LLM systems:

  • Accuracy: RAG reduces “hallucination,” where LLMs give plausible but incorrect information. It does that by “grounding” the LLM’s responses in accurate data retrieved from your team’s data sources.

  • Transparency: Good RAG systems can provide references that let users check where the information came from, adding a layer of trust and accountability to the answers provided by RAG models.

  • Customization: RAG systems can use specific data from your company or field (e.g., naming conventions), making them adaptable and ensuring responses are relevant to your unique context.

4. How Do RAG Systems Work?

RAG systems combine two parts: Retrieval and Generation.

Retrieval: When a user types a prompt, the retrieval part is tasked with sourcing relevant information. It searches through a knowledge base — e.g. a text corpus, knowledge graph, or database — using the user prompt to pinpoint the most pertinent data.

There are many methods for retrieval. A common one is “vector search” (aka “semantic search”), which matches information to the user prompt by meaning rather than by exact keywords. Sometimes an LLM is also used to set up the vector search, but that’s optional.
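To make the retrieval step concrete, here is a minimal sketch of vector search in Python. It is an illustration under loud assumptions, not how any particular product does it. The `embed` function is a toy stand-in that just counts words, whereas real systems use a learned embedding model that captures meaning. The function names, the sample documents, and the `top_k` parameter are all made up for this example. What carries over to real systems is the mechanic: turn the prompt and the documents into vectors, score them by similarity, and keep the best matches.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    Assumption: a real RAG system would call a learned embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share nothing with the query.
    return [doc for score, doc in scored[:top_k] if score > 0]


# Hypothetical knowledge base for illustration only.
documents = [
    "Vacation requests must be submitted two weeks in advance.",
    "The office printer is on the second floor.",
    "Expense reports are due at the end of each month.",
]

results = retrieve("How do I submit a vacation request?", documents, top_k=1)
print(results)
```

In this toy version the match comes from the shared word “vacation”. A learned embedding model would also connect “submit” with “submitted”, or “time off” with “vacation”, which is what makes the search semantic.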

Generation: Once the information is retrieved, the LLM steps in for generation. It takes the user prompt and tries to respond while using the retrieved information (aka “context”). Basically, the LLM gets the instruction: “Respond to the user prompt using these documents”.
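The generation step can be sketched in a few lines of Python as well. This is a hedged illustration: the prompt wording, the `source` and `text` fields, and the `echo_llm` placeholder are assumptions made up for this example, and a real system would call an actual LLM where the placeholder is used. The idea it shows is that the retrieved documents are pasted into the prompt as numbered context, so the model can ground its answer in them and point back to where the information came from.

```python
def build_prompt(user_prompt, retrieved_docs):
    """Combine the user prompt with retrieved documents (the "context")."""
    context_lines = [
        f"[{i}] ({doc['source']}) {doc['text']}"
        for i, doc in enumerate(retrieved_docs, start=1)
    ]
    context = "\n".join(context_lines)
    return (
        "Respond to the user prompt using these documents. "
        "Cite the documents you used by their number.\n\n"
        f"Documents:\n{context}\n\n"
        f"User prompt: {user_prompt}"
    )


def answer(user_prompt, retrieved_docs, llm):
    """Send the assembled prompt to any callable that maps text to text."""
    return llm(build_prompt(user_prompt, retrieved_docs))


def echo_llm(prompt):
    """Hypothetical placeholder for a real LLM call."""
    return f"(model output for a prompt of {len(prompt)} characters)"


# Hypothetical retrieval results for illustration only.
retrieved_docs = [
    {
        "source": "hr-handbook.md",
        "text": "Vacation requests must be submitted two weeks in advance.",
    },
]

prompt = build_prompt("How do I request vacation?", retrieved_docs)
print(prompt)
print(answer("How do I request vacation?", retrieved_docs, echo_llm))
```

Numbering the documents and keeping their source names in the prompt is one simple way to let the answer include references that users can check.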

5. How can I get RAG for myself/my team?

So how can you get RAG for your data to be more productive? If you’re a developer, you could build a system from the ground up — there are several step-by-step guides around (e.g. here or here). Alternatively, you could use Curiosity.

  • Individuals: For individuals, the Curiosity Desktop App offers RAG capabilities like summarization and talk-to-documents, with talk-to-folders coming soon. It’s designed to improve your personal workflow and productivity.

  • Teams: For companies, Curiosity Workspaces come with pre-installed RAG systems for team knowledge. These can be further customized to fit a team’s specific requirements, helping teams share knowledge and work more efficiently.

Curiosity’s key benefit is its ready-to-use system that integrates with your data sources, ensures robust data security, and is optimized to handle large datasets. It even allows you to choose the LLM that best suits your needs.

If you’re interested in trying RAG with Curiosity, download the app or get in touch about a Team Workspace.

If you’re curious to know more about RAG models and LLMs:
