Tutorial: Creating Your First QA Pipeline with Retrieval-Augmentation

_{Last Updated:
February 12, 2024}

Level: Beginner
Time to complete: 10 minutes
Components Used: InMemoryDocumentStore, InMemoryBM25Retriever, PromptBuilder, OpenAIGenerator
Prerequisites: You must have an API key from an active OpenAI account as this tutorial is using the gpt-3.5-turbo model by OpenAI.
Goal: After completing this tutorial, you’ll have learned the new prompt syntax and how to use PromptBuilder and OpenAIGenerator to build a generative question-answering pipeline with retrieval-augmentation.

This tutorial uses Haystack 2.0 Beta. To learn more, read the Haystack 2.0 Beta announcement or see Haystack 2.0 Documentation.

Overview

This tutorial shows you how to create a generative question-answering pipeline using the retrieval-augmentation ( RAG) approach with Haystack 2.0. The process involves three main components: InMemoryBM25Retriever for fetching relevant documents, PromptBuilder for creating a template prompt, and OpenAIGenerator for generating responses.

For this tutorial, you’ll use the Wikipedia pages of Seven Wonders of the Ancient World as Documents, but you can replace them with any text you want.

Preparing the Colab Environment

Installing Haystack

Install Haystack 2.0 Beta and datasets with pip:

%%bash

pip install haystack-ai
pip install "datasets>=2.6.1"

Enabling Telemetry

Knowing you’re using this tutorial helps us decide where to invest our efforts to build a better product but you can always opt out by commenting the following line. See Telemetry for more details.

from haystack.telemetry import tutorial_running

tutorial_running(27)

Initializing the DocumentStore

You’ll start creating your question answering system by initializing a DocumentStore. A DocumentStore stores the Documents that the question answering system uses to find answers to your questions. In this tutorial, you’ll be using the InMemoryDocumentStore.

from haystack.document_stores.in_memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()

InMemoryDocumentStore is the simplest DocumentStore to get started with. It requires no external dependencies and it’s a good option for smaller projects and debugging. But it doesn’t scale up so well to larger Document collections, so it’s not a good choice for production systems. To learn more about the different types of external databases that Haystack supports, see DocumentStore Integrations.

The DocumentStore is now ready. Now it’s time to fill it with some Documents.

Fetching and Writing Documents

You’ll use the Wikipedia pages of Seven Wonders of the Ancient World as Documents. We preprocessed the data and uploaded to a Hugging Face Space: Seven Wonders. Thus, you don’t need to perform any additional cleaning or splitting.

Fetch the data and write it to the DocumentStore:

from datasets import load_dataset
from haystack import Document

dataset = load_dataset("bilgeyucel/seven-wonders", split="train")
docs = [Document(content=doc["content"], meta=doc["meta"]) for doc in dataset]
document_store.write_documents(docs)

Initializing the Retriever

Initialize a InMemoryBM25Retriever and make it use the InMemoryDocumentStore we initialized earlier in this tutorial. This Retriever will get the relevant documents to the query:

from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

retriever = InMemoryBM25Retriever(document_store)

Defining a Template Prompt

Create a custom prompt for a generative question answering task using the RAG approach. The prompt should take in two parameters: documents, which are retrieved from a document store, and a question from the user. Use the Jinja2 looping syntax to combine the content of the retrieved documents in the prompt.

Next, initialize a PromptBuilder instance with your prompt template. The PromptBuilder, when given the necessary values, will automatically fill in the variable values and generate a complete prompt. This approach allows for a more tailored and effective question-answering experience.

from haystack.components.builders import PromptBuilder

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{question}}
Answer:
"""

prompt_builder = PromptBuilder(template=template)

Initializing a Generator

Generators are the components that interacts with large language models (LLMs). Now, set OPENAI_API_KEY environment variable and initialize a OpenAIGenerator that can communicate with OpenAI GPT models. If you don’t provide any model, the OpenAIGenerator defaults to gpt-3.5-turbo:

import os
from getpass import getpass

from haystack.components.generators import OpenAIGenerator

os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key: ")
generator = OpenAIGenerator()

You can replace OpenAIGenerator in your pipeline with another Generator. Check out the full list of generators here.

Building the Pipeline

To build a pipeline, add all components to your pipeline and connect them. Create connections from retriever to the prompt_builder and from prompt_builder to llm. Explicitly connect the output of retriever with “documents” input of the prompt_builder to make the connection obvious as prompt_builder has two inputs (“documents” and “question”). For more information on pipelines and creating connections, refer to Creating Pipelines documentation.

from haystack import Pipeline

basic_rag_pipeline = Pipeline()
# Add components to your pipeline
basic_rag_pipeline.add_component("retriever", retriever)
basic_rag_pipeline.add_component("prompt_builder", prompt_builder)
basic_rag_pipeline.add_component("llm", generator)

# Now, connect the components to each other
basic_rag_pipeline.connect("retriever", "prompt_builder.documents")
basic_rag_pipeline.connect("prompt_builder", "llm")

Visualize the Pipeline

Draw the pipeline with the draw() method to confirm the connections are correct. You can find the diagram in the Files section of this Colab.

basic_rag_pipeline.draw("basic-rag-pipeline.png")

That’s it! The pipeline’s ready to generate answers to questions!

Asking a Question

When asking a question, use the run() method of the pipeline. Make sure to provide the question to both the retriever and the prompt_builder. This ensures that the {{question}} variable in the template prompt gets replaced with your specific question.

question = "What does Rhodes Statue look like?"

response = basic_rag_pipeline.run({"retriever": {"query": question}, "prompt_builder": {"question": question}})

print(response["llm"]["replies"][0])

Here are some other example questions to test:

examples = [
    "Where is Gardens of Babylon?",
    "Why did people build Great Pyramid of Giza?",
    "What does Rhodes Statue look like?",
    "Why did people visit the Temple of Artemis?",
    "What is the importance of Colossus of Rhodes?",
    "What happened to the Tomb of Mausolus?",
    "How did Colossus of Rhodes collapse?",
]

🎉 Congratulations! You’ve learned how to create a generative QA system for your documents with RAG approach.

Filtering Documents with Metadata