Skip to main content

Command Palette

Search for a command to run...

Quick Intro to RAG-Fusion πŸš€

Published
β€’3 min read
A
AI data engineer wiring agents, infra, and unapologetic build logs

RAG-Fusion is like having a team of detectives πŸ•΅οΈβ€β™‚οΈ investigating every angle of a case, instead of relying on just one perspective. It boosts retrieval quality by generating multiple variations of a query and then reranking the results for maximum relevanceβ€”ensuring your AI always focuses on the most important evidence!


Why Does RAG-Fusion Matter? 🎯

  1. Covers More Ground 🌍 – Generates multiple query variations to catch different angles of user intent.

  2. Improves Accuracy πŸ” – Uses Reciprocal Rank Fusion (RRF) to combine scores and prioritize the best results.

  3. Reduces Bias πŸ›‘οΈ – Avoids over-relying on exact keyword matches by reranking based on semantic meaning.

  4. Enhances Context πŸ“š – Pulls richer data, ensuring contextually aware responses from the LLM.


Where Does RAG-Fusion Kick In? πŸ”„

Key Steps:

  1. Multi-Query Generation (Step 5) – Generates different versions of the query to widen the search.

  2. Reciprocal Rank Fusion (Step 6) – Reranks results based on scores from multiple queries.

  3. Final Answer Generation (Step 7) – Combines optimized results into a precise response.


Full RAG-Fusion Code πŸ’»

import os
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from operator import itemgetter

# 1. Set Environment Keys
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = ''  # Add your LangSmith API key
os.environ['LANGCHAIN_PROJECT'] = 'RAG-Fusion'
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"] = ''  # Add your OpenAI API key
if OPENAI_API_KEY == "":
    raise ValueError("Please set the OPENAI_API_KEY environment variable")

# 2. Load and Split Documents
docs = ["example_doc1.txt", "example_doc2.txt"]  # Replace with your file paths
loaded_docs = []
for doc in docs:
    with open(doc, 'r') as file:
        loaded_docs.append(file.read())

# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=60)
chunks = text_splitter.split_documents(loaded_docs)

# 3. Index Documents
vectorstore = Chroma.from_documents(chunks, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# 4. Generate Multi-Queries (RAG-Fusion Step 5)
template = """You are an AI assistant tasked with generating search queries for a vector search engine. 
Generate 5 variations of the following question to capture different aspects of the query:
Original question: {question}"""
prompt_template = ChatPromptTemplate.from_template(template)

multi_query_chain = (
    prompt_template
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)

# 5. RRF: Reciprocal Rank Fusion (Step 6)
def reciprocal_rank_fusion(results, k=60):
    """Re-rank results based on Reciprocal Rank Fusion."""
    fused_scores = {}
    for docs in results:
        for rank, doc in enumerate(docs):
            doc_str = str(doc)
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            fused_scores[doc_str] += 1 / (rank + k)
    reranked_results = sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    return [doc[0] for doc in reranked_results]

# Process Queries
query = "What is LangSmith, and why do we need it?"
query_variations = multi_query_chain.invoke({"question": query})

retrieved_docs = []
for variation in query_variations:
    retrieved_docs.append(retriever.invoke(variation))  # Retrieve docs for each variation

reranked_docs = reciprocal_rank_fusion(retrieved_docs)

# 6. Final RAG Model (Step 7)
response_template = """Answer the following question based on this context:
{context}
Question: {question}"""

prompt = ChatPromptTemplate.from_template(response_template)

llm = ChatOpenAI(temperature=0)

final_rag_chain = (
    {"context": RunnablePassthrough(), "question": itemgetter("question")}
    | prompt
    | llm
    | StrOutputParser()
)

result = final_rag_chain.invoke({"context": reranked_docs, "question": query})
print(result)

Key RAG-Fusion Steps Highlighted πŸ”₯

  1. Multi-Query Generation (Step 5)

     multi_query_chain = (
         prompt_template
         | ChatOpenAI(temperature=0)
         | StrOutputParser()
         | (lambda x: x.split("\n"))
     )
    
    • Purpose: Generates 5 rephrased queries to capture different intents.
  2. Reciprocal Rank Fusion (RRF) (Step 6)

     reranked_docs = reciprocal_rank_fusion(retrieved_docs)
    
    • Purpose: Combines results across all queries and reranks based on scores.
  3. Final Answer Generation (Step 7)

     final_rag_chain = (
         {"context": RunnablePassthrough(), "question": itemgetter("question")}
         | prompt
         | llm
         | StrOutputParser()
     )
    
    • Purpose: Uses reranked documents to generate the final answer.

Key Takeaway 🍰

RAG-Fusion is like hiring a detective squad πŸ•΅οΈβ€β™‚οΈ to analyze your question from multiple angles, filter out irrelevant data, and deliver the best evidence to the LLM for answering accurately. πŸ’‘

More from this blog

Anix Lynch – Technical Notes & Engineering Playbooks

176 posts

Deploying mode πŸš€ one csv at a time.