This post gives an overview of the research on retrieval-augmented generation (RAG) and explains the difference between frozen, advanced, and fully dynamic RAG.

This article draws inspiration from the excellent lecture "Stanford CS25: V3 I Retrieval Augmented Language Models" by Douwe Kiela, who, along with Patrick Lewis, Ethan Perez, et al., invented RAG in May 2020.

The idea of enabling computers to extract information from a knowledge base to assist in language tasks goes back decades, with early question-answering systems from the 1960s and IBM's Watson Jeopardy! system having similar conceptual underpinnings. To understand the origins of the first RAG-like systems in 2017 and RAG's invention in 2020, we have to understand the underlying retrieval technology.

Technology tree of RAG research development, featuring representative works (Gao et al., 2023)

Taxonomy of RAG's Core Components

Retrieval

Sparse vs Dense Retrieval
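
Before getting to the papers, here is a minimal sketch of the difference between the two paradigms, assuming the third-party rank_bm25 and sentence-transformers packages (any BM25 implementation and bi-encoder pair would do): sparse retrieval scores documents by weighted term overlap, while dense retrieval compares learned embeddings.

```python
# A minimal sketch contrasting sparse and dense retrieval.
# Assumes the `rank_bm25` and `sentence-transformers` packages.
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

docs = [
    "RAG combines a retriever with a generator.",
    "BM25 ranks documents by lexical term overlap.",
    "Dense retrievers embed text into vectors.",
]
query = "How do vector embeddings help search?"

# Sparse: score by weighted term overlap -- exact words must match.
bm25 = BM25Okapi([d.lower().split() for d in docs])
sparse_scores = bm25.get_scores(query.lower().split())

# Dense: score by cosine similarity between learned embeddings --
# semantically related words match even without lexical overlap.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = model.encode(docs, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
dense_scores = util.cos_sim(query_emb, doc_emb)[0]

print("sparse:", sparse_scores)  # favors exact-word matches
print("dense :", dense_scores)   # favors semantic matches
```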

ORQA: Latent Retrieval for Weakly Supervised Open Domain Question Answering


Vector DBs and Sparse & Dense Hybrids

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT


SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

With term expansion applied to the query, we get a much larger overlap between query and document terms, because we are now able to identify similar words.
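
To make the idea concrete, here is a toy sketch of term expansion, with a hand-written synonym table standing in for the expansion weights that SPLADE actually learns with a BERT-based model:

```python
# A toy sketch of query term expansion (not SPLADE itself; the
# synonym table below is made up for illustration).
EXPANSIONS = {
    "car": ["automobile", "vehicle"],
    "fast": ["quick", "rapid"],
}

def expand(terms):
    expanded = set(terms)
    for t in terms:
        expanded.update(EXPANSIONS.get(t, []))
    return expanded

query = {"fast", "car"}
document = {"a", "rapid", "automobile", "for", "sale"}

print(query & document)           # set(): no overlap without expansion
print(expand(query) & document)   # {'rapid', 'automobile'} after expansion
```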

DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval


SANTA: Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data

🧊 Frozen vs Dynamic RAG đŸ”„

The industry has mostly treated the components of the RAG architecture as separate pieces that work in isolation.
We can call this “Frozen RAG”. In contrast, some research has focused on iteratively improving the individual components (we can call this “Advanced RAG”).

Ideally, in a “Fully Dynamic” model, the gradients from the loss function would flow back through the entire system (end-to-end training): retriever, generator, and document encoder.
However, this is computationally challenging, in particular because every update to the document encoder invalidates the pre-computed document index, and it has not yet been done successfully at scale.
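
To make the distinction concrete, here is a schematic sketch (using toy PyTorch modules as stand-ins; the names and shapes are illustrative) of the three components and where gradients would flow in the fully dynamic case:

```python
# A schematic sketch of the three RAG components, with toy torch
# modules standing in for real encoders and language models.
import torch
import torch.nn as nn

doc_encoder = nn.Linear(16, 8)    # embeds documents into the index
query_encoder = nn.Linear(16, 8)  # the retriever's query side
generator = nn.Linear(8, 16)      # stand-in for the language model

# Frozen RAG: nothing is trained; components are simply chained at inference.
for m in (doc_encoder, query_encoder, generator):
    m.requires_grad_(False)

# Fully dynamic RAG (the ideal): unfreeze everything and backpropagate
# end-to-end. The catch: every doc_encoder update stales the *entire*
# pre-computed document index, which must then be re-embedded.
for m in (doc_encoder, query_encoder, generator):
    m.requires_grad_(True)

query = torch.randn(1, 16)
doc = torch.randn(1, 16)
score = query_encoder(query) @ doc_encoder(doc).T  # retrieval score
out = generator(query_encoder(query))              # generation
loss = out.sum() + score.sum()
loss.backward()  # gradients reach retriever, generator, and doc encoder
```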

đŸ”„ Dynamic Retriever but Fixed Generator 🧊

In-Context Retrieval-Augmented Language Models

REPLUG: Retrieval-Augmented Black-Box Language Models


DREditor: A Time-efficient Approach for Building a Domain-specific Dense Retrieval Model


🧊 Fixed Retriever but Dynamic Generator đŸ”„

FiD: Fusion-in-Decoder


KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering


SURGE: Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation


KNN-LM: Generalization through Memorization: Nearest Neighbor Language Models

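The core idea of kNN-LM is to interpolate the base model's next-token distribution with a distribution induced by the nearest neighbors of the current context in a datastore of (context embedding, next token) pairs. A toy sketch follows; the datastore, embeddings, and interpolation weight are made up for illustration.

```python
# Toy sketch of the kNN-LM blend: p(y) = λ·p_kNN(y) + (1-λ)·p_LM(y).
import numpy as np

def knn_lm_probs(p_lm, context_emb, datastore, lam=0.25, k=2):
    """Blend the LM distribution with a nearest-neighbor distribution."""
    keys = np.array([key for key, _ in datastore])
    vals = [val for _, val in datastore]
    dists = np.linalg.norm(keys - context_emb, axis=1)
    nearest = np.argsort(dists)[:k]
    # Neighbors vote for their stored next token, weighted by softmax(-dist).
    w = np.exp(-dists[nearest])
    w /= w.sum()
    p_knn = np.zeros_like(p_lm)
    for idx, weight in zip(nearest, w):
        p_knn[vals[idx]] += weight
    return lam * p_knn + (1 - lam) * p_lm

# vocab: ["paris", "london", "rome"]
p_lm = np.array([0.4, 0.35, 0.25])          # base LM distribution
datastore = [(np.array([0.1, 0.9]), 0),     # (context embedding, token id)
             (np.array([0.2, 0.8]), 0),
             (np.array([0.9, 0.1]), 1)]
print(knn_lm_probs(p_lm, np.array([0.15, 0.85]), datastore))
```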

RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

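The original RAG formulation treats the retrieved document as a latent variable and marginalizes over the top-k retrieved documents: p(y|x) = ÎŁ_z p(z|x) · p(y|x, z). A toy sketch with made-up probabilities:

```python
# Toy sketch of RAG's marginalization over retrieved documents.
retrieval_probs = {"doc_a": 0.7, "doc_b": 0.3}     # p(z|x)
gen_prob_given_doc = {"doc_a": 0.9, "doc_b": 0.2}  # p(y|x, z)

p_y_given_x = sum(retrieval_probs[z] * gen_prob_given_doc[z]
                  for z in retrieval_probs)
print(p_y_given_x)  # 0.7*0.9 + 0.3*0.2 = 0.69
```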

RETRO: Improving language models by retrieving from trillions of tokens


Fully Dynamic RAG

REALM: Retrieval-Augmented Language Model Pre-Training


Other Retrieval Research

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval


FLARE: Forward-Looking Active Retrieval Augmented Generation

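FLARE's loop drafts the next sentence and, if the model's token confidence was low, retrieves using the draft itself as the query before regenerating. A toy sketch of the control flow; draft_sentence, retrieve, regenerate, and the threshold are illustrative stand-ins.

```python
# Toy sketch of FLARE's forward-looking retrieval loop.
CONFIDENCE_THRESHOLD = 0.6

def draft_sentence(context):
    # Stand-in for LM decoding; returns (text, min token probability).
    return "The treaty was signed in 1921.", 0.42

def retrieve(query):
    # Stand-in for any retriever.
    return ["The Anglo-Irish Treaty was signed on 6 December 1921."]

def regenerate(context, evidence):
    # Stand-in for regenerating the sentence with evidence in context.
    return "According to the retrieved source, the treaty was signed in 1921."

def flare_step(context):
    sentence, confidence = draft_sentence(context)
    if confidence < CONFIDENCE_THRESHOLD:
        # Low confidence: use the *draft itself* as the retrieval query.
        evidence = retrieve(sentence)
        sentence = regenerate(context, evidence)
    return sentence

print(flare_step("When was the treaty signed?"))
```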

HyDE: Hypothetical Document Embeddings

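HyDE first asks an LLM to write a hypothetical document that answers the query, then embeds that document (rather than the raw query) and retrieves the real documents nearest to it. A minimal sketch; generate and embed are hypothetical stand-ins for any instruction-tuned LLM and any dense encoder.

```python
# Minimal sketch of the HyDE flow with stand-in components.
import numpy as np

def generate(prompt: str) -> str:
    # Stand-in: a real system would call an LLM here.
    return "Paris is the capital of France and its largest city."

def embed(text: str) -> np.ndarray:
    # Stand-in: a real system would use a dense encoder (bi-encoder).
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=8)
    return v / np.linalg.norm(v)

def hyde_search(query: str, index: list[tuple[str, np.ndarray]]) -> str:
    # 1) Ask the LLM for a *hypothetical* answer document.
    hypothetical_doc = generate(f"Write a passage answering: {query}")
    # 2) Embed the hypothetical document, not the raw query.
    q_vec = embed(hypothetical_doc)
    # 3) Retrieve the real document closest to that embedding.
    return max(index, key=lambda item: float(item[1] @ q_vec))[0]

index = [(doc, embed(doc)) for doc in
         ["Paris is the capital of France.", "Rust is a systems language."]]
print(hyde_search("What is the capital of France?", index))
```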

MuGI: Enhancing Information Retrieval through Multi-Text Generation Integration with Large Language Models


Query Rewriting for Retrieval-Augmented Large Language Models


Lost in the Middle: How Language Models Use Long Contexts


Augmentation/interactivity

CRAG: Corrective Retrieval Augmented Generation


WebGPT: Browser-assisted question-answering with human feedback


Toolformer: Language Models Can Teach Themselves to Use Tools


Gorilla: Large Language Model Connected with Massive APIs


Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection


GRIT: Generative Representational Instruction Tuning


Conclusion

Further Reading

Other summaries/literature reviews

I plan to summarize more of these here when I find the time. If you think I’m missing a paper, feel free to leave a comment or DM me about it:

Filtering and ranking

Transformer memory

Multi-modality

Knowledge Graphs & Reasoning

Other Reasoning Techniques

Instruction & Memory

References

Thoughts đŸ€” by Soumendra Kumar Sahoo is licensed under CC BY 4.0