If you are using LLM RAG – you should be doing RAFT

• This is a three-way collaboration between Berkeley AI, Microsoft Azure, and Meta AI! RAFT targets domain-specific RAG, a narrower and increasingly popular setting than the general open-book exam: the domain the LLM will be evaluated on is known in advance and is available at inference time. The LLM can then answer prompts by drawing on any and all information from that domain, on which it has been specifically fine-tuned. (A rough sketch of the data recipe follows below.)

Blog: https://gorilla.cs.berkeley.edu/blogs/9_raft.html
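
    A minimal sketch of what this data recipe could look like in practice, based on the blog's description: each training record pairs a question with a context that mixes the oracle (golden) document with distractor documents, and for some fraction of records the oracle document is dropped entirely. All names here (make_raft_example, p_oracle, the <DOC> markup) are illustrative assumptions, not from the RAFT codebase:

    ```python
    import random

    def make_raft_example(question, oracle_doc, distractor_docs, cot_answer,
                          p_oracle=0.8, n_distractors=3):
        """Build one RAFT-style fine-tuning record.

        With probability p_oracle the context contains the oracle document
        among the distractors; otherwise it holds distractors only, which
        pushes the model to memorize the domain knowledge instead."""
        docs = random.sample(distractor_docs, n_distractors)
        if random.random() < p_oracle:
            docs.append(oracle_doc)  # golden document present in context
        random.shuffle(docs)
        context = "\n\n".join(f"<DOC>{d}</DOC>" for d in docs)
        return {
            "prompt": f"{context}\n\nQuestion: {question}",
            # RAFT trains on chain-of-thought answers that cite the source doc.
            "completion": cot_answer,
        }
    ```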

  • It's "Retrieval Augmented Fine Tuning". The related blog post is interesting: https://gorilla.cs.berkeley.edu/blogs/9_raft.html

  • > Retrieval Aware Fine-Tuning (RAFT), presents a novel recipe to prepare fine-tuning data to tailor the models for domain-specific open-book setting, equivalent to in-domain RAG.

I can't see the insight on why this is limited to a specific domain. The key problem being solved is the well-known one where RAG returns an irrelevant chunk, and the "benefit" seems to be training a model to ignore irrelevant chunks.

I am guessing it costs money to train on multiple domains, so they limited their research to one domain at a time, but I'm not sure if there is a bigger reason why this couldn't be an approach to a general fine-tuned "answer only from relevant chunks" model. The paper seems to imply this only works for specific domains, but I can't see why.

  • From [1]:

    > We demonstrate that our RAG approach trains the model to perform better RAG on the set of documents it is trained on i.e., in-domain. By removing the oracle documents in some instances of the training data, we are compelling the model to memorize domain-knowledge.

What if you wanted to train it to say that it didn't find the answer? (One possible extension is sketched below.)

    [1] https://gorilla.cs.berkeley.edu/blogs/9_raft.html
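
    One hypothetical way to get that behavior: extend the recipe with records whose context holds only distractors and whose target is an explicit abstention. This is an invented illustration (make_refusal_example and the refusal string are not from the paper), and note that it works against the quoted design, which removes the oracle document precisely to compel memorization rather than abstention:

    ```python
    import random

    REFUSAL = "The provided documents do not contain the answer."

    def make_refusal_example(question, distractor_docs, n_distractors=4):
        """A record with a distractor-only context and an abstention target,
        teaching the model to say it didn't find the answer rather than
        answer from memorized knowledge."""
        docs = random.sample(distractor_docs, n_distractors)
        context = "\n\n".join(f"<DOC>{d}</DOC>" for d in docs)
        return {
            "prompt": f"{context}\n\nQuestion: {question}",
            "completion": REFUSAL,
        }
    ```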

• Does any of this matter when you have a 5-million-token input window and can just shove everything into the prompt?

  • > They hypothesized that a student who studies the textbooks before the open-book exam was likely to perform better than a student who studies the textbook.

    interesting hypothesis, but probably not what was meant.

    “editor” is shaping up to be the hot new job for fleshy consciousnesses by the early thirties.

  • Page me when this process is at least partially automated and continuous.

  • RAFT is amazing!!

  • Am I missing the point or is this not how everyone was doing RAG before?

I've had much greater success doing the RAG process on a fine-tuned model.