Can LLMs Achieve High Recall?

Explore if current LLMs can reliably piece together information from multiple sources to answer complex questions, mimicking human research processes for RAG systems.

Overview

Imagine you want to create a RAG system to answer questions like, “What are the main topics department X at company Y is working on?” Now imagine there is no single document within the department that provides an up-to-date overview of these main topics (who really has these, anyway?). As a human, what would you do (assuming you can’t just ask the team leader, of course ;))? You would likely dig into multiple data sources from the department, apply some filters, run some queries, iterate multiple times, collect notes, and eventually piece together the information.

My question is: can we create an LLM application that can perform this task in a somewhat reliable way?

Tech stack