Adaptive Granularity Retrieval for Retrieval-Augmented Generation

Requirements

M.Sc. in Machine Learning, Data Science, Computer Science, Mathematics, Telecommunications, or similar
Knowledge of Python
Software development skills
Knowledge of signals
Basic knowledge of natural language modelling and semantic embedding
Basic knowledge of retrieval

Description

Retrieval-Augmented Generation (RAG) is arguably one of the technologies with the most traction at the moment. Most RAG systems, however, struggle to reach their full potential because they rely on a fixed retrieval granularity, typically retrieving passages or chunks of uniform size. This often leads to a mismatch between the information need and the retrieved evidence: broad questions like “What are these documents about?” demand high-level summaries, while specific factoid queries like “Where was John born?” require fine-grained snippets. The challenge is to design a retrieval system that dynamically adapts to the level of detail a query requires. This thesis asks: how can we model and predict query intent to select the appropriate retrieval granularity, and how does such an adaptive system affect answer accuracy, coherence, and efficiency compared to fixed-granularity RAG methods?
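
To make the idea of adaptive granularity concrete, the following is a minimal Python sketch of what a granularity router could look like. It is purely illustrative and not the method the thesis will develop: the three granularity levels, the keyword cues, the function names (predict_granularity, retrieve) and the toy index contents are all assumptions made for this example. In the thesis, the hand-written rules would presumably be replaced by a learned model of query intent, and the word-overlap scoring by semantic embeddings.

```python
import re

# Hypothetical cue lists; a real system would learn query intent from data.
BROAD_CUES = ("about", "overview", "summarize", "summary", "main topics")
FACTOID_STARTS = ("who", "where", "when", "which", "how many")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def predict_granularity(query: str) -> str:
    """Toy intent predictor: map a query to 'summary', 'passage' or 'snippet'."""
    q = query.lower().strip()
    if any(cue in q for cue in BROAD_CUES):
        return "summary"   # broad question: coarse, high-level evidence
    if q.startswith(FACTOID_STARTS):
        return "snippet"   # factoid question: fine-grained evidence
    return "passage"       # default: fixed-size chunks, as in standard RAG


def retrieve(query: str, index: dict[str, list[str]], top_k: int = 1):
    """Pick a granularity level, then rank that level's units by word overlap."""
    level = predict_granularity(query)
    q_terms = set(tokenize(query))
    ranked = sorted(
        index[level],
        key=lambda unit: len(q_terms & set(tokenize(unit))),
        reverse=True,
    )
    return level, ranked[:top_k]


# Made-up toy index holding the same corpus at three granularities.
toy_index = {
    "summary": ["The documents describe the life and career of John."],
    "passage": ["John was born in Turin and later studied mathematics there."],
    "snippet": ["John was born in Turin.", "John studied mathematics."],
}

for question in ("What are these documents about?", "Where was John born?"):
    print(question, "->", retrieve(question, toy_index))
```

A fixed-granularity baseline corresponds to always returning "passage" from predict_granularity, which gives a simple reference point for comparing accuracy and efficiency.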

Contacts

Lorenzo Bongiovanni