Exploring Multi-Table Retrieval Through Iterative Search
Published in AI for Tabular Data Workshop @ EurIPS, 2025
Open-domain question answering over data lakes requires retrieving and composing information from multiple tables, a challenging subtask that demands both semantic relevance and structural coherence (e.g., joinability). While exact optimization methods such as Mixed-Integer Programming (MIP) can ensure coherence, their computational cost is often prohibitive. This paper frames multi-table retrieval as an iterative search process and argues that this approach offers advantages in scalability, interpretability, and flexibility. We propose a general framework and a concrete instantiation: a fast, effective Greedy Join-Aware Retrieval algorithm that holistically balances relevance, coverage, and joinability. Experiments across 5 NL2SQL benchmarks show that our iterative method achieves retrieval performance competitive with the MIP-based approach while being 4x to 400x faster.
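To make the idea of an iterative, join-aware search concrete, here is a minimal illustrative sketch in Python. It is not the paper's algorithm: the additive scoring rule, the weights `w_cov` and `w_join`, the function name `greedy_join_aware_retrieval`, and the toy tables are all assumptions chosen for illustration. It only shows the general shape of a greedy loop that, at each step, adds the table with the best combination of relevance, newly covered query terms, and joinability with the tables already selected.

```python
from typing import Dict, List, Set, Tuple


def greedy_join_aware_retrieval(
    relevance: Dict[str, float],      # hypothetical per-table relevance scores
    covers: Dict[str, Set[str]],      # hypothetical query terms each table covers
    joinable: Set[Tuple[str, str]],   # hypothetical undirected joinable pairs
    k: int,
    w_cov: float = 1.0,               # assumed weight for coverage gain
    w_join: float = 1.0,              # assumed weight for joinability
) -> List[str]:
    """Select up to k tables, one per iteration, by a combined score."""
    selected: List[str] = []
    covered: Set[str] = set()

    def joins(a: str, b: str) -> bool:
        return (a, b) in joinable or (b, a) in joinable

    while len(selected) < min(k, len(relevance)):
        best, best_score = None, float("-inf")
        # Sorted iteration makes tie-breaking deterministic.
        for t in sorted(relevance):
            if t in selected:
                continue
            # Coverage gain: query terms this table adds beyond those covered.
            gain = len(covers.get(t, set()) - covered)
            # Joinability: bonus if the table joins with any selected table.
            join_bonus = 1.0 if any(joins(t, s) for s in selected) else 0.0
            score = relevance[t] + w_cov * gain + w_join * join_bonus
            if score > best_score:
                best, best_score = t, score
        selected.append(best)
        covered |= covers.get(best, set())
    return selected


# Toy example (made-up data, not from the paper's benchmarks).
relevance = {"orders": 0.9, "customers": 0.6, "products": 0.7, "logs": 0.65}
covers = {
    "orders": {"order_date"},
    "customers": {"customer_name"},
    "products": {"order_date"},
    "logs": set(),
}
joinable = {("orders", "customers"), ("orders", "products")}

picked = greedy_join_aware_retrieval(relevance, covers, joinable, k=2)
print(picked)
```

In this toy setting, ranking by relevance alone would return `orders` and `products`, whereas the greedy loop prefers `customers` as the second table because it covers a new query term and joins with `orders`. Each iteration scans the remaining candidates once, which is the kind of low per-step cost that makes an iterative search attractive compared with solving one global optimization problem.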
Citation: Boutaleb, A., Amann, B., Angarita, R., & Naacke, H. (2025). Exploring Multi-Table Retrieval Through Iterative Search. Proceedings of AI for Tabular Data Workshop @ EurIPS 2025.
