DDRO-Generative-Document-Retrieval
Collection
Step-3 DDRO-optimized checkpoints (final policy), plus the accompanying datasets and artifacts (docids, pseudo-queries, test sets) needed to reproduce the paper.
This collection contains four generative retrieval models trained using Direct Document Relevance Optimization (DDRO), a lightweight alternative to reinforcement learning for aligning docid generation with document-level relevance through pairwise ranking.
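The pairwise objective can be viewed as a DPO-style preference loss over docid sequences: the policy is pushed to assign a higher (reference-adjusted) likelihood to the relevant docid than to a sampled negative. The sketch below is a minimal illustration under that assumption; the function name, argument names, and the `beta` value are illustrative and not taken from the released code.

```python
import torch
import torch.nn.functional as F

def ddro_pairwise_loss(policy_logp_pos, policy_logp_neg,
                       ref_logp_pos, ref_logp_neg, beta=0.1):
    """Pairwise ranking loss over docid sequences (DPO-style sketch).

    Each argument is the summed token log-likelihood of the relevant (pos) or
    non-relevant (neg) docid for a query, under the trained policy or a frozen
    reference model. `beta` is an illustrative scaling factor.
    """
    pos_margin = policy_logp_pos - ref_logp_pos  # policy's gain on the relevant docid
    neg_margin = policy_logp_neg - ref_logp_neg  # policy's gain on the negative docid
    # Encourage the policy to rank the relevant docid above the negative one.
    return -F.logsigmoid(beta * (pos_margin - neg_margin)).mean()

# Example with dummy log-likelihoods for a batch of two query-docid pairs:
loss = ddro_pairwise_loss(torch.tensor([-3.2, -2.9]), torch.tensor([-4.1, -3.0]),
                          torch.tensor([-3.5, -3.1]), torch.tensor([-3.9, -3.2]))
```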
The models are trained on two benchmark datasets, MS MARCO (MS300K) and Natural Questions (NQ320K), each with two types of document identifiers (PQ and TU):
| Dataset | Docid Type | Model Name | MRR@10 | R@10 |
|---|---|---|---|---|
| MS MARCO (MS300K) | PQ | ddro-msmarco-pq | 45.76 | 73.02 |
| MS MARCO (MS300K) | TU | ddro-msmarco-tu | 50.07 | 74.01 |
| Natural Questions (NQ320K) | PQ | ddro-nq-pq | 55.51 | 67.31 |
| Natural Questions (NQ320K) | TU | ddro-nq-tu | 45.99 | 55.98 |
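A minimal loading and decoding sketch is shown below. The Hub repo id is illustrative (check the collection for the exact path), and the constrained beam search that restricts generation to valid docids is omitted here.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative repo id; the exact Hub path may differ from the model names above.
model_id = "ddro-msmarco-tu"

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")  # base tokenizer per this card
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

query = "what ingredients are in a margherita pizza"
inputs = tokenizer(query, return_tensors="pt")

# Beam search over docid strings; real inference would constrain beams to valid docids.
outputs = model.generate(**inputs, num_beams=10, num_return_sequences=10, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```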
If you use these models, please cite:
@inproceedings{anonymous2025ddro,
  title={Lightweight and Direct Document Relevance Optimization for Generative Information Retrieval},
  author={Anonymous},
  booktitle={Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '25)},
  year={2025},
}
Base model: google-t5/t5-base