1. CriteoPrivateAd: A Real-World Bidding Dataset to Design Private Advertising Systems
- Author
- Sebbar, Mehdi; Odic, Corentin; Léchine, Mathieu; Bissuel, Aloïs; Chrysanthos, Nicolas; D'Amato, Anthony; Gilotte, Alexandre; Höring, Fabian; Nogueira, Sarah; and Vono, Maxime
- Subjects
Computer Science - Cryptography and Security, Statistics - Computation
- Abstract
In recent years, many proposals have emerged to address online advertising use cases without access to third-party cookies. All of these proposals leverage privacy-enhancing technologies such as aggregation or differential privacy. Yet no public ground truth rich enough to assess the relevance of these private advertising frameworks is currently available. We are releasing the largest bidding dataset, in terms of number of features, built specifically in alignment with the design of major browser vendors' proposals such as Chrome Privacy Sandbox. This dataset, named CriteoPrivateAd, is an anonymised version of Criteo production logs and provides sufficient data to learn bidding models commonly used in online advertising under many privacy constraints (delayed reports, display-level and user-level differential privacy, user signal quantisation, or aggregated reports). We ensured that this dataset, while anonymised, yields offline results close to the production performance of adtech companies, including Criteo, making it a relevant ground truth for designing private advertising systems. The dataset is available on Hugging Face: https://huggingface.co/datasets/criteo/CriteoPrivateAd.
Comment: 11 pages
- Published
- 2025