This is our new dataset, updated daily in cooperation with Open Terms Archive. The engine scrapes the policy pages of currently 11 platforms for updates and automatically stores snapshots and new versions, based on our curation.
Platforms: ChatGPT, Claude.ai, DeepSeek, Google Generative AI Services, Grok, Le Chat, Llama API, Meta AI, Microsoft Copilot, Perplexity and Qwen Chat.
Time Frame: 2025 – …
Using the Data
We are more than happy if you want to use our dataset in your research, reporting, and explorations. If you do:
- consult the respective data documentation;
- reference this project and the actual dataset;
- send us a note so that we can include you on our research and output page.
GenGA is made available under the Open Data Commons Attribution License (in short, as stated above: use it, but reference us).
Documentation
This dataset is hosted on GitHub and continuously updated. It builds on the scraping engine of the Open Terms Archive and is curated by the PGA team. See the GitHub pages of GenGA and OTA for more details.