Platform Governance Archive

Dataset PGA v2 (Ongoing Collection)

This is our new dataset, updated daily in cooperation with Open Terms Archive. The engine currently scrapes the policy pages of 25 platforms for updates and automatically stores snapshots and new versions, based on our curation of platforms and policies.
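The idea of storing a new version only when a policy page has changed can be illustrated with a minimal sketch. This is not the Open Terms Archive engine: the class `PolicyHistory`, its `record` method, and the hash comparison are hypothetical stand-ins chosen for illustration only.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PolicyHistory:
    """Hypothetical in-memory history of one policy page (illustration only)."""

    # Each entry is a (date, text, sha256 digest) tuple, oldest first.
    versions: list = field(default_factory=list)

    def record(self, date: str, text: str) -> bool:
        """Store `text` as a new version only if it differs from the latest one.

        Returns True if a new version was stored, False if nothing changed.
        """
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self.versions and self.versions[-1][2] == digest:
            return False  # page unchanged since the last stored version
        self.versions.append((date, text, digest))
        return True


history = PolicyHistory()
history.record("2023-01-01", "Terms A")  # first scrape: stored
history.record("2023-01-02", "Terms A")  # unchanged: skipped
history.record("2023-01-03", "Terms B")  # changed: stored as a new version
```

Comparing digests rather than full texts keeps the check cheap. A real pipeline would likely also normalise the page before comparing, so that purely cosmetic changes do not register as new versions.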

Platforms: BeReal, Bluesky, Facebook, Instagram, LINE, LinkedIn, Moltbook, Parler, Pinterest, Quora, Reddit, Snapchat, Spotify, Telegram, Threads, TikTok, TruthSocial, Tumblr, Twitch, Twitter, Upscrolled, WeChat, WhatsApp, X, YouTube.

Time Frame: 2022 – …

Download Data

Explore Data

Using the Data

We are more than happy if you want to use our dataset in your research, reporting, and explorations. If you do:

consult the respective data documentation;
reference this project and the actual dataset;
send us a note so that we can include you on our research and output page.

PGA v1 is made available under the Open Data Commons Attribution License (in short, as stated above: use it, but reference us).

Cite the Project

Katzenbach, C., et al. (2023). The Platform Governance Archive. Centre for Media, Communication and Information Research (ZeMKI), University of Bremen. DOI: 10.17605/OSF.IO/XSBPT. URL: https://platformgovernancearchive.org.

Cite the Dataset

Katzenbach, C., Dergachava, D., Fischer, A., Kopps, A., Kolesnikov, S., Redeker, D., Viejo Otero, P. (2023). Platform Governance Archive (PGA) v2. [data set]. DOI: 10.17605/OSF.IO/XSBPT. URL: https://www.platformgovernancearchive.org/data/dataset-pga-v2-ongoing-collection/.

Cite a Single Document (recommended)

Name of platform. (Date of version). Name of policy. Platform Governance Archive. Direct URL.
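If you cite many documents, the pattern above can be filled in programmatically. The following is a minimal sketch: the function name `cite_document`, the ISO date format, and the example URL are assumptions made for illustration and are not prescribed by the archive.

```python
from datetime import date


def cite_document(platform: str, version_date: date, policy: str, url: str) -> str:
    """Build a single-document citation following the pattern above.

    The date is rendered as YYYY-MM-DD, which is an assumption;
    the archive does not specify a date format.
    """
    return (
        f"{platform}. ({version_date.isoformat()}). {policy}. "
        f"Platform Governance Archive. {url}"
    )


# Placeholder values for illustration; replace the URL with the document's direct URL.
print(cite_document(
    "Reddit",
    date(2023, 5, 1),
    "Content Policy",
    "https://example.org/reddit-content-policy",
))
```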

Documentation

This dataset is hosted on GitHub and continuously updated. It builds on the scraping engine of the Open Terms Archive and is curated by the PGA team. See the GitHub pages of PGA and OTA for more details.