Datasets

Finding datasets that are both feature-rich and openly accessible can be difficult. Below, we provide links to a few online resources that may contain interesting datasets for starting your journey in data journalism.

FiveThirtyEight

FiveThirtyEight publishes data behind many of its stories, on topics ranging from polling to sports and entertainment.

Kaggle

Kaggle hosts openly available datasets and machine learning competitions. It is very popular in the machine learning community, where practitioners use these datasets to try to build better predictive models.

Pew Research Center

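The Pew Research Center is a nonpartisan research organization based in the United States that conducts public opinion polling and demographic research, and it makes many of its survey datasets available.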

The Global Database of Events, Language, and Tone

The Global Database of Events, Language, and Tone (GDELT) monitors the world’s broadcast, print, and web news from nearly every corner of every country in over 100 languages. It identifies the people, locations, organizations, themes, sources, emotions, counts, quotes, images, and events driving global society, and it offers the result as a free, open platform for computing on this data. For a gentle introduction, as well as a tool to query GDELT, check out this paper.