Rutgers University Libraries

Digital Humanities: Quantitative Analysis: Humanities Datasets

A resource guide for learning about and starting projects in the Digital Humanities

Humanities Data Sets

MONK Project: University of Illinois at Urbana-Champaign

The MONK Project makes two datasets available: one for public use, and one for use by member schools of the Big Ten Academic Alliance.

The public data set offers data for 525 works of American literature from the 18th and 19th centuries, in addition to some works of Shakespeare, all submitted by a consortium of educational institutions.

The enhanced dataset for Big Ten institutions includes the data from the public data set as well as roughly one thousand works of British literature spanning the 16th-19th centuries.

Unvetted THATCamp ACRL Datasets

The following resources, also crowdsourced at ACRL, have not been examined or vetted. Links may be non-functional, or they may point to dead ends. They are offered here as a supplement for users to explore further.

DBpedia for info about Wikipedia

Marvel Universe Social Graph

Social Media Analysis

                  Keyhole

                  Tweetreach

                  Tweet Binder

NYPL Labs

- links to a number of DH-relevant datasets

Data Hub

Open Bibliographic Datasets

Arts Datasets

LastFM Data

Creative Vitality Index

Cross-Disciplinary Datasets

Caselaw Access Project (CAP)

The Caselaw Access Project makes 360 years of case law freely available online, digitized from the collections of the Harvard Law School Library.

Options for data access include:

CAP API
CAP Case Browser
Search
Bulk Data Service
Historical Trends (visualize how words are used in U.S. case law over time)
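
For readers curious about what a word-trend view involves, the following is a minimal sketch of the general idea, not the CAP's own method or API. It assumes you already have case texts paired with decision years; the function name `yearly_frequency` and the sample records are hypothetical and made up for illustration.

```python
import re
from collections import Counter


def yearly_frequency(records, term):
    """Return {year: occurrences of `term` per 1,000 words}.

    `records` is an iterable of (year, text) pairs. This input shape is an
    assumption for illustration, not a format defined by the CAP.
    """
    hits = Counter()    # occurrences of the term, per year
    totals = Counter()  # total word count, per year
    term = term.lower()
    for year, text in records:
        words = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(words)
        hits[year] += words.count(term)
    return {
        year: 1000 * hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year]
    }


# Made-up sample records, not real case law.
sample = [
    (1900, "The court finds the contract void."),
    (1900, "The contract was signed."),
    (1950, "The statute governs this appeal."),
]
trend = yearly_frequency(sample, "contract")
print(trend)
```

Normalizing by the total number of words per year keeps years with more digitized text from dominating the trend.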

If you're not sure how to use the data here, the CAP has a great gallery of sample projects from which to draw ideas.

Humanities Datasets: THATCamp ACRL 2013

The set of resources featured here was crowdsourced during THATCamp ACRL 2013. Contributors include Amanda Rust, Keith Stranger, Steve Stone, and other participants of THATCamp ACRL 2013. The list below represents updates and edits by Krista White.

Cultural Data Project

Users must submit an application to CDP in order to receive datasets. Datasets must be destroyed three months after their use. For more information, see the CDP FAQ page.

 

Association of Religion Data Archives (ARDA)

Surveys, polls and other data about religion from around the world. All data are submitted by researchers. The data are heavily weighted toward Christianity in the U.S., with some international data.

 

CPANDA Arts and Cultural Organizations Portal

This site provides links to data about arts and cultural institutions from a variety of sources, including the U.S. Economic Census, the Unified Database of Arts Organizations, the National Assembly of State Arts Agencies and more.

 

Amazon AWS Book Data for Use in Google Ngrams

This dataset has a fairly steep learning curve, but it's worth it if you are up to the challenge. You'll want to start by reading this article on Amazon Elastic MapReduce and the getting started tutorial. Then you can dive into the data!

 

20th Century Bestseller Database

This database was compiled by students at the Graduate School of Library and Information Science at the University of Illinois Urbana-Champaign. To learn more about the project, visit the course page. The tables compiled here are plain HTML.
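
Because the tables are HTML rather than a downloadable data file, you may need to extract the rows yourself before analyzing them. Here is a minimal sketch using only Python's standard library. The class name `TableRows` and the sample markup are hypothetical; the actual pages may be structured differently, so treat this as a starting point rather than a tested scraper for this site.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample markup, not copied from the database.
sample_html = """
<table>
  <tr><th>Year</th><th>Title</th></tr>
  <tr><td>1901</td><td>Example Novel</td></tr>
</table>
"""
parser = TableRows()
parser.feed(sample_html)
rows = parser.rows
print(rows)
```

Once the rows are in a list, they can be written out with Python's `csv` module for use in a spreadsheet or other tools.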

 

Dr. Who Villains and Monsters since 1963

Requires the creation of a Tableau Public account to work with the file in its proprietary format. May or may not be exportable to .CSV or other formats.

 

Theatrical Lighting Database

The New York Public Library does it again with a collection of plots, focus charts, cue sheets and more from four landmark Broadway productions digitized from their collections.

 

Old Time Radio Network

More than 12,000 old time radio shows available for listening. Metadata include show name, episode name, air date and episode length.

 

Open GLAM Datasets

Datasets from GLAM (Galleries, Libraries, Archives and Museums) institutions that are open for (re-)use.

 

AusStage

AusStage, led by the Drama Department at Flinders University, works with researchers to map patterns of live performance. Its aims are to explore networks of artistic collaboration in the creative and performing arts and to expose opportunities for audience research and development.

Unesco Stats

The United Nations Educational, Scientific and Cultural Organization (UNESCO) has a series of datasets available on a variety of topics in the humanities. Users may use the predefined tables prepared by UNESCO or create customized tables to explore the data in more depth.

Digital Humanities Librarian

Krista White
Contact:
Digital Scholarship & Pedagogies Librarian

John Cotton Dana Library, Rutgers-Newark
