The Data Services guide describes services available to assist users in finding and using data.
Contents to date include:
See Data by Subject for suggested resources categorized by academic discipline.
See Data Management for advice on handling your research data.
See the Rutgersdata blog for ongoing announcements of newly available datasets and other news.
Data Services helps you find and access data, primarily numeric. Data Services makes selected software available, purchases data sets, holds occasional workshops on data topics, and helps users with the databases and data archives to which Rutgers subscribes or belongs, such as ICPSR.
Data Services is not a statistical consulting service and can neither perform analysis for you, nor advise on correct analysis techniques. But Data Services will get you to the data!
Spring 2019 workshop master schedule (For workshops grouped by topic, click here)
Citation Management with Zotero ☞ RSVP for the DH workshops.
Zotero is a free application that collects, manages, and formats citations and bibliographies. In this introductory, hands-on workshop, we’ll learn how to organize sources, attach PDFs and notes, create tags for easy searching, and generate citations and bibliographies in Word. Bring your personal laptop, and download Zotero 5.0 for your OS and the Zotero Connector for your favorite browser beforehand.
Python Basics and Data Exploration ☞ RSVP for the Python workshops.
This workshop will be an accelerated introduction to fundamental concepts such as variable assignment, data types, basic calculations, working with strings and lists, control structures (e.g. for-loops), and functions. We will also start working with pandas, a popular data science library in Python, to explore a dataset on foodborne outbreaks reported to the CDC.
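For a sense of the concepts listed above, here is a minimal sketch in plain Python. It is not taken from the workshop materials; the variable names and food items are hypothetical and chosen only for illustration.

```python
# Variable assignment and basic data types (all values are made up)
outbreaks = 3            # int
rate = 2.5               # float
state = "New Jersey"     # str

# A basic calculation
estimated_cases = outbreaks * rate

# Working with strings and lists
abbreviation = state[:3].upper()          # slice the first three characters
foods = ["lettuce", "chicken", "eggs"]    # hypothetical list of items
foods.append("sprouts")                   # lists can grow in place

# A function that formats one item
def describe(item):
    return f"{item} ({len(item)} letters)"

# A for-loop that applies the function to every list element
labels = []
for food in foods:
    labels.append(describe(food))

print(estimated_cases)
print(abbreviation)
print(labels)
```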
Introduction to SAS (no need to RSVP)
This workshop provides an introduction to SAS, covering the basics of navigation, loading data, graphics, and elementary descriptive statistics and regression using a sample dataset.
SAS is a powerful and long-standing system that handles large data sets well, and is popular in the pharmaceutical industry and health sciences, among other applications.
Visualizing Demographic Data in Social Explorer ☞ RSVP for the DH workshops.
This workshop will introduce you to Social Explorer, an online mapping tool that allows you to explore and visualize demographic data. We will explore the tool's basic capabilities and make sample maps using data from the American Community Survey (ACS).
Data Manipulation and Analysis with Python ☞ RSVP for the Python workshops.
In this workshop, we will dive into the world of arrays and data frames using the NumPy and pandas libraries. We'll cover data cleaning and pre-processing, joining and merging, group operations, and more. If you work with tabular data, this workshop is for you!
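The operations named above might look roughly like the following pandas sketch. The two small tables are invented for illustration and are not the workshop's dataset; dropping incomplete rows is only one of several possible cleaning choices.

```python
import pandas as pd

# Hypothetical tables, invented for this sketch
outbreaks = pd.DataFrame({
    "state": ["NJ", "NJ", "NY", "CT", None],
    "illnesses": [12, None, 30, 7, 8],
})
regions = pd.DataFrame({
    "state": ["NJ", "NY", "CT"],
    "region": ["Mid-Atlantic", "Mid-Atlantic", "New England"],
})

# Cleaning: drop rows with any missing value, then store counts as integers
clean = outbreaks.dropna().copy()
clean["illnesses"] = clean["illnesses"].astype(int)

# Merging: attach each state's region (a left join keeps every cleaned row)
merged = clean.merge(regions, on="state", how="left")

# Group operation: total illnesses per region
totals = merged.groupby("region")["illnesses"].sum()
print(totals)
```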
Introduction to R (no need to RSVP)
This session provides a three-part orientation to the R programming environment, covering statistical techniques, graphics, and data manipulation.
Data Visualization and Machine Learning with Python ☞ RSVP for the Python workshops.
Interested in finding patterns and predicting unknown attribute values in your data? Join us for an overview of machine learning techniques implemented using the scikit-learn library. We'll also learn how to do data visualization with matplotlib, a popular plotting library in Python.
Data Visualization with R (no need to RSVP)
This workshop discusses principles for effective data visualization, and demonstrates techniques for implementing these using R. Some prior familiarity with R is assumed (packages, structure, syntax), but the presentation can be followed without this background.
Introduction to Quantitative Text Analysis ☞ RSVP for the DH workshops.
This hands-on workshop will introduce participants to the basics of quantitative textual analysis using the R programming language. Participants will each first select a text of their choice from Project Gutenberg (literary or otherwise), which we will then explore through the demonstration of a variety of approaches, including word frequency, distribution, and co-appearance. No coding experience required.
Data Scraping: Interaction with APIs with Python ☞ RSVP for the Python workshops.
This workshop shows how to use Python to interact with third-party APIs for data collection. Different types of APIs with real applications will be introduced, and examples such as the REST APIs for FRED and Quandl will be discussed. A project that retrieves data from the FRED API and merges it with historical data will be demonstrated in detail.
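As a rough illustration of the kind of request involved, the sketch below builds a query URL and parses a JSON reply using only the standard library. The endpoint, the parameter names, and the convention that missing values appear as "." are assumptions about the FRED API and should be checked against its documentation. The series ID, API key, and sample payload are placeholders, and no network call is made when the block runs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for series observations; verify against the FRED API docs
BASE_URL = "https://api.stlouisfed.org/fred/series/observations"

def build_url(series_id, api_key):
    """Build the request URL from query parameters (names are assumptions)."""
    params = {"series_id": series_id, "api_key": api_key, "file_type": "json"}
    return f"{BASE_URL}?{urlencode(params)}"

def fetch_observations(series_id, api_key):
    """Send the request and decode the JSON body (not called in this sketch)."""
    with urlopen(build_url(series_id, api_key)) as response:
        return json.load(response)

def parse_observations(payload):
    """Convert observations to (date, float) pairs, skipping non-numeric values."""
    rows = []
    for obs in payload.get("observations", []):
        try:
            value = float(obs["value"])
        except ValueError:
            continue  # assumed placeholder for a missing value, e.g. "."
        rows.append((obs["date"], value))
    return rows

# Hypothetical payload shaped like the assumed response
sample = {
    "observations": [
        {"date": "2018-01-01", "value": "4.1"},
        {"date": "2018-02-01", "value": "."},
    ]
}

print(build_url("UNRATE", "YOUR_KEY"))
print(parse_observations(sample))
```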
Data Mining: Regression and Classification with Python ☞ RSVP for the Python workshops.
Traditional methods such as least squares estimation and k-nearest neighbors (KNN) face severe overfitting when the dataset has high-dimensional features. Modern data mining techniques, such as the lasso for regression and support vector machines (SVM) for classification, give better estimation results in such situations. The workshop shows how the lasso and SVM work in Python, and compares the lasso with least squares estimation, and SVM with KNN, in the high-dimensional setting.
Introduction to Mapping ☞ RSVP for the DH workshops.
What kind of information should be mapped? Which tool is best for the job? If you’ve ever found yourself asking either of these questions – or any other about getting started with mapping – this workshop is for you. We’ll begin with a primer on how to identify what kind of data is best suited to a map and what data is necessary to make one. Then, we’ll explain how to get started with a few common mapping programs (StoryMap JS, Palladio, Tableau, Carto) and evaluate which uses each is best suited to.
Accessing and Exploring Twitter Data ☞ RSVP for the DH workshops.
This hands-on workshop will step participants through the process of collecting social media data from Twitter (by handle, hashtag, and/or search phrase) and some of the concerns involved. Participants will then be introduced to a few simple ways to begin analyzing tweet content and metadata, such as the number of likes and retweets.
Introduction to NVivo ☞ RSVP for the NVivo workshops.
This workshop is a basic introduction to NVivo, a software package that supports qualitative and mixed methods research. It focuses on key mechanisms of the software that may be applied as required by different analytical approaches.
Approaches to Web Scraping in R ☞ RSVP for the DH workshops.
This workshop explores web scraping techniques and strategies in the R programming environment.
Building a Neural Network from Scratch ☞ RSVP for the Python workshops.
This workshop is intended to help you establish a neural network mindset and hone your intuitions about deep learning. We start by building a logistic regression as the baseline model to recognize cats. Then we develop a neural network (NN) with a single hidden layer and extend it to a deep NN by adding as many hidden layers as you want. Hopefully, you will see an improvement in accuracy relative to the logistic regression baseline.
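The baseline step described above can be sketched as a logistic regression trained by gradient descent with NumPy. This is not the workshop's code: the four-point dataset stands in for image features and is invented for illustration, and the learning rate and step count are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, learning_rate=0.5, steps=200):
    """Fit weights and bias by gradient descent on the mean cross-entropy loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)          # predicted probabilities
        error = p - y                   # gradient of the loss w.r.t. the logits
        w -= learning_rate * (X.T @ error) / n_samples
        b -= learning_rate * error.mean()
    return w, b

def predict(X, w, b):
    """Label a sample 1 when its predicted probability is at least 0.5."""
    return (sigmoid(X @ w + b) >= 0.5).astype(int)

# Hypothetical, linearly separable toy data (two features per sample)
X = np.array([[-2.0, -1.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 1.0]])
y = np.array([0, 0, 1, 1])

w, b = train_logistic_regression(X, y)
accuracy = (predict(X, w, b) == y).mean()
print(accuracy)
```

A hidden layer would insert one more matrix multiplication and nonlinearity between the inputs and this output unit, with the gradients passed backward through it.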
Advanced NVivo: Data Visualizations Using NVivo 12 ☞ RSVP for the NVivo workshops.
This workshop focuses on the introduction of a suite of visualizations that help you gain deeper insights from your data by exploring and unearthing patterns, trends and connections.