
Data Topics

This guide provides links to the "Data Topics" series of workshops by Ryan Womack, Data Librarian.

R workshop schedule

Fall 2024 workshop information is now available at:

NBL Workshop Calendar - https://libcal.rutgers.edu/calendar/nblworkshops

This integrated calendar contains information on all open workshops offered by the New Brunswick Libraries. Topics include R, Python, Machine Learning, Digital Humanities, GIS, NVivo, and the Data Science Basics Workshop Series.

The calendar includes upcoming R workshops.

 

For ongoing updates about data-related workshops and information from the New Brunswick Libraries, subscribe to the nbl_data listserv.

This is a low-traffic list, but it will provide more frequent updates about data events than the typical beginning-of-semester announcement.

Data Topics

Since Fall 2023, the Data Topics series has been slowly phased in to supplement and ultimately supplant the "tidyverse approach" series.

Current offerings in this series are:

  • Data Analysis 2 (statistical tests, regression, sampling, bootstrap with R + comparison with Python)
  • Data Visualization 2 (interactive visualizations with Shiny, including hosting a Shiny server)
  • Data Publication 2 (publishing to data repositories and creating R packages)

"tidyverse approach" workshops

Materials are available at https://github.com/ryandata/tidyverse_approach for the following workshops:

  • R for data analysis: a tidyverse approach
  • R graphics with ggplot2
  • R data wrangling with dplyr, tidyr, readr and more
  • R for interactivity: an introduction to Shiny [last offered Spring 2024]
  • R for reproducible scientific documents: knitr, rmarkdown, and beyond

See the full playlist for videos of the R "tidyverse approach" workshop series.

R for data analysis: a tidyverse approach 

This session introduces the R statistical software environment, basic methods of data analysis, and the "tidyverse". While R is much more than the "tidyverse", the "tidyverse" set of packages, whose development is led by RStudio, provides a powerful and connected toolkit for getting started with R. Note that graphics and data manipulation are covered in subsequent sessions.

R graphics with ggplot2 

The ggplot2 package from the tidyverse provides extensive and flexible graphical capabilities within a consistent framework.  This session introduces the main features of ggplot2. Some prior familiarity with R is assumed (packages, structure, syntax), but the presentation can be followed without this background.  
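As a rough illustration of the "consistent framework" described above, here is a minimal sketch that builds a plot layer by layer. It is not taken from the workshop materials; the choice of the built-in mtcars data set and of these particular layers is an assumption made for the example.

```r
library(ggplot2)

# mtcars ships with R, so no data needs to be downloaded.
# Map variables to aesthetics once, then add layers with `+`.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +                                     # layer 1: raw points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # layer 2: a linear fit per group
  labs(
    title  = "Fuel economy versus weight",
    x      = "Weight (1000 lbs)",
    y      = "Miles per gallon",
    colour = "Cylinders"
  )

# Printing the object draws the plot on the active graphics device.
print(p)
```

Swapping geom_point() for another geom, or adding a facet_wrap() call, changes the display without changing the data or the aesthetic mapping.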

R data wrangling with dplyr, tidyr, readr and more 

Some of the most powerful features of the tidyverse relate to its abilities to import, filter, and otherwise manipulate data. This session reviews major packages within the tidyverse that relate to the essential data handling steps required before (and during) data analysis.
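The import, filter, and manipulate steps mentioned above might look like the following sketch. The small scores table and its column names are hypothetical, invented only for this example, and the pipeline is one possible arrangement rather than the workshop's own script.

```r
library(readr)
library(dplyr)
library(tidyr)

# Hypothetical data, written to a temporary file so that read_csv() has a file to import.
path <- tempfile(fileext = ".csv")
writeLines(c("student,test1,test2",
             "Ana,90,85",
             "Ben,70,NA",
             "Cy,60,75"), path)

# Import: readr guesses column types and treats "NA" as missing.
scores <- read_csv(path)

# Reshape to one row per student and test, then drop missing scores.
long <- scores %>%
  pivot_longer(c(test1, test2), names_to = "test", values_to = "score") %>%
  filter(!is.na(score))

# Summarise per student and sort from highest to lowest mean.
summary_tbl <- long %>%
  group_by(student) %>%
  summarise(mean_score = mean(score), n_tests = n()) %>%
  arrange(desc(mean_score))

print(summary_tbl)
```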

R for interactivity: an introduction to Shiny  

[last offered Spring 2024]

Shiny is an R package that enables the creation of interactive websites for data visualization. This session provides a brief overview of the Shiny framework and shows how to edit and publish Shiny sites in RStudio (with shinyapps.io). Familiarity with R/RStudio is assumed.
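To give a sense of the framework, here is a minimal sketch of a Shiny app with one input and one output. The input and output ids ("bins", "hist") and the use of the built-in faithful data set are assumptions chosen for the example, not part of the workshop materials.

```r
library(shiny)

# User interface: one slider and one plot placeholder.
ui <- fluidPage(
  titlePanel("Old Faithful waiting times"),
  sliderInput("bins", "Approximate number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server logic: the plot is redrawn whenever the slider value changes.
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$waiting,
         breaks = input$bins,
         main   = NULL,
         xlab   = "Waiting time between eruptions (minutes)")
  })
}

app <- shinyApp(ui = ui, server = server)

# Start the app only when R is running interactively (for example, in RStudio).
if (interactive()) {
  runApp(app)
}
```

Publishing an app like this to shinyapps.io is typically done from RStudio or with the rsconnect package, and requires an account on that service.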

R for reproducible scientific documents: knitr, rmarkdown, and beyond 

The RStudio environment enables the easy creation of documents in various formats (HTML, DOC, PDF) using R Markdown, while knitr allows the incorporation of executable R code to produce the tables and figures in those documents. This session introduces these concepts and other packages and practices supporting reproducibility with the R environment.

Also see the Digital Humanities workshops series for several workshops involving R.

R Videos

Screencasts of older versions of the workshops are linked below.

Special Topics

R scripts for Special Topics workshops

About R

R is open source software for statistical analysis. Being open source (GNU GPL licensed) doesn't just mean that the software is free. It means that you can use it for a variety of applications and install it virtually anywhere you'd like, without any restrictions. Open source also means that the code for all statistical procedures and analysis can be independently checked and verified. The active community of R users is constantly developing new add-on packages that use the latest techniques, which you are free to do as well. And because R is free, you can always have access to the latest version of the software, no matter where you are.

R is also a programming language, which makes it easy to document, reuse and reproduce all the steps of your statistical analysis. 
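As a small illustration of that point, the sketch below wraps one analysis step in a function so that it is documented and can be rerun on any suitable data. The function name fit_mpg_by_weight is hypothetical, and the built-in mtcars data set stands in for real data.

```r
# A reusable, documented analysis step: regress fuel economy on weight.
# Expects a data frame with numeric columns `mpg` and `wt`.
fit_mpg_by_weight <- function(data) {
  fit <- lm(mpg ~ wt, data = data)
  round(coef(fit), 2)   # intercept and slope, rounded for reporting
}

# Rerunning the script repeats the analysis exactly.
coefs <- fit_mpg_by_weight(mtcars)
print(coefs)

# The same step can be reused on a subset without rewriting any code.
coefs_4cyl <- fit_mpg_by_weight(subset(mtcars, cyl == 4))
print(coefs_4cyl)
```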

You can get R, and full documentation on R, at www.r-project.org or by downloading from any CRAN mirror (Comprehensive R Archive Network).

R Learning Links

Guides and Tutorials
Searching for R on the Internet
More Information
  • RDocumentation – searchable site for packages across CRAN, Bioconductor, and GitHub
  • Task Views – excellent outlines of the packages that are relevant to different disciplines
  • R Inferno – common problems with R
  • R-bloggers – combining posts from several R blogs
 
Enjoy R!

The R Help System

Help Commands within R
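 • help.start() - launches the interactive help system
 • help(function) or ?function - opens the manual page describing a function
 • example(function) - runs the examples from a function's manual page
 • library(help=packagename) - lists the help available for a whole package
 • apropos("text") - searches the names of objects on the search path; help.search("text") - searches the help pages of installed packages (with fuzzy matching)
 • vignette("mypackage") - opens a vignette (a long-form guide), where a package provides one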
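The commands listed above can be tried in a short session like the sketch below. The topics used (median, the stats and grid packages) are arbitrary choices for illustration; help.start() is left out because it opens a browser window.

```r
# Open the manual page for a function (equivalent to ?median).
help("median")

# Run the examples from that manual page.
example("median")

# List the contents of a whole package.
library(help = "stats")

# apropos() matches a pattern against the names of objects on the search path.
read_functions <- apropos("^read\\.")
print(read_functions)

# help.search() (shorthand: ??) searches the help pages of installed packages.
help.search("median")

# List the vignettes that one package provides.
vignette(package = "grid")
```

In RStudio, the manual pages appear in the Help pane; in a terminal session they are shown as plain text.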

R Tips and Tricks

These are some miscellaneous useful and interesting links that may help you accomplish some specific tasks in R.