To download zipped files from GitHub repositories, click on the green "Clone or download" button on the upper right section of the repository page. Use Jupyter Notebook to open the .ipynb files in an interactive environment.
Data is all around us - in every industry and academic field, behind every online purchase recommendation and driving route calculation. Sometimes we have more data than we know what to do with. If solving data problems intrigues you (or if you just need some data for a class project...), check out the links below.
Spring 2021 workshop information is available at:
NBL Workshop Calendar - https://libcal.rutgers.edu/nblworkshops
This integrated calendar contains information on all open workshops offered by the New Brunswick Libraries. Topics include Python, R, Digital Humanities, GIS, NVivo, Amarel/HPC, and more tools to help you with your research in the Science Workshop series.
Spring 2021 Python workshop materials are available at
https://github.com/NBLGraduateSpecialistProgram/Python2021_Palmere
Python Basics and Data Exploration
This workshop will be an introduction to fundamental concepts such as variable assignment, data types, basic calculations, working with strings and lists, control structures (e.g. for-loops), and functions.
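As a rough illustration of the concepts listed above (not taken from the workshop materials; all names and values are made up for this sketch), a few lines of Python can touch on each of them:

```python
# Variable assignment and basic data types (values are illustrative only)
workshop = "Python Basics"      # str
attendees = 24                  # int
hours = 1.5                     # float

# Basic calculation
minutes = hours * 60

# Working with strings and lists
words = workshop.lower().split()
scores = [88, 92, 79]
scores.append(95)


# A function that uses a for-loop (a control structure)
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0
    for value in values:
        total += value
    return total / len(values)


mean_score = average(scores)
print(f"{workshop}: {attendees} attendees, {minutes:.0f} minutes, mean score {mean_score}")
```

The built-in `sum(values) / len(values)` would do the same job as the loop; the loop is written out here only to show the control structure.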
Data Manipulation and Analysis with Python
In this workshop, we will dive into the world of arrays and data frames using the NumPy and pandas libraries. We'll cover data cleaning and pre-processing, joining and merging, group operations, and more.
This workshop will give an introduction to data visualization with matplotlib and seaborn, two popular plotting libraries in Python.
Exercise and Practice in Python
This workshop will go over some exercises and practice questions using Python for beginners. If you’re starting out with Python, this workshop is a good way to test your knowledge and learn how to make some small programs.
Introduction to Working with Classes in Python
This workshop will focus on the properties and implementation of classes in Python. Inheritance, polymorphism, encapsulation, special methods, decorators, and basic applications and examples will be discussed.
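The class features named above can be sketched in one small example. This is an illustrative sketch only, not the workshop's own code; the `Shape`, `Rectangle`, and `Square` classes are hypothetical.

```python
class Shape:
    """Base class; subclasses are expected to override area()."""

    def __init__(self, name):
        # Leading underscore marks the attribute as internal by convention
        # (Python's lightweight form of encapsulation).
        self._name = name

    @property                      # decorator exposing a read-only attribute
    def name(self):
        return self._name

    def area(self):
        raise NotImplementedError

    def __repr__(self):            # special method used by repr() and print()
        return f"{self.name}(area={self.area():.2f})"


class Rectangle(Shape):            # inheritance from Shape
    def __init__(self, width, height):
        super().__init__("Rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):           # inherits area() from Rectangle
    def __init__(self, side):
        super().__init__(side, side)
        self._name = "Square"


shapes = [Rectangle(2, 3), Square(4)]
# Polymorphism: the same method call works on every subclass instance.
areas = [shape.area() for shape in shapes]
print(shapes, areas)
```

Because `name` is defined with `@property` and has no setter, assigning to `shape.name` raises an `AttributeError`, which is one simple way to protect internal state.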
C++ in Python
This workshop will focus on integrating C++ and Python. This includes writing Python modules in C++ and speeding up Python code with Cython.
Data Mining in the Protein Data Bank
During this session we will combine the material we covered in previous sessions to find patterns in the Protein Data Bank (PDB). Because the PDB holds well over 100,000 protein structure entries, we will pay particular attention to the performance of our code.
Basic Applications Development in Python
We will focus on small application development during this session, with the goal of developing a functional application for simple graphical analyses of our data. We will discuss how to set up our Python project directory with a virtual environment, as well as other broader aspects of software engineering with Python.
Introduction to Machine Learning with Python
This session will introduce the practical aspects of machine learning using the Keras package of Python. We will discuss deep learning models including convolutional neural networks, restricted Boltzmann machines and recurrent neural networks.
Three popular options for installing Python on your computer:
[1] S. Byrnes, "Python for scientific computing: Where to start," Steve Byrnes's Homepage, Oct. 2017. [Online]. Available: http://sjbyrnes.com/python/. [Accessed 27 Apr. 2018].
Since Python is open source, there are abundant online resources to help learners find their way around the language. If you have a specific programming task you need help to achieve, a Google search is often the best way to start. Here is a list of resources you may find helpful if you're interested in a particular topic!
General Python Learning
Visualizing Code Execution
Specific Topics in Python
NumPy and Pandas (Data Manipulation & Analysis)
Data Visualization
Machine Learning
This guide was originally created by Miranda So as part of the inaugural cohort of the Graduate Specialist Program. To follow Miranda's work, see her GitHub page.
Hang Miao served as Quantitative Data Graduate Specialist for the 2018-2019 academic year, and updated and expanded the workshop content. To follow Hang's work, see his GitHub page.
Further workshop content, including topics on statistical inference, machine learning, and HPC with Amarel, was added by Sanket Badhe and Ziqiu (Sly) Zhong, Quantitative Data Graduate Specialists from Fall 2019 to Fall 2020.
There are two separate series of Python workshops listed here, with different instructors and different content. Sly Zhong's series is more geared to beginners with the language (labeled "Beginners"), while Sanket Badhe's series will move at a faster pace (labeled "Accelerated").
YouTube Playlist for all of Ziqiu (Sly) Zhong's Python workshop series.
Python Basics and Data Exploration (Accelerated 1)
This workshop will be an accelerated introduction to fundamental concepts such as variable assignment, data types, basic calculations, working with strings and lists, control structures (e.g. for-loops), and functions.
Python Basics and Data Exploration (Beginners 1)
This workshop will be a more deliberate introduction to fundamental concepts such as variable assignment, data types, basic calculations, working with strings and lists, control structures (e.g. for-loops), and functions.
Data Manipulation and Analysis with Python (Accelerated 2)
In this workshop, we will dive into the world of arrays and data frames using the NumPy and pandas libraries. We'll cover data cleaning and pre-processing, joining and merging, group operations, and more. If you work with tabular data, this workshop is for you!
Data Manipulation and Analysis with Python (Beginners 2)
In this workshop, we will dive into the world of arrays and data frames using the NumPy and pandas libraries. We'll cover data cleaning and pre-processing, joining and merging, group operations, and more.
This workshop will give an introduction to data visualization with matplotlib and seaborn, two popular plotting libraries in Python.
Data Visualization (Beginners 3)
This workshop will continue with the NumPy and pandas libraries. Data visualization with matplotlib, a popular plotting library in Python, will also be covered, including how to turn data into line, bar, and scatter plots. Environmental science and economics data will be used as examples.
Cryptocurrency API, Visualization, and Comparison project
Statistical Hypothesis Tests - Basic Concepts and Implementation
This workshop covers the basic concepts behind the most commonly used statistical tests, including null hypothesis testing, critical values, and p-values, along with the Z-test, t-test, and chi-square test. We will also work through examples of how to implement these tests on a given dataset.
Intro to Tableau 1
The workshop will introduce the basics of using Tableau for data visualization. It will also cover design principles for presenting quantitative and qualitative data, and methods for displaying data meaningfully.
Statistical Inference with Python
Recording of Session (Instructor, Sanket Badhe)
In this workshop, we will explore basic principles behind using data for estimation and for assessing theories. The workshop will focus on inference procedures, constructing confidence intervals, and hypothesis testing.
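To make the idea of a confidence interval concrete, here is a minimal sketch using only the Python standard library. It is not the workshop's code: the sample values are invented, and it assumes a normal approximation for the sampling distribution of the mean (for small samples a t-distribution is usually more appropriate).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(data, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(data)
    sample_mean = mean(data)
    standard_error = stdev(data) / sqrt(n)       # sample std dev / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Made-up measurements, for illustration only
sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0, 5.1, 4.9]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Raising `confidence` widens the interval, while collecting more data narrows it through the `sqrt(n)` term in the standard error.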
Supervised Learning - Regression
Instructor, Sanket Badhe
In this workshop, we will give an introduction to machine learning, covering both supervised and unsupervised learning. Next, we will discuss different methods for splitting data into training and test sets. Finally, we will deepen our understanding of regression, specifically simple linear regression and multiple linear regression.
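The two ideas in this description, a train/test split and simple linear regression, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the workshop's approach: the helper names `train_test_split` and `fit_line` are hypothetical, the data is synthetic (exactly y = 2x + 1), and in practice a library such as scikit-learn would normally be used.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)            # seeded for reproducibility
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]          # (train, test)


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, noise-free data so the fit is easy to check
data = [(x, 2 * x + 1) for x in range(12)]
train, test = train_test_split(data)
slope, intercept = fit_line(train)

# Evaluate on the held-out rows with mean squared error
mse = sum((y - (slope * x + intercept)) ** 2 for x, y in test) / len(test)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, test MSE={mse:.2e}")
```

Evaluating on rows the model never saw during fitting is the point of the split: it estimates how well the fitted line generalizes rather than how well it memorized the training data.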
Supervised Learning - Classification 1
Recording of Session (Instructor, Sanket Badhe)
Instructor, Sanket Badhe
This workshop will first give an introduction to classification problems and then discuss classification algorithms such as k-nearest neighbors and logistic regression. The latter half of the workshop will focus on classification metrics such as the confusion matrix, accuracy, precision, and recall.
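The classification metrics mentioned above all derive from the four cells of a binary confusion matrix. The sketch below computes them by hand on invented labels; it is not the workshop's code, and the function name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1                      # true positive
        elif pred == positive:
            fp += 1                      # false positive
        elif truth == positive:
            fn += 1                      # false negative
        else:
            tn += 1                      # true negative
    return tp, fp, fn, tn


# Made-up labels, for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # share of predicted positives that are correct
recall = tp / (tp + fn)                      # share of actual positives that were found
print(f"accuracy={accuracy}, precision={precision}, recall={recall}")
```

Precision and recall can diverge from accuracy, as they do here, which is why accuracy alone can be misleading when the classes are imbalanced.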
Exercise and Practice in Python (Beginner 4)
This workshop will go over some exercises and practice questions using Python for beginners. If you’re starting out with Python, this workshop is a good way to test your knowledge and learn how to make some small programs.
Intro to Tableau 2
More Tableau functions and data visualization options will be covered in this workshop.
Data Science with Python, part 2
This workshop focuses on advanced supervised learning methods for both classification and regression (decision trees, random forests, support vector machines, ensemble learning, and neural networks). We will apply all of these techniques to a dataset and compare the results of each.
Neural Networks
This workshop describes neural network techniques for data analysis.
Interaction with API in Economics
An API, or application programming interface, is a common tool for interacting with data on the web. This workshop will present how APIs are used in the finance (equity and cryptocurrency) and economics (FRED) industries.
Statistical Hypothesis Tests in Python/SAS/R
This workshop will introduce how to run the most commonly used statistical tests in different programming languages, including Python and R, and will compare the languages.
Spark Introduction
This workshop will introduce you to PySpark and its features and components.
© Rutgers, The State University of New Jersey