Data Science Basics: Getting Started with Exploratory Data Analysis (EDA) in Excel (Beginner)
Thursday | October 08, 2020 | 3:00pm ET (online) --- Workshop Recording
NOTE: The dataset referenced in this recording has since been updated on data.gov. Please go here to get a copy of the dataset I used for the demonstration.
This workshop will be a beginner-level introduction to understanding and summarizing data in Excel. We will look at descriptive statistics and create visualizations to explore cancer rate data (or another freely available alternative), and we will discuss strategies for handling missing data.
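For readers who want to see the same ideas outside Excel, here is a minimal sketch in Python. It is not part of the workshop materials. The numbers are made up for illustration, and dropping missing values is only one of several possible strategies for handling them.

```python
import statistics

# Hypothetical rates per 100,000 people; None marks a missing value.
rates = [451.2, 478.9, None, 432.5, 465.0, None, 490.3]

# Strategy shown here: drop the missing values before summarizing.
observed = [r for r in rates if r is not None]

summary = {
    "count": len(observed),
    "missing": len(rates) - len(observed),
    "mean": statistics.mean(observed),      # like Excel's AVERAGE
    "median": statistics.median(observed),  # like Excel's MEDIAN
    "stdev": statistics.stdev(observed),    # sample standard deviation, like STDEV.S
    "min": min(observed),
    "max": max(observed),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Reporting the number of missing values next to the summary makes it clear how much data the statistics are actually based on.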
Data Science Basics: Introduction to Statistical Analysis in Excel (Beginner)
Thursday | October 15, 2020 | 3:00pm ET (online) --- Workshop Recording
This workshop will be a beginner-level introduction to statistical analyses in Excel. We will use video game sales data (or another freely available alternative) to introduce statistical analyses that compare average values, including the z-test, the t-test, and, if time permits, ANOVA.
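As a rough illustration of what a t-test computes when comparing two averages, the sketch below calculates the test statistic and degrees of freedom for the unequal-variance (Welch) version of the test in Python. It is not taken from the workshop. The function name `welch_t` and the sales figures are hypothetical, and turning the statistic into a p-value still requires a t-distribution, which Excel provides.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variance divided by sample size: the squared standard error of each mean.
    sq_se_a = statistics.variance(sample_a) / len(sample_a)
    sq_se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(sq_se_a + sq_se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (sq_se_a + sq_se_b) ** 2 / (
        sq_se_a ** 2 / (len(sample_a) - 1) + sq_se_b ** 2 / (len(sample_b) - 1)
    )
    return t, df

# Hypothetical sales (millions of units) for two groups of games.
group_a = [1.2, 0.8, 2.5, 1.9, 1.1, 3.0]
group_b = [0.6, 1.0, 0.9, 1.4, 0.7, 1.2]

t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A t statistic far from zero suggests that the difference between the two group averages is large relative to the variation within the groups.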
Data Science Basics: Introduction to Predictive Analytics in Excel (Beginner)
Thursday | October 22, 2020 | 3:00pm ET (online) --- Workshop Recording
This workshop is a beginner-level introduction to regression analysis in Excel. Regression analysis uses data to estimate the effect of different variables on an outcome and to make predictions. We will perform a regression analysis using heart rate failure data (or another freely available alternative) and walk you through the different parts of the model and analysis (e.g., intercept, coefficients, p-values, and predictions).
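To make the terms "intercept", "coefficient", and "prediction" concrete, here is a minimal sketch of a one-variable least-squares regression in Python. It is an illustration under stated assumptions, not the workshop's method. The function name `fit_line` and the data are hypothetical, and p-values are left out because they require additional calculations.

```python
import statistics

def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical predictor (x) and outcome (y) values.
x_values = [40, 45, 50, 55, 60, 65, 70]
y_values = [1.0, 1.1, 1.3, 1.2, 1.5, 1.6, 1.8]

intercept, slope = fit_line(x_values, y_values)

# Use the fitted model to predict the outcome for a new x value.
new_x = 58
prediction = intercept + slope * new_x
print(f"intercept = {intercept:.3f}, slope = {slope:.4f}")
print(f"predicted y at x = {new_x}: {prediction:.3f}")
```

The slope is the estimated change in the outcome for a one-unit increase in the predictor, and the intercept is the predicted outcome when the predictor equals zero.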
Don't have Microsoft Excel?
Rutgers provides free access to various Microsoft Office products for all current students, including Excel!
Instructions on accessing this service can be found at: https://it.rutgers.edu/microsoft-office/microsoft-office-for-students/#
Note: We will use the Data Analysis tool in Excel, which is provided by the Analysis ToolPak add-in. See instructions on loading this add-in at: https://support.microsoft.com/en-us/office/load-the-analysis-toolpak-in-excel-6a63e598-cd6d-42e3-9317-6b40ba1a66b4
Slides from each workshop will be posted here.
Some cheat sheets to assist you in using Excel are included below:
Kaggle: Kaggle is a dataset repository, data science competition host, tutorial provider, and more. It has an active community that discusses and contributes solutions to various data science problems.
DATA.GOV: A home for U.S. government open data. Topics include climate, education, finance, and many more.
UCI Machine Learning Repository: The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. This project is in collaboration with Rexa.info at the University of Massachusetts Amherst and receives funding support from the National Science Foundation.