Data Analytics
This course is an introduction to statistical learning using R. Students are exposed to a collection of statistical models of varying complexity, starting with relatively simple ones. The emphasis is on the hands-on application of machine learning methods to various datasets rather than on a theoretical treatment. This is a standard one-semester course with a lab component. Previous experience with a statistical programming language is recommended but not required. Likewise, previous knowledge of college-level calculus, linear algebra, and statistics is recommended but not required. The core structure of the course is as follows:
- An Introduction to R
  - Data Types
  - Data Structures
  - Functions, Packages
  - Control Structures, Debugging
  - Plotting
- Regression
  - k-Nearest Neighbors Regression
  - Regression Trees
  - Gradient Descent
  - Linear Regression
- Classification
  - k-Nearest Neighbors Classification
  - Classification Trees
  - Logistic Regression
  - Discriminant Analysis
  - Support Vector Machines
  - Neural Networks
- Model Evaluation and Selection
  - Evaluation, Confusion Matrix, and the ROC Curve
  - Cross-validation
  - Feature Selection