English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 83 lectures (5h 52m) | 2.76 GB

Data Science, Python, scikit-learn, Decision Trees, Random Forests, KNNs, Ridge and Lasso Regression, SVMs

Why should you consider taking the Supervised Machine Learning course?

The supervised machine learning algorithms you will learn here are some of the most powerful data science tools for solving regression and classification tasks. These are invaluable skills that anyone who wants to work as a machine learning engineer or data scientist should have in their toolkit.

The course covers six families of algorithms: naïve Bayes, K-nearest neighbors (KNN), support vector machines, decision trees, random forests, and Ridge and Lasso regression.

In this course, you will learn the theory behind all six algorithms, and then apply your skills to practical case studies tailored to each one of them, using Python’s scikit-learn library.

First, we cover naïve Bayes, a powerful technique based on Bayes’ theorem. Its main strength is speed: it is cheap to train and to evaluate, which makes it well suited to real-time tasks. Some of the most common use cases are filtering spam e-mails, flagging inappropriate comments on social media, and performing sentiment analysis. In the course, we have a practical example of how exactly that works, so stay tuned!
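To make the spam-filtering idea concrete, here is a minimal sketch of a multinomial naïve Bayes classifier with add-one (Laplace) smoothing, written in plain Python. The six training messages are made up for illustration and are not the course's dataset; the helper names `fit` and `predict` are likewise hypothetical.

```python
import math
from collections import Counter

# Toy, made-up training messages (an assumption, not the course dataset).
train = [
    ("win cash prize now", "spam"),
    ("claim your free prize", "spam"),
    ("free cash offer", "spam"),
    ("lunch at noon tomorrow", "ham"),
    ("see you at the meeting", "ham"),
    ("notes from the meeting", "ham"),
]

def fit(samples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(label for _, label in samples)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in samples:
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab

def predict(text, class_counts, word_counts, vocab):
    """Return the class with the highest log posterior (up to a constant)."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior P(class)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # skip words never seen during training
                # add-one smoothed log likelihood P(word | class)
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

model = fit(train)
spam_pred = predict("free prize now", *model)
ham_pred = predict("meeting at noon", *model)
print(spam_pred, ham_pred)
```

The "naïve" part is the assumption that words are conditionally independent given the class, which is why the per-word log likelihoods are simply added.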

Next up is K-nearest neighbors, one of the most widely used machine learning algorithms. Why is that? Because it is simple: it predicts the label of a new point from the labels of the closest training points, as measured by a distance metric.
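As a rough illustration of the distance-based idea, the sketch below classifies a query point by majority vote among its `k` closest training points, using Euclidean distance. The points, labels, and the function name `knn_predict` are invented for this example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up 2D points forming two well-separated groups (an assumption).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["A", "A", "A", "B", "B", "B"]

pred_near_a = knn_predict(points, labels, (2.0, 2.0))
pred_near_b = knn_predict(points, labels, (6.0, 7.0))
print(pred_near_a, pred_near_b)
```

Using an odd `k` with two classes avoids tied votes; other distance metrics (for example Manhattan distance) can be swapped in for `math.dist`.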

We’ll follow up with decision tree algorithms, which will serve as the basis for our next topic – namely random forests. They are powerful ensemble learners, capable of harnessing the power of multiple decision trees to make accurate predictions.
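A minimal sketch of the relationship between a single tree and a forest, assuming scikit-learn is installed. It uses scikit-learn's bundled Iris data rather than any dataset from the course case studies, and the split and hyperparameters are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# One decision tree fitted on the full training set.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A forest: 100 trees, each fitted on a bootstrap sample of the training set,
# with a random subset of features considered at every split.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_accuracy = tree.score(X_test, y_test)
forest_accuracy = forest.score(X_test, y_test)
print(f"single tree: {tree_accuracy:.3f}, random forest: {forest_accuracy:.3f}")
```

On a small, clean dataset like Iris the two scores are often close; the averaging effect of a forest tends to matter more on noisier data where a single deep tree overfits.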

After that, we’ll meet Support Vector Machines – classification and regression models, capable of utilizing different kernels to solve a wide variety of problems. In the practical part of this section, we’ll build a model for classifying mushrooms as either poisonous or edible. Exciting!
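To show why the choice of kernel matters, here is a small sketch assuming scikit-learn is installed. It uses synthetic concentric-circle data from `make_circles`, not the mushroom dataset from the course, and compares a linear kernel with an RBF kernel.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data: one class forms an inner circle, the other an outer ring,
# so no straight line can separate them.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

linear_svm = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)

linear_accuracy = linear_svm.score(X_test, y_test)
rbf_accuracy = rbf_svm.score(X_test, y_test)
print(f"linear kernel: {linear_accuracy:.3f}, RBF kernel: {rbf_accuracy:.3f}")
```

The RBF kernel lets the model draw a curved decision boundary, which is what this particular layout needs; on linearly separable data a linear kernel is usually sufficient and cheaper.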

Finally, you’ll learn about Ridge and Lasso Regression. Both are regularized versions of linear regression that penalize large coefficients, which limits the influence of individual features and helps prevent overfitting. We’ll go over the differences and similarities, as well as the pros and cons of both regression techniques.
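The difference between the two penalties can be seen in the simplest possible case: one feature, no intercept. The sketch below uses the closed-form solutions for that case in plain Python. The data and function names are made up, and the objectives are written without the scaling factors that scikit-learn applies, so the `alpha` values here are not directly comparable to scikit-learn's.

```python
import math

def ols_coef(x, y):
    """Minimizes sum((y - w*x)^2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def ridge_coef(x, y, alpha):
    """Minimizes sum((y - w*x)^2) + alpha * w^2."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + alpha)

def lasso_coef(x, y, alpha):
    """Minimizes sum((y - w*x)^2) + alpha * |w| via soft-thresholding."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    shrunk = max(abs(sxy) - alpha / 2, 0.0)
    return math.copysign(shrunk, sxy) / sxx

# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [0.5, 1.1, 1.4, 2.1]

for alpha in (0.0, 10.0, 40.0):
    print(alpha, round(ridge_coef(x, y, alpha), 4), round(lasso_coef(x, y, alpha), 4))
```

Ridge shrinks the coefficient toward zero but never reaches it for a finite `alpha`, while Lasso subtracts a fixed amount and clips at exactly zero. With many features, that clipping is what lets Lasso drop features entirely.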

Each section of this course is organized in a uniform way for an optimal learning experience:

– We start with the fundamental theory for each algorithm. To enhance your understanding of the topic, we’ll walk you through a theoretical case, as well as introduce mathematical formulas behind the algorithm.

– Then, we move on to building a model in order to solve a practical problem with it. This is done using Python’s famous scikit-learn (sklearn) library.

– We analyze the performance of our models with the aid of metrics such as accuracy, precision, recall, and the F1 score.

– We also study various techniques such as grid search and cross-validation to improve the model’s performance.
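The last step in that workflow, tuning with grid search and cross-validation, might look roughly like the sketch below. It assumes scikit-learn is installed, uses the bundled Iris data instead of a course dataset, and picks an arbitrary parameter grid for a K-nearest-neighbors classifier.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Candidate hyperparameters; every combination is tried.
param_grid = {
    "n_neighbors": [1, 3, 5, 7, 9],
    "weights": ["uniform", "distance"],
}

# 5-fold cross-validation on the training set only; the test set stays unseen.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # uses the refitted best model
print(best_params, round(test_accuracy, 3))
```

Keeping the test set out of the search is the important design choice: the cross-validation score selects the hyperparameters, and the held-out score gives a less optimistic estimate of how the chosen model generalizes.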

To top it all off, we have a range of complementary exercises and quizzes, so that you can enhance your skill set. Not only that, but we also offer comprehensive course materials to guide you through the course, which you can consult at any time.

The lessons have been created in 365’s unique teaching style many of you are familiar with. We aim to deliver complex topics in an easy-to-understand way, focusing on practical application and visual learning.

With the power of animations, quiz questions, exercises, and well-crafted course notes, the Supervised Machine Learning course will fulfill all your learning needs.

If you want to take your data science skills to the next level and add in-demand tools to your resume, this course is the perfect choice for you.

What you’ll learn

- Regression and Classification Algorithms
- Using scikit-learn and Python to implement supervised machine learning techniques
- K-nearest neighbors for both classification and regression
- Naïve Bayes
- Ridge and Lasso Regression
- Decision Trees
- Random Forests
- Support Vector Machines
- Practical case studies for training, testing, evaluating, and improving model performance
- Cross-validation for parameter optimization
- Metrics such as precision, recall, and F1 score, as well as the confusion matrix, for evaluating true model performance
- The theoretical foundation behind each algorithm, with intuitive explanations of the formulas and mathematical notions
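For reference, the evaluation metrics named in the list can all be derived from the four cells of a binary confusion matrix. The sketch below does this in plain Python; the labels are made up and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the metrics derived from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1,
    }

# Made-up labels: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "of the items flagged positive, how many really were?", recall answers "of the real positives, how many were found?", and F1 is their harmonic mean.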

## Table of Contents

**Setting up the Environment**

Installing Anaconda

Jupyter Dashboard Part 1

Jupyter Dashboard Part 2

Installing the relevant packages

**Naïve Bayes**

Motivation

Bayes Thought Experiment

Bayes Thought Experiment Assignment

Bayes Theorem

The Ham-or-Spam Example

The Ham-or-Spam Example Assignment

The YouTube Dataset: Creating the Data Frame

CountVectorizer

The YouTube Dataset: Preprocessing

The YouTube Dataset: Preprocessing Assignment

The YouTube Dataset: Classification

The YouTube Dataset: Classification Assignment

The YouTube Dataset: Confusion Matrix

The YouTube Dataset: Accuracy, Precision, Recall, and the F1 Score

The YouTube Dataset: Changing the Priors

Naïve Bayes Assignment

**K-Nearest Neighbors**

Motivation

Math Prerequisites: Distance Metrics

Random Dataset: Generating the Dataset

Random Dataset: Visualizing the Dataset

Random Dataset: Classification

Random Dataset: How to Break a Tie

Random Dataset: Decision Regions

Random Dataset: Choosing the Best K-value

Random Dataset: Grid Search

Random Dataset: Model Performance

KNeighbors Classifier Assignment

Theory with a Practical Example

KNN vs Linear Regression: A Linear Problem

KNN vs Linear Regression: A Nonlinear Problem

KNeighbors Regressor Assignment

Pros and Cons

**Decision Trees and Random Forests**

What is a Tree in Computer Science

The Concept of Decision Trees

Decision Trees in Machine Learning

Decision Trees Pros and Cons

Practical Example: The Iris Dataset

Practical Example: Creating a Decision Tree

Practical Example: Plotting the Tree

Decision Tree Metrics Intuition: Gini Impurity

Decision Tree Metrics: Information Gain

Tree Pruning: Dealing with Overfitting

Random Forest as Ensemble Learning

Bootstrapping

From Bootstrapping to Random Forests

Random Forest in Code: Glass Dataset

Census Data and Income: Preprocessing

Training the Decision Tree

Training the Random Forest

**Support Vector Machines**

Introduction to Support Vector Machines

Linearly separable classes: hard margin problem

Nonlinearly separable classes: soft margin problem

Kernels Intuition

Intro to the practical case

Preprocessing the data

Splitting the data into train and test and rescaling

Implementing a linear SVM

Analyzing the results: Confusion Matrix, Precision, and Recall

Cross-validation

Choosing the kernels and C values for cross-validation

Hyperparameter tuning using GridSearchCV

Support Vector Machines Assignment

**Ridge and Lasso Regression**

Regression Analysis Overview

Overfitting and Multicollinearity

Introduction to Regularization

Ridge Regression Basics

Ridge Regression Mechanics

Regularization in More Complicated Scenarios

Lasso Regression Basics

Lasso Regression vs Ridge Regression

The Hitters Dataset: Preprocessing and Preparation

Exploratory Data Analysis

Performing Linear Regression

Cross-validation for Choosing a Tuning Parameter

Performing Ridge Regression with Cross-validation

Performing Lasso Regression with Cross-validation

Comparing the Results

Replacing the Missing Values in the DataFrame
