The Data Science Course 2019: Complete Data Science Bootcamp

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 20.5 Hours | 11.2 GB

Complete Data Science Training: Mathematics, Statistics, Python, Advanced Statistics in Python, Machine & Deep Learning

The Problem

Data scientist is one of the professions best suited to thrive this century. It is digital, programming-oriented, and analytical. Therefore, it comes as no surprise that the demand for data scientists has been surging in the job marketplace.

However, supply has been very limited. It is difficult to acquire the skills necessary to be hired as a data scientist.

And how can you do that?

Universities have been slow to create specialized data science programs (not to mention that the ones that exist are very expensive and time-consuming).

Most online courses focus on a specific topic, and it is difficult to understand how the skills they teach fit into the complete picture.

The Solution

Data science is a multidisciplinary field. It encompasses a wide range of topics.

  • Understanding of the data science field and the type of analysis carried out
  • Mathematics
  • Statistics
  • Python
  • Applying advanced statistical techniques in Python
  • Data Visualization
  • Machine Learning
  • Deep Learning

Each of these topics builds on the previous ones. And you risk getting lost along the way if you don’t acquire these skills in the right order. For example, one would struggle in the application of Machine Learning techniques before understanding the underlying Mathematics. Or, it can be overwhelming to study regression analysis in Python before knowing what a regression is.

So, in an effort to create the most effective, time-efficient, and structured data science training available online, we created The Data Science Course 2019.

We believe this is the first training program that solves the biggest challenge to entering the data science field – having all the necessary resources in one place.

Moreover, our focus is to teach topics that flow smoothly and complement each other. The course teaches you everything you need to know to become a data scientist at a fraction of the cost of traditional programs (not to mention the amount of time you will save).

The Skills

1. Intro to Data and Data Science

Big data, business intelligence, business analytics, machine learning and artificial intelligence. We know these buzzwords belong to the field of data science but what do they all mean?

Why learn it? As a candidate data scientist, you must understand the ins and outs of each of these areas and recognise the appropriate approach to solving a problem. This ‘Intro to data and data science’ will give you a comprehensive look at all these buzzwords and where they fit in the realm of data science.

2. Mathematics

Learning the tools is the first step to doing data science. You must first see the big picture to then examine the parts in detail.

We take a detailed look specifically at calculus and linear algebra as they are the subfields data science relies on.

Why learn it?

Calculus and linear algebra are essential for programming in data science. If you want to understand advanced machine learning algorithms, then you need these skills in your arsenal.

3. Statistics

You need to think like a scientist before you can become a scientist. Statistics trains your mind to frame problems as hypotheses and gives you techniques to test these hypotheses, just like a scientist.

Why learn it?

This course doesn’t just give you the tools you need but teaches you how to use them. Statistics trains you to think like a scientist.
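As a rough illustration of framing a problem as a hypothesis and testing it, here is a minimal sketch of a one-sample t-test written with Python's standard library only. The delivery-time data, the hypothesized mean of 30, and the function name are assumptions made for this example and are not taken from the course materials. The critical value is the usual two-sided 5% t-table value for 7 degrees of freedom.

```python
import math
import statistics


def one_sample_t_statistic(sample, hypothesized_mean):
    """Return the t statistic for H0: population mean == hypothesized_mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_std = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    standard_error = sample_std / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Hypothetical data: delivery times in minutes. H0 says the true mean is 30.
delivery_times = [32, 29, 35, 31, 33, 30, 34, 36]
t_stat = one_sample_t_statistic(delivery_times, 30)

# Two-sided test at the 5% level with n - 1 = 7 degrees of freedom.
# 2.365 is the corresponding value from a standard t table.
critical_value = 2.365
reject_null = abs(t_stat) > critical_value

print(f"t statistic: {t_stat:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The null hypothesis is rejected here because the absolute t statistic exceeds the critical value. A statistics package would normally report a p-value as well, which this sketch leaves out.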

4. Python

Python is a relatively new programming language and, unlike R, it is a general-purpose programming language. You can do almost anything with it! Web applications, computer games, and data science are among its many applications. That’s why, in a short space of time, it has managed to disrupt many disciplines. Extremely powerful libraries have been developed to enable data manipulation, transformation, and visualisation. Where Python really shines, however, is in machine and deep learning.

Why learn it?

When it comes to developing, implementing, and deploying machine learning models through powerful frameworks such as scikit-learn and TensorFlow, Python is a must-have programming language.

5. Tableau

Data scientists don’t just need to deal with data and solve data-driven problems. They also need to convince company executives of the right decisions to make. These executives may not be well versed in data science, so the data scientist must be able to present and visualise the data’s story in a way they will understand. That’s where Tableau comes in – and we will help you become an expert storyteller using the leading visualisation software in business intelligence and data science.

Why learn it?

A data scientist relies on business intelligence tools like Tableau to communicate complex results to non-technical decision makers.

6. Advanced Statistics

Regressions, clustering, and factor analysis are all disciplines that were invented before machine learning. However, now these statistical methods are all performed through machine learning to provide predictions with unparalleled accuracy. This section will look at these techniques in detail.

Why learn it?

Data science is all about predictive modelling, and you can become an expert in these methods through this ‘Advanced Statistics’ section.
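To make the regression idea from this section concrete, here is a minimal sketch of a simple (one-variable) linear regression fitted by ordinary least squares, together with its R-squared. It uses plain Python rather than the statsmodels or scikit-learn workflow the course teaches, and the data and function names are assumptions chosen for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variability in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_residual / ss_total


# Hypothetical data that lies exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = fit_simple_linear_regression(x, y)
r2 = r_squared(x, y, intercept, slope)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, R-squared={r2:.2f}")
```

Real data will not lie exactly on a line, so R-squared will usually be below 1. A library summary table reports the same quantities along with standard errors and significance tests.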

7. Machine Learning

The final part of the program and what every section has been leading up to is deep learning. Being able to employ machine and deep learning in their work is what often separates a data scientist from a data analyst. This section covers all common machine learning techniques and deep learning methods with TensorFlow.

What you’ll learn

  • The course provides the entire toolbox you need to become a data scientist
  • Fill up your resume with in-demand data science skills: statistical analysis, Python programming with NumPy, pandas, matplotlib, and Seaborn, advanced statistical analysis, Tableau, Machine Learning with statsmodels and scikit-learn, and Deep Learning with TensorFlow
  • Impress interviewers by showing an understanding of the data science field
  • Learn how to pre-process data
  • Understand the mathematics behind Machine Learning (an absolute must which other courses don’t teach!)
  • Start coding in Python and learn how to use it for statistical analysis
  • Perform linear and logistic regressions in Python
  • Carry out cluster and factor analysis
  • Be able to create Machine Learning algorithms in Python, using NumPy, statsmodels and scikit-learn
  • Apply your skills to real-life business cases
  • Use state-of-the-art Deep Learning frameworks such as Google’s TensorFlow
  • Develop a business intuition while coding and solving tasks with big data
  • Unfold the power of deep neural networks
  • Improve Machine Learning algorithms by studying underfitting, overfitting, training, validation, n-fold cross validation, testing, and how hyperparameters could improve performance
  • Warm up your fingers as you will be eager to apply everything you have learned here to more and more real-life situations
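
The n-fold cross validation mentioned in the list above can be sketched as follows. This is a minimal, plain-Python illustration of how the data indices are split into folds. The function name is hypothetical, the folds are taken in order without shuffling, and a real project would more likely rely on a library utility.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Hypothetical setup: 10 observations split into 3 folds.
folds = list(k_fold_indices(10, 3))
for fold_number, (train, validation) in enumerate(folds, start=1):
    print(f"Fold {fold_number}: train={train} validation={validation}")
```

Each observation lands in the validation set exactly once, so every data point is used for both training and validation across the folds.
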
Table of Contents

Part 1: Introduction
1 A Practical Example: What You Will Learn in This Course
2 What Does the Course Cover
3 Download All Resources and Important FAQ

The Field of Data Science – The Various Data Science Disciplines
4 Data Science and Business Buzzwords: Why are there so many?
5 What is the difference between Analysis and Analytics
6 Business Analytics, Data Analytics, and Data Science: An Introduction
7 Continuing with BI, ML, and AI
8 A Breakdown of our Data Science Infographic

The Field of Data Science – Connecting the Data Science Disciplines
9 Applying Traditional Data, Big Data, BI, Traditional Data Science and ML

The Field of Data Science – The Benefits of Each Discipline
10 The Reason behind these Disciplines

The Field of Data Science – Popular Data Science Techniques
11 Techniques for Working with Traditional Data
12 Real Life Examples of Traditional Data
13 Techniques for Working with Big Data
14 Real Life Examples of Big Data
15 Business Intelligence (BI) Techniques
16 Real Life Examples of Business Intelligence (BI)
17 Techniques for Working with Traditional Methods
18 Real Life Examples of Traditional Methods
19 Machine Learning (ML) Techniques
20 Types of Machine Learning
21 Real Life Examples of Machine Learning (ML)

The Field of Data Science – Popular Data Science Tools
22 Necessary Programming Languages and Software Used in Data Science

The Field of Data Science – Careers in Data Science
23 Finding the Job – What to Expect and What to Look for

The Field of Data Science – Debunking Common Misconceptions
24 Debunking Common Misconceptions

Part 2: Statistics
25 Population and Sample

Statistics – Descriptive Statistics
26 Types of Data
27 Levels of Measurement
28 Categorical Variables – Visualization Techniques
29 Categorical Variables Exercise
30 Numerical Variables – Frequency Distribution Table
31 Numerical Variables Exercise
32 The Histogram
33 Histogram Exercise
34 Cross Tables and Scatter Plots
35 Cross Tables and Scatter Plots Exercise
36 Mean, median and mode
37 Mean, Median and Mode Exercise
38 Skewness
39 Skewness Exercise
40 Variance
41 Variance Exercise
42 Standard Deviation and Coefficient of Variation
43 Standard Deviation and Coefficient of Variation Exercise
44 Covariance
45 Covariance Exercise
46 Correlation Coefficient
47 Correlation Coefficient Exercise

Statistics – Practical Example: Descriptive Statistics
48 Practical Example: Descriptive Statistics
49 Practical Example: Descriptive Statistics Exercise

Statistics – Inferential Statistics Fundamentals
50 Introduction
51 What is a Distribution
52 The Normal Distribution
53 The Standard Normal Distribution
54 The Standard Normal Distribution Exercise
55 Central Limit Theorem
56 Standard error
57 Estimators and Estimates

Statistics – Inferential Statistics: Confidence Intervals
58 What are Confidence Intervals?
59 Confidence Intervals; Population Variance Known; z-score
60 Confidence Intervals; Population Variance Known; z-score; Exercise
61 Confidence Interval Clarifications
62 Student’s T Distribution
63 Confidence Intervals; Population Variance Unknown; t-score
64 Confidence Intervals; Population Variance Unknown; t-score; Exercise
65 Margin of Error
66 Confidence intervals. Two means. Dependent samples
67 Confidence intervals. Two means. Dependent samples Exercise
68 Confidence intervals. Two means. Independent samples (Part 1)
69 Confidence intervals. Two means. Independent samples (Part 1) Exercise
70 Confidence intervals. Two means. Independent samples (Part 2)
71 Confidence intervals. Two means. Independent samples (Part 2) Exercise
72 Confidence intervals. Two means. Independent samples (Part 3)

Statistics – Practical Example: Inferential Statistics
73 Practical Example: Inferential Statistics
74 Practical Example: Inferential Statistics Exercise

Statistics – Hypothesis Testing
75 Null vs Alternative Hypothesis
76 Further Reading on Null and Alternative Hypothesis
77 Rejection Region and Significance Level
78 Type I Error and Type II Error
79 Test for the Mean. Population Variance Known
80 Test for the Mean. Population Variance Known Exercise
81 p-value
82 Test for the Mean. Population Variance Unknown
83 Test for the Mean. Population Variance Unknown Exercise
84 Test for the Mean. Dependent Samples
85 Test for the Mean. Dependent Samples Exercise
86 Test for the mean. Independent samples (Part 1)
87 Test for the mean. Independent samples (Part 1). Exercise
88 Test for the mean. Independent samples (Part 2)
89 Test for the mean. Independent samples (Part 2) Exercise

Statistics – Practical Example: Hypothesis Testing
90 Practical Example: Hypothesis Testing
91 Practical Example: Hypothesis Testing Exercise

Part 3: Introduction to Python
92 Introduction to Programming
93 Why Python?
94 Why Jupyter?
95 Installing Python and Jupyter
96 Understanding Jupyter’s Interface – the Notebook Dashboard
97 Prerequisites for Coding in the Jupyter Notebooks

Python – Variables and Data Types
98 Variables
99 Numbers and Boolean Values in Python
100 Python Strings

Python – Basic Python Syntax
101 Using Arithmetic Operators in Python
102 The Double Equality Sign
103 How to Reassign Values
104 Add Comments
105 Understanding Line Continuation
106 Indexing Elements
107 Structuring with Indentation

Python – Other Python Operators
108 Comparison Operators
109 Logical and Identity Operators

Python – Conditional Statements
110 The IF Statement
111 The ELSE Statement
112 The ELIF Statement
113 A Note on Boolean Values

Python – Python Functions
114 Defining a Function in Python
115 How to Create a Function with a Parameter
116 Defining a Function in Python – Part II
117 How to Use a Function within a Function
118 Conditional Statements and Functions
119 Functions Containing a Few Arguments
120 Built-in Functions in Python

Python – Sequences
121 Lists
122 Using Methods
123 List Slicing
124 Tuples
125 Dictionaries

Python – Iterations
126 For Loops
127 While Loops and Incrementing
128 Lists with the range() Function
129 Conditional Statements and Loops
130 Conditional Statements, Functions, and Loops
131 How to Iterate over Dictionaries

Python – Advanced Python Tools
132 Object Oriented Programming
133 Modules and Packages
134 What is the Standard Library?
135 Importing Modules in Python

Part 4: Advanced Statistical Methods in Python
136 Introduction to Regression Analysis

Advanced Statistical Methods – Linear regression
137 The Linear Regression Model
138 Correlation vs Regression
139 Geometrical Representation of the Linear Regression Model
140 Python Packages Installation
141 First Regression in Python
142 First Regression in Python Exercise
143 Using Seaborn for Graphs
144 How to Interpret the Regression Table
145 Decomposition of Variability
146 What is the OLS?
147 R-Squared

Advanced Statistical Methods – Multiple Linear Regression
148 Multiple Linear Regression
149 Adjusted R-Squared
150 Multiple Linear Regression Exercise
151 Test for Significance of the Model (F-Test)
152 OLS Assumptions
153 A1: Linearity
154 A2: No Endogeneity
155 A3: Normality and Homoscedasticity
156 A4: No Autocorrelation
157 A5: No Multicollinearity
158 Dealing with Categorical Data – Dummy Variables
159 Dealing with Categorical Data – Dummy Variables
160 Making Predictions with the Linear Regression

Advanced Statistical Methods – Logistic Regression
161 Introduction to Logistic Regression
162 A Simple Example in Python
163 Logistic vs Logit Function
164 Building a Logistic Regression
165 Building a Logistic Regression – Exercise
166 An Invaluable Coding Tip
167 Understanding Logistic Regression Tables
168 Understanding Logistic Regression Tables – Exercise
169 What do the Odds Actually Mean
170 Binary Predictors in a Logistic Regression
171 Binary Predictors in a Logistic Regression – Exercise
172 Calculating the Accuracy of the Model
173 Calculating the Accuracy of the Model
174 Underfitting and Overfitting
175 Testing the Model
176 Testing the Model – Exercise

Advanced Statistical Methods – Cluster Analysis
177 Introduction to Cluster Analysis
178 Some Examples of Clusters
179 Difference between Classification and Clustering
180 Math Prerequisites

Advanced Statistical Methods – K-Means Clustering
181 K-Means Clustering
182 A Simple Example of Clustering
183 A Simple Example of Clustering – Exercise
184 Clustering Categorical Data
185 Clustering Categorical Data – Exercise
186 How to Choose the Number of Clusters
187 How to Choose the Number of Clusters – Exercise
188 Pros and Cons of K-Means Clustering
189 To Standardize or not to Standardize
190 Relationship between Clustering and Regression
191 Market Segmentation with Cluster Analysis (Part 1)
192 Market Segmentation with Cluster Analysis (Part 2)
193 How is Clustering Useful?
194 EXERCISE: Species Segmentation with Cluster Analysis (Part 1)
195 EXERCISE: Species Segmentation with Cluster Analysis (Part 2)

Advanced Statistical Methods – Other Types of Clustering
196 Types of Clustering
197 Dendrogram
198 Heatmaps

Part 5: Mathematics
199 What is a matrix?
200 Scalars and Vectors
201 Linear Algebra and Geometry
202 Arrays in Python – A Convenient Way To Represent Matrices
203 What is a Tensor?
204 Addition and Subtraction of Matrices
205 Errors when Adding Matrices
206 Transpose of a Matrix
207 Dot Product
208 Dot Product of Matrices
209 Why is Linear Algebra Useful?

Part 6: Deep Learning
210 What to Expect from this Part?

Deep Learning – Introduction to Neural Networks
211 Introduction to Neural Networks
212 Training the Model
213 Types of Machine Learning
214 The Linear Model (Linear Algebraic Version)
215 The Linear Model with Multiple Inputs
216 The Linear model with Multiple Inputs and Multiple Outputs
217 Graphical Representation of Simple Neural Networks
218 What is the Objective Function?
219 Common Objective Functions: L2-norm Loss
220 Common Objective Functions: Cross-Entropy Loss
221 Optimization Algorithm: 1-Parameter Gradient Descent
222 Optimization Algorithm: n-Parameter Gradient Descent

Deep Learning – How to Build a Neural Network from Scratch with NumPy
223 Basic NN Example (Part 1)
224 Basic NN Example (Part 2)
225 Basic NN Example (Part 3)
226 Basic NN Example (Part 4)
227 Basic NN Example Exercises

Deep Learning – TensorFlow: Introduction
228 How to Install TensorFlow
229 A Note on Installing Packages in Anaconda
230 TensorFlow Outline and Logic
231 Actual Introduction to TensorFlow
232 Types of File Formats, supporting Tensors
233 Basic NN Example with TF: Inputs, Outputs, Targets, Weights, Biases
234 Basic NN Example with TF: Loss Function and Gradient Descent
235 Basic NN Example with TF: Model Output
236 Basic NN Example with TF Exercises

Deep Learning – Digging Deeper into NNs: Introducing Deep Neural Networks
237 What is a Layer?
238 What is a Deep Net?
239 Digging into a Deep Net
240 Non-Linearities and their Purpose
241 Activation Functions
242 Activation Functions: Softmax Activation
243 Backpropagation
244 Backpropagation picture
245 Backpropagation – A Peek into the Mathematics of Optimization

Deep Learning – Overfitting
246 What is Overfitting?
247 Underfitting and Overfitting for Classification
248 What is Validation?
249 Training, Validation, and Test Datasets
250 N-Fold Cross Validation
251 Early Stopping or When to Stop Training

Deep Learning – Initialization
252 What is Initialization?
253 Types of Simple Initializations
254 State-of-the-Art Method – (Xavier) Glorot Initialization

Deep Learning – Digging into Gradient Descent and Learning Rate Schedules
255 Stochastic Gradient Descent
256 Problems with Gradient Descent
257 Momentum
258 Learning Rate Schedules, or How to Choose the Optimal Learning Rate
259 Learning Rate Schedules Visualized
260 Adaptive Learning Rate Schedules (AdaGrad and RMSprop)
261 Adam (Adaptive Moment Estimation)

Deep Learning – Preprocessing
262 Preprocessing Introduction
263 Types of Basic Preprocessing
264 Standardization
265 Preprocessing Categorical Data
266 Binary and One-Hot Encoding

Deep Learning – Classifying on the MNIST Dataset
267 MNIST: What is the MNIST Dataset?
268 MNIST: How to Tackle the MNIST
269 MNIST: Relevant Packages
270 MNIST: Model Outline
271 MNIST: Loss and Optimization Algorithm
272 Calculating the Accuracy of the Model
273 MNIST: Batching and Early Stopping
274 MNIST: Learning
275 MNIST: Results and Testing
276 MNIST: Exercises
277 MNIST: Solutions

Deep Learning – Business Case Example
278 Business Case: Getting acquainted with the dataset
279 Business Case: Outlining the Solution
280 The Importance of Working with a Balanced Dataset
281 Business Case: Preprocessing
282 Business Case: Preprocessing Exercise
283 Creating a Data Provider
284 Business Case: Model Outline
285 Business Case: Optimization
286 Business Case: Interpretation
287 Business Case: Testing the Model
288 Business Case: A Comment on the Homework
289 Business Case: Final Exercise

Deep Learning – Conclusion
290 Summary on What You’ve Learned
291 What’s Further out there in terms of Machine Learning
292 An overview of CNNs
293 DeepMind and Deep Learning
294 An Overview of RNNs
295 An Overview of non-NN Approaches
296 Download All Resources

Software Integration
297 What are Data, Servers, Clients, Requests, and Responses
298 What are Data Connectivity, APIs, and Endpoints?
299 Taking a Closer Look at APIs
300 Communication between Software Products through Text Files
301 Software Integration – Explained

Case Study – What’s Next in the Course?
302 Game Plan for this Python, SQL, and Tableau Business Exercise
303 The Business Task
304 Introducing the Data Set

Case Study – Preprocessing the ‘Absenteeism_data’
305 What to Expect from the Following Sections?
306 Importing the Absenteeism Data in Python
307 Checking the Content of the Data Set
308 Introduction to Terms with Multiple Meanings
309 What’s Regression Analysis – a Quick Refresher
310 Using a Statistical Approach towards the Solution to the Exercise
311 Dropping a Column from a DataFrame in Python
312 EXERCISE – Dropping a Column from a DataFrame in Python
313 SOLUTION – Dropping a Column from a DataFrame in Python
314 Analyzing the Reasons for Absence
315 Obtaining Dummies from a Single Feature
316 EXERCISE – Obtaining Dummies from a Single Feature
317 SOLUTION – Obtaining Dummies from a Single Feature
318 Dropping a Dummy Variable from the Data Set
319 More on Dummy Variables: A Statistical Perspective
320 Classifying the Various Reasons for Absence
321 Using .concat() in Python
322 EXERCISE – Using .concat() in Python
323 SOLUTION – Using .concat() in Python
324 Reordering Columns in a Pandas DataFrame in Python
325 EXERCISE – Reordering Columns in a Pandas DataFrame in Python
326 SOLUTION – Reordering Columns in a Pandas DataFrame in Python
327 Creating Checkpoints while Coding in Jupyter
328 EXERCISE – Creating Checkpoints while Coding in Jupyter
329 SOLUTION – Creating Checkpoints while Coding in Jupyter
330 Analyzing the Dates from the Initial Data Set
331 Extracting the Month Value from the “Date” Column
332 Extracting the Day of the Week from the “Date” Column
333 EXERCISE – Removing the “Date” Column
334 Analyzing Several “Straightforward” Columns for this Exercise
335 Working on “Education”, “Children”, and “Pets”
336 Final Remarks of this Section

Case Study – Applying Machine Learning to Create the ‘absenteeism_module’
337 Exploring the Problem with a Machine Learning Mindset
338 Creating the Targets for the Logistic Regression
339 Selecting the Inputs for the Logistic Regression
340 Standardizing the Data
341 Splitting the Data for Training and Testing
342 Fitting the Model and Assessing its Accuracy
343 Creating a Summary Table with the Coefficients and Intercept
344 Interpreting the Coefficients for Our Problem
345 Standardizing only the Numerical Variables (Creating a Custom Scaler)
346 Interpreting the Coefficients of the Logistic Regression
347 Backward Elimination or How to Simplify Your Model
348 Testing the Model We Created
349 Saving the Model and Preparing it for Deployment
350 ARTICLE – A Note on ‘pickling’
351 EXERCISE – Saving the Model (and Scaler)
352 Preparing the Deployment of the Model through a Module

Case Study – Loading the ‘absenteeism_module’
353 Are You Sure You’re All Set?
354 Deploying the ‘absenteeism_module’ – Part I
355 Deploying the ‘absenteeism_module’ – Part II
356 Exporting the Obtained Data Set as a *.csv

Case Study – Analyzing the Predicted Outputs in Tableau
357 EXERCISE – Age vs Probability
358 Analyzing Age vs Probability in Tableau
359 EXERCISE – Reasons vs Probability
360 Analyzing Reasons vs Probability in Tableau
361 EXERCISE – Transportation Expense vs Probability
362 Analyzing Transportation Expense vs Probability in Tableau