Data Cleaning in Python Essential Training

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 0h 49m | 210 MB

Do you need to understand how to keep data clean and well-organized for your company? In this course, instructor Miki Tebeka explains why clean data is so important, what can cause errors, and how to detect, prevent, and fix errors to keep your data clean. Miki explains the types of errors that can occur in data, including missing values, bad values, and duplicates. He goes over how human errors, machine-introduced errors, and design errors can find their way into your data, then shows you how to detect them. Miki then covers error prevention, with techniques like digital signatures, data pipelines and automation, and transactions. He concludes with ways you can fix errors, including renaming fields, fixing types, joining and splitting data, and more.
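
To make the three kinds of bad data named above concrete, here is a minimal sketch in plain Python. It is not taken from the course: the sample records, the field names, the valid age range, and the helper names (find_missing, find_bad_ages, find_duplicates) are all hypothetical, chosen only for illustration.

```python
# Hypothetical sample records; field names and the valid age range are
# assumptions made for this illustration, not material from the course.
records = [
    {"id": 1, "name": "alice", "age": 34},
    {"id": 2, "name": "bob", "age": None},   # missing value
    {"id": 3, "name": "carol", "age": -5},   # bad value (negative age)
    {"id": 1, "name": "alice", "age": 34},   # exact duplicate of the first record
]


def find_missing(rows):
    """Return indexes of rows where any field is None."""
    return [i for i, row in enumerate(rows)
            if any(value is None for value in row.values())]


def find_bad_ages(rows, low=0, high=120):
    """Return indexes of rows whose age is present but outside [low, high]."""
    return [i for i, row in enumerate(rows)
            if row["age"] is not None and not low <= row["age"] <= high]


def find_duplicates(rows):
    """Return indexes of rows that exactly repeat an earlier row."""
    seen = set()
    dupes = []
    for i, row in enumerate(rows):
        key = tuple(sorted(row.items()))  # hashable, order-independent fingerprint
        if key in seen:
            dupes.append(i)
        else:
            seen.add(key)
    return dupes


print("missing:", find_missing(records))
print("bad ages:", find_bad_ages(records))
print("duplicates:", find_duplicates(records))
```

Each check returns row indexes rather than modifying the data, so detection stays separate from whatever fix (deleting, filling, or correcting) is chosen afterwards.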

Table of Contents

1 Why is clean data important?
2 What you should know

1. Bad Data
3 Types of errors
4 Missing values
5 Bad values
6 Duplicates

2. Causes of Errors
7 Human errors
8 Machine errors
9 Design errors
10 Challenge: UI design
11 Solution: UI design

3. Detecting Errors
12 Schemas
13 Validation
14 Finding missing data
15 Domain knowledge
16 Subgroups
17 Challenge: Find bad data
18 Solution: Find bad data

4. Preventing Errors
19 Serialization formats
20 Digital signatures
21 Data pipelines and automation
22 Transactions
23 Data organization and tidy data
24 Process and data quality metrics
25 Challenge: ETL
26 Solution: ETL

5. Fixing Errors
27 Renaming fields
28 Fixing types
29 Joining and splitting data
30 Deleting bad data
31 Filling missing values
32 Reshaping data
33 Challenge: Workshop earnings
34 Solution: Workshop earnings

35 Next steps