Hands-On Approach to Learning the Essential Computer Science for Machine Learning Applications
Data Structures, Algorithms, and Machine Learning Optimization LiveLessons provides you with a functional, hands-on understanding of the essential computer science for machine learning applications.
Jon Krohn is Chief Data Scientist at the machine learning company untapt. He authored the book Deep Learning Illustrated, an instant #1 bestseller that was translated into six languages. Jon is renowned for his compelling lectures, which he offers in person at Columbia University and New York University, as well as online via O’Reilly, YouTube, and the SuperDataScience podcast. Jon holds a PhD from Oxford and has been publishing on machine learning in leading academic journals since 2010; his papers have been cited over a thousand times.
Learn How To
- Use “big O” notation to characterize the time efficiency and space efficiency of a given algorithm, enabling you to select or devise the most sensible approach for tackling a particular machine learning problem with the hardware resources available to you.
- Get acquainted with the full range of the most widely used Python data structures, including list-, dictionary-, tree-, and graph-based structures.
- Develop a working understanding of all of the essential algorithms for working with data, including those for searching, sorting, hashing, and traversing.
- Discover how the statistical and machine learning approaches to optimization differ, and why you would select one or the other for a given problem you’re solving.
- Understand exactly how the extremely versatile (stochastic) gradient descent optimization algorithm works and how to apply it.
- Familiarize yourself with the “fancy” optimizers that are available for advanced machine learning approaches (e.g., deep learning) and when you should consider using them.
Who Should Take This Course
- You use high-level software libraries (e.g., scikit-learn, Keras, TensorFlow) to train or deploy machine learning algorithms, and would now like to understand the fundamentals underlying the abstractions, enabling you to expand your capabilities.
- You’re a software developer who would like to develop a firm foundation for the deployment of machine learning algorithms into production systems.
- You’re a data scientist who would like to reinforce your understanding of the subjects at the core of your professional discipline.
- You’re a data analyst or AI enthusiast who would like to become a data scientist or data/ML engineer, and so you’re keen to deeply understand the field you’re entering from the ground up (very wise of you!)
Course Requirements
- Mathematics: Familiarity with secondary school-level mathematics will make the class easier to follow. If you are comfortable dealing with quantitative information (for example, reading charts and rearranging simple equations), then you should be well prepared for all of the mathematics.
- Programming: All code demos will be in Python, so experience with it or another object-oriented programming language will help you follow the hands-on examples.
Lesson 1: Orientation to Data Structures and Algorithms
In Lesson 1, Jon provides an orientation to data structures and algorithms. He starts by familiarizing you with his Machine Learning Foundations curriculum and then provides you with historical context on both data and algorithms. He concludes with a discussion of applications of data structures and algorithms to the field of machine learning.
Lesson 2: “Big O” Notation
Lesson 2 focuses on “big O” notation, a fundamental computer science concept that is a prerequisite for understanding almost everything else in these LiveLessons. Jon explores three of the most common “big O” runtimes: constant, linear, and polynomial. He wraps up the lesson with an overview of the other common runtimes and performance variation based on the particular data you are working with.
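As a rough illustration of the three runtimes named above, here is a minimal Python sketch. The function names are hypothetical and are not taken from the course materials; they only show how the amount of work grows with the input size n.

```python
def first_item(items):
    """Constant time, O(1): one step no matter how long the list is."""
    return items[0]


def total(items):
    """Linear time, O(n): one addition per element."""
    result = 0
    for value in items:
        result += value
    return result


def all_pairs(items):
    """Polynomial (here quadratic) time, O(n^2): nested loops over the input."""
    pairs = []
    for a in items:
        for b in items:
            pairs.append((a, b))
    return pairs


data = [3, 1, 4, 1, 5]
print(first_item(data))      # 3
print(total(data))           # 14
print(len(all_pairs(data)))  # 25
```

Doubling the length of `data` leaves the work in `first_item` unchanged, doubles it in `total`, and quadruples it in `all_pairs`.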
Lesson 3: List-Based Data Structures
Lesson 3 is all about list-based data structures. Jon surveys all of the key types, including arrays, linked lists, stacks, queues, and deques.
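To give a flavor of the structures listed above, the following sketch uses only Python built-ins and the standard library. It is an illustrative assumption about how such structures are commonly expressed in Python, not a reproduction of the course demos.

```python
from collections import deque

# Stack: last in, first out. A plain list works well for this.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()        # "b"

# Queue: first in, first out. deque gives O(1) removal from the left.
queue = deque()
queue.append("a")
queue.append("b")
front = queue.popleft()  # "a"

# Deque: items can be added or removed efficiently at both ends.
dq = deque([1, 2, 3])
dq.appendleft(0)
dq.append(4)
left = dq.popleft()      # 0
right = dq.pop()         # 4

print(top, front, left, right, list(dq))
```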
Lesson 4: Searching and Sorting
In Lesson 4, Jon helps you hone your understanding of “big O” notation by applying searching and sorting algorithms to lists. Specifically, he covers binary search and three exemplary sorting algorithms: bubble, merge, and quick.
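For readers who have not met binary search before, a minimal sketch follows. It assumes the input list is already sorted in ascending order; the function name and the convention of returning -1 for a missing value are illustrative choices rather than the course's own.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent.

    Each comparison halves the remaining search range, so the running
    time is O(log n).
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


values = [2, 5, 8, 12, 16, 23, 38]
print(binary_search(values, 23))  # 5
print(binary_search(values, 7))   # -1
```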
Lesson 5: Sets and Hashing
In Lesson 5, Jon details maps and dictionaries, which are types of sets. He digs into hash functions, which enable mind-bogglingly efficient data retrieval, and covers collisions, load factor, hash maps, string keys, and machine learning applications.
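The ideas of collisions and load factor can be sketched with a toy hash map. The class below is hypothetical and uses separate chaining, which is one common way to handle collisions and is not necessarily the approach taken in the lesson. In practice, Python's built-in `dict` does this work for you.

```python
class ChainedHashMap:
    """Toy hash map that resolves collisions by chaining within buckets."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]
        self.size = 0

    def _index(self, key):
        # The hash function maps any hashable key to a bucket index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # new key (possibly a collision)
        self.size += 1

    def get(self, key, default=None):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

    def load_factor(self):
        # Stored items per bucket; higher values mean longer chains.
        return self.size / len(self.buckets)


m = ChainedHashMap(n_buckets=8)
m.put(1, "a")
m.put(9, "b")  # in CPython, hash(9) % 8 == hash(1) % 8, so these collide
print(m.get(1), m.get(9), m.load_factor())  # a b 0.25
```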
Lesson 6: Trees
In Lesson 6, Jon provides you with an introduction to trees, a hugely useful data structure in machine learning. He presents specific hands-on examples involving decision trees, random forests, and gradient boosting.
Lesson 7: Graphs
Lesson 7 provides you with an introduction to graphs, another hugely useful data structure in machine learning. Jon discusses graph direction and cycles before wrapping up the coverage of data structures and algorithms with a note on DataFrames and his recommended resources for further study of the computer science field.
Lesson 8: Machine Learning Optimization
With Lesson 8, Jon shifts gears from data structures and algorithms to machine learning-specific optimization. He starts off by discussing when statistical optimization approaches break down and then digs into objective functions, particularly mean absolute error and mean squared error. Jon carries on by detailing how to optimize objective functions with gradient descent and what critical points are. He concludes the lesson with neat tricks like mini-batch sampling, learning rate scheduling, and gradient ascent.
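To make the idea of minimizing an objective function with gradient descent concrete, here is a minimal sketch that fits a line y = wx + b by descending the gradient of mean squared error. The course performs this with PyTorch; plain Python is used here only to keep the example dependency-free, and the data, learning rate, and epoch count are made-up assumptions.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit w and b by repeatedly stepping against the MSE gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Partial derivatives of MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3), mse(w, b, xs, ys))
```

On this toy data the fitted values approach w = 2 and b = 1. Computing the gradient on a random subset of the data at each step, rather than on all of it, would turn this into the mini-batch variant mentioned above.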
Lesson 9: Fancy Deep Learning Optimizers
Lesson 9 puts a bow not only on these particular LiveLessons but also on Jon’s entire Machine Learning Foundations series. In this lesson, Jon provides an overview of Jacobian and Hessian matrices, as well as the fancy deep learning optimizers they facilitate, namely those that use momentum and those that are adaptive. Jon leaves you with his recommended next steps for moving forward with your machine learning journey.
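As a taste of what momentum means in an optimizer, the sketch below minimizes a simple one-dimensional function using the classical momentum update. The function name, hyperparameters, and test function are illustrative assumptions, not the lesson's own code.

```python
def minimize_with_momentum(grad, x0, lr=0.1, beta=0.9, steps=500):
    """Gradient descent with classical momentum.

    velocity accumulates a decaying sum of past gradients, so steps in a
    consistent direction build up speed.
    """
    x, velocity = x0, 0.0
    for _ in range(steps):
        velocity = beta * velocity - lr * grad(x)
        x += velocity
    return x


# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = minimize_with_momentum(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # 3.0
```

Adaptive optimizers go a step further by also adjusting the effective step size for each parameter during training.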
Table of Contents
1 Data Structures, Algorithms, and Machine Learning Optimization – Introduction
3 Orientation to the Machine Learning Foundations Series
4 A Brief History of Data
5 A Brief History of Algorithms
6 Applications to Machine Learning
9 Constant Time
10 Linear Time
11 Polynomial Time
12 Common Runtimes
13 Best versus Worst Case
17 Linked Lists
18 Doubly-Linked Lists
23 Binary Search
24 Bubble Sort
25 Merge Sort
26 Quick Sort
28 Maps and Dictionaries
30 Hash Functions
32 Load Factor
33 Hash Maps
34 String Keys
35 Hashing in ML
38 Decision Trees
39 Random Forests
40 XGBoost – Gradient-Boosted Trees
41 Additional Concepts
44 Directed versus Undirected Graphs
45 DAGs – Directed Acyclic Graphs
46 Additional Concepts
47 Bonus – Pandas DataFrames
48 Resources for Further Study of DSA
50 Statistics versus Machine Learning
51 Objective Functions
52 Mean Absolute Error
53 Mean Squared Error
54 Minimizing Cost with Gradient Descent
55 Gradient Descent from Scratch with PyTorch
56 Critical Points
57 Stochastic Gradient Descent
58 Learning Rate Scheduling
59 Maximizing Reward with Gradient Ascent
61 Jacobian Matrices
62 Second-Order Optimization and Hessians
64 Adaptive Optimizers
65 Congratulations and Next Steps
66 Data Structures, Algorithms, and Machine Learning Optimization – Summary