Web Scraping with Python

English | MP4 | AVC 1280×720 | AAC 48KHz 2ch | 1h 23m | 332 MB


Instructor Ryan Mitchell teaches web scraping with the Python programming language. Ryan helps you understand how a human browsing the web differs from a web scraper. She introduces the Chrome developer tools and shows how to use them to examine network calls. Ryan shows you how to install Scrapy with pip and how to write some “Hello, World” code to scrape a simple web page. She covers how to use the Scrapy LinkExtractor to find internal links on a web page, then demonstrates how to configure Scrapy and the ItemPipeline to write data to various file formats. Ryan walks you through best practices for organizing your projects, writing reusable parsers, and future-proofing your spiders. She explains how APIs work and how they can be used to retrieve data directly. Ryan explores headers and cookies, then moves on to browser automation and how to integrate Selenium with Scrapy. She concludes with ideas for continuing your studies in computer science and thinking creatively about automation.
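
As a rough illustration of the "find internal links on a web page" step, here is a minimal sketch that uses only the Python standard library. It is not Scrapy's LinkExtractor and is not taken from the course; the names `LinkCollector`, `internal_links`, and `SAMPLE_HTML`, and the `example.com` URLs, are hypothetical and exist only for this example. It assumes "internal" means "same host as the page being scraped".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, base_url):
    """Return absolute same-host URLs found in html, in page order, without duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    found = []
    for href in collector.hrefs:
        # Resolve relative links such as "/about" against the page URL.
        absolute = urljoin(base_url, href)
        if urlparse(absolute).netloc == base_host and absolute not in found:
            found.append(absolute)
    return found


# Hypothetical page content used only to exercise the function.
SAMPLE_HTML = """
<a href="/about">About</a>
<a href="https://example.com/contact">Contact</a>
<a href="https://other.example.org/">Elsewhere</a>
<a href="/about">About again</a>
"""

if __name__ == "__main__":
    for url in internal_links(SAMPLE_HTML, "https://example.com/"):
        print(url)
```

A crawler built on this idea would fetch each returned URL and repeat the process, which is roughly the job Scrapy automates with its own link extraction and scheduling.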

Table of Contents

1 How to learn to stop worrying and love the bot
2 What you should know
3 What is web scraping
4 How the internet works: A brief summary
5 Hello world with Scrapy
6 Challenge: Scraping all data on a page
7 Solution: Scraping all data on a page
8 Crawling a website
9 Recording data
10 Scrapy settings file
11 Structuring your scrapers for extensibility and reusability
12 Challenge: Scraping news sites
13 Solution: Scraping news sites
14 Submitting a form
15 Finding and using hidden APIs
16 Site maps and robots.txt
17 Challenge: Using CNN’s sitemap
18 Solution: Using CNN’s sitemap
19 Logging in
20 Browser automation with Selenium
21 Interacting with a page
22 Next steps