Scrape Websites using PhantomJS and CasperJS

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 2 Hours | 718 MB

Become a Better JavaScript Developer and Learn to Mine Data from Websites. Use NodeJS, jQuery and Functional Programming

In this course you will learn how to scrape data from web pages using CasperJS.

This course consists of 5 example projects to help you fully understand the powers of the headless browser using the CasperJS API.

You Will Learn

You will gain a thorough understanding of advanced web scraping concepts, along with insight into how to use CasperJS for testing DOM manipulation and UI interaction.

What to Expect

  • We’ll begin with an overview of how PhantomJS and CasperJS work, along with how to install both.
  • Next, we’ll discuss what our workflow will look like and the options we can pass into a Casper object.
  • Then we’ll dive into the meat of this course by working through 5 projects.
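
As a rough illustration of the options and workflow mentioned above, here is a minimal sketch of a CasperJS script. The option values and the example.com URL are placeholders chosen for illustration, not settings taken from the course. The `typeof phantom` guard is only there so the file can also be loaded outside PhantomJS without failing.

```javascript
// Hypothetical settings; tune them for the site you are scraping.
var options = {
  verbose: true,                 // print log messages to the console
  logLevel: 'info',              // 'debug', 'info', 'warning' or 'error'
  waitTimeout: 10000,            // default timeout for wait* steps, in ms
  viewportSize: { width: 1280, height: 720 },
  pageSettings: {
    loadImages: false,           // skip images to speed up page loads
    loadPlugins: false
  }
};

// Only run the browser steps when executed by CasperJS/PhantomJS.
if (typeof phantom !== 'undefined') {
  var casper = require('casper').create(options);

  // Step 1: open the page (placeholder URL).
  casper.start('https://example.com/', function () {
    this.echo('Loaded: ' + this.getTitle());
  });

  // Step 2: further steps are queued with then() and run in order.
  casper.then(function () {
    this.echo('Current URL: ' + this.getCurrentUrl());
  });

  // Nothing executes until run() is called.
  casper.run(function () {
    this.exit();
  });
}
```

Saved as, say, `scrape.js` (a hypothetical file name), the script would be started with `casperjs scrape.js`.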

The Projects Will Cover

  • How to scrape websites whose content is rendered by JavaScript rather than served as static HTML
  • How to wait for AJAX-loaded data to appear before scraping elements
  • How to submit forms, both for logging in and for running searches
  • How to define navigation steps – like logging into a site, clicking a button and following links
  • How to collect selected data in tables, then save the output as an .html file or as JSON
  • How to take screenshots of both full web pages and specific containers
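
Several of the techniques listed above can be combined in one short script. The following is only a sketch under stated assumptions: the URL, the form and CSS selectors, the `{ name, price }` row shape, and the output file names are all hypothetical, and the helpers `escapeHtml` and `toHtmlTable` are made up for this example. The CasperJS part runs only under PhantomJS; the table helper is plain JavaScript.

```javascript
// Escape text so scraped values cannot break the generated markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Turn scraped rows into a minimal HTML table (hypothetical row shape).
function toHtmlTable(rows) {
  var body = rows.map(function (row) {
    return '<tr><td>' + escapeHtml(row.name) + '</td><td>' +
      escapeHtml(row.price) + '</td></tr>';
  }).join('');
  return '<table><tr><th>Name</th><th>Price</th></tr>' + body + '</table>';
}

// Runs only under CasperJS/PhantomJS; URL and selectors are placeholders.
if (typeof phantom !== 'undefined') {
  var fs = require('fs');  // PhantomJS file system module
  var casper = require('casper').create({ waitTimeout: 10000 });
  var rows = [];

  casper.start('https://example.com/search');

  // Fill the form fields by name; `true` submits the form afterwards.
  casper.then(function () {
    this.fill('form#search', { q: 'hotels' }, true);
  });

  // Wait for the AJAX-loaded results before scraping them.
  casper.waitForSelector('.result', function () {
    rows = this.evaluate(function () {
      // This function runs inside the page, so only DOM APIs exist here.
      return Array.prototype.map.call(
        document.querySelectorAll('.result'),
        function (el) {
          return {
            name: el.querySelector('.name').textContent,
            price: el.querySelector('.price').textContent
          };
        }
      );
    }) || [];
    this.capture('page.png');                         // full page
    this.captureSelector('results.png', '#results');  // one container
  });

  // Save the scraped rows as JSON and as an HTML table.
  casper.run(function () {
    fs.write('results.json', JSON.stringify(rows, null, 2), 'w');
    fs.write('results.html', toHtmlTable(rows), 'w');
    this.exit();
  });
}
```

Keeping the table builder separate from the browser steps means it can be checked on its own with a few sample rows before the scraper is pointed at a live site.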

What is PhantomJS?

PhantomJS is a headless WebKit browser that can be scripted with JavaScript. Phantom gives us the power to perform many interesting actions on a web page, such as manipulating the page, simulating user interaction, and dynamically capturing and saving website data.

What is CasperJS?

CasperJS is a stand-alone framework built on top of PhantomJS and is compatible with most operating systems. The focus of this course will be on the Casper API, and we’ll be using this API to write all our web scraping scripts.

What You Should Know

You should already know the basics of JavaScript, including what a callback function is. It will help if you know some jQuery. We use lodash in our examples, but only as a replacement for the built-in map method that’s part of the native JavaScript API.
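
If callbacks and `map` are unfamiliar, the short sketch below shows the idea using the native `Array.prototype.map`. The `hotels` data and its field names are invented for illustration and do not come from the course.

```javascript
// Hypothetical scraped rows; the field names are made up for illustration.
var hotels = [
  { name: 'Hotel A', price: '$120' },
  { name: 'Hotel B', price: '$95' }
];

// map() calls the callback once per element and collects the return values.
var names = hotels.map(function (hotel) {
  return hotel.name;
});

// Turn a price string such as "$120" into the number 120.
var prices = hotels.map(function (hotel) {
  return parseFloat(hotel.price.replace('$', ''));
});

// With lodash loaded, _.map(hotels, callback) would give the same arrays.
console.log(names);
console.log(prices);
```

The function passed to `map` is the callback: it is handed each element in turn, and whatever it returns becomes the corresponding entry in the new array.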

By the End You Will

  • Know How to Use JavaScript for Data Mining
  • Be Able to Capture, Download and Save Website Data
  • Understand How to Use CasperJS and PhantomJS
  • Be Able to Create Your Own Scripts for Scraping Data
  • Be Able to Apply What You’ve Learned to UI Testing
  • Have a Better Understanding of Functional Programming
  • Fully Understand JavaScript and jQuery Selectors

Table of Contents

1 Introduction
Intro and Projects Overview
Disclaimer

2 Overview and Install
What is PhantomJS
What is CasperJS
Installing PhantomJS
Installing CasperJS

3 Getting Started with CasperJS
Setting Up a Project
Options and Workflow

4 Scraping Search Results
Introduction to Project
Get Results from Bing.com
Get Results from Bing.com – Part 2

5 Scraping JavaScript Rendered Web Pages
Introduction to Project
Scraping JS-Rendered – Part 1
Scraping JS-Rendered – Part 2
Scraping JS-Rendered – Part 3
Scraping JS-Rendered – Part 4

6 Scraping Hotel Data
Introduction to Project
Project Setup
Get Names and Prices – Part 1
Get Names and Prices – Part 2

7 Scrape and Capture Multiple Pages
Introduction to Project
Project Setup
Scrape Product Reviews Page – Part 1
Scrape Product Reviews Page – Part 2
Scrape Product Reviews Page – Part 3
Scrape Product Reviews Page – Part 4

8 Log In and Search
Introduction to Project
Twitter Log In Search
Twitter Log In Search – Part 2

9 Conclusion
Extras and Tips
Thank You
Bonus Lecture