Modern Computer Vision with PyTorch: Concepts and hands-on Implementations of over 50 real-world image applications of deep learning

English | 2020 | ISBN: 978-1839213472 | 647 Pages | PDF, EPUB, MOBI | 417 MB


This book is packed with hands-on implementations of deep learning techniques for building image processing applications with PyTorch. Each chapter is accompanied by a GitHub folder with code notebooks and questions to cement your understanding.
Deep learning for computer vision (CV) has had a considerable positive impact on several applications.
First, you will learn to implement a neural network (NN) from scratch using both NumPy and PyTorch, and then learn best practices for tuning an NN’s hyperparameters.
As you progress, you will learn about CNNs and transfer learning, with a focus on classifying images. You will also learn about the practical aspects to take care of when building an NN model.
Next, you will learn about multi-object detection and segmentation, and implement them using the R-CNN family, SSD, YOLO, U-Net, and Mask R-CNN architectures. You will then learn to use the Detectron2 framework to simplify the process of building an NN for object detection and human pose estimation. Finally, you will implement 3D object detection.
Subsequently, you will learn about autoencoders and GANs, with a strong focus on image manipulation and generation. Here, you will implement VAE, DCGAN, CGAN, Pix2Pix, CycleGAN, StyleGAN2, SRGAN, and style transfer.
You will then learn to combine NLP and CV techniques to perform OCR, image captioning, and object detection with transformers. Next, you will learn to combine RL with CV techniques to implement a self-driving car agent.
Finally, you’ll wrap up by moving an NN model to production and learning conventional CV techniques using the OpenCV library.
What you will learn

  • Train a neural network from scratch in NumPy and then in PyTorch
  • Implement 2D and 3D multi-object detection and segmentation
  • Generate digits, deepfakes, and HD faces with autoencoders and advanced GANs
  • Manipulate images using CycleGAN, Pix2PixGAN, StyleGAN2, and SRGAN
  • Combine CV and NLP to perform OCR, image captioning, and object detection
  • Combine CV and RL to build agents that play Pong and self-drive a car
  • Deploy a deep learning model on an AWS server using FastAPI and Docker
  • Dive deep and implement over 35 NN architectures and common OpenCV utilities