Introduction
PART I: NEURAL NETWORKS AND DEEP LEARNING
Chapter 1: Embeddings, Representations, and Latent Space
Chapter 2: Self-Supervised Learning
Chapter 3: Few-Shot Learning
Chapter 4: The Lottery Ticket Hypothesis
Chapter 5: Reducing Overfitting with Data
Chapter 6: Reducing Overfitting with Model Modifications
Chapter 7: Multi-GPU Training Paradigms
Chapter 8: The Keys to the Success of Transformers
Chapter 9: Generative AI Models
Chapter 10: Sources of Randomness
PART II: COMPUTER VISION
Chapter 11: Calculating the Number of Parameters
Chapter 12: The Equivalence of Fully Connected and Convolutional Layers
Chapter 13: Large Training Sets for Vision Transformers
PART III: NATURAL LANGUAGE PROCESSING
Chapter 14: The Distributional Hypothesis
Chapter 15: Data Augmentation for Text
Chapter 16: “Self”-Attention
Chapter 17: Encoder- and Decoder-Style Transformers
Chapter 18: Using and Finetuning Pretrained Transformers
Chapter 19: Evaluating Generative Large Language Models
PART IV: PRODUCTION AND DEPLOYMENT
Chapter 20: Stateless and Stateful Training
Chapter 21: Data-Centric AI
Chapter 22: Speeding Up Inference
Chapter 23: Data Distribution Shifts
PART V: PREDICTIVE PERFORMANCE AND MODEL EVALUATION
Chapter 24: Poisson and Ordinal Regression
Chapter 25: Confidence Intervals
Chapter 26: Confidence Intervals Versus Conformal Predictions
Chapter 27: Proper Metrics
Chapter 28: The K in K-Fold Cross-Validation
Chapter 29: Training and Test Set Discordance
Chapter 30: Limited Labeled Data
Afterword
Appendix: Answers to Exercises
Index