Reading companion

These chapters follow the same story as the lecture decks, but at a slower pace. Use them before class, after class, or when you miss a session and need the ideas explained in plain language.

We keep two example tracks across the week: simulated genes (where we know the ground truth) and Palmer Penguins (real measurements, and messier as a result). For context on every dataset, use the data catalog.

If you want the overall narrative first, read the curriculum outline. If you want the long coding versions, jump to the student notebooks.

| Chapter | Teaching day | What you will understand | Open |
| --- | --- | --- | --- |
| Chapter 1 — Learning from measurements | Day 1 (Mon) | Why train/validation/test splits matter and how to avoid leakage | Open |
| Chapter 2 — When coefficients need restraint | Day 1 (Mon) | Ridge, lasso, and elastic net in plain language | Open |
| Chapter 3 — Rules and trees | Day 2 (Tue) | How tree models split data and why the “two cultures” view helps | Open |
| Chapter 4 — A reproducible modeling workflow | Day 2 (Tue) | How `recipe`, `workflow`, `last_fit`, and tuning fit together | Open |
| Chapter 5 — Stronger learners, same discipline | Day 4 (Thu) | How random forests, boosting, and MLPs fit into the same pipeline | Open |
| Chapter 6 — Scores that match the question | Day 4 (Thu) | How to choose metrics under imbalance and read SHAP cautiously | Open |
| Chapter 7 — Choosing what to optimize | Day 4 / lab | Why different tuning metrics pick different “best” settings | Open |
| Chapter 8 — Comparing learners fairly | Day 4 / lab | How to compare model families fairly with fixed folds and a fixed recipe | Open |

Chapter 1 — Learning from measurements

This chapter sets up the full course: prediction as a disciplined process, not just fitting a line. You will see the same logic on simulated genes and penguins.

Open Chapter 1 →

Chapter 2 — When coefficients need restraint

This chapter explains why regularization helps when predictors are correlated or numerous, and how ridge, lasso, and elastic net differ in practice.
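
As a rough reference, the three penalties can be written side by side. The notation here is an assumption for illustration (coefficients $\beta_j$, penalty strength $\lambda \ge 0$, mixing weight $\alpha \in [0, 1]$) and may not match the chapter's own. Software packages also scale these terms slightly differently.

$$
\begin{aligned}
\text{ridge:} \quad & \min_{\beta} \sum_{i=1}^{n} \left(y_i - x_i^\top \beta\right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \\
\text{lasso:} \quad & \min_{\beta} \sum_{i=1}^{n} \left(y_i - x_i^\top \beta\right)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \\
\text{elastic net:} \quad & \min_{\beta} \sum_{i=1}^{n} \left(y_i - x_i^\top \beta\right)^2 + \lambda \left[ \alpha \sum_{j=1}^{p} \lvert \beta_j \rvert + (1 - \alpha) \sum_{j=1}^{p} \beta_j^2 \right]
\end{aligned}
$$

The squared penalty shrinks all coefficients toward zero but rarely to exactly zero. The absolute-value penalty can set some coefficients exactly to zero. Elastic net blends the two: in this parameterization it reduces to lasso at $\alpha = 1$ and to ridge at $\alpha = 0$.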

Open Chapter 2 →

Chapter 3 — Rules and trees

This chapter introduces decision trees as nested if-then rules and places them in the broader “two cultures” view: assumptions-first versus prediction-first.

Open Chapter 3 →

Chapter 4 — A reproducible modeling workflow

This chapter turns modeling into a repeatable workflow: one honest train/test example first, then the same tuned tree pipeline used in the Day 2 slides.

Open Chapter 4 →

Chapter 5 — Stronger learners, same discipline

This chapter explains what changes and what stays fixed when you move from single trees to random forests, boosting, and small neural nets.

Open Chapter 5 →

Chapter 6 — Scores that match the question

This chapter focuses on decision-aware evaluation: choosing metrics for the question at hand, handling class imbalance, and reading SHAP values cautiously, without treating them as causal effects.

Open Chapter 6 →

Chapter 7 — Choosing what to optimize

This extension chapter shows how metric choice changes tuning outcomes, even when you keep data, folds, and model family fixed.
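
The idea can be seen in a toy sketch. This is plain Python rather than the course's own tooling, and everything in it is made up for illustration: the labels, the scores, and the use of a decision threshold as a stand-in for a tuned setting. The data stay fixed; only the metric used to pick the "best" threshold changes.

```python
# Toy illustration only: labels, scores, and thresholds are invented.

def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(labels, scores, threshold):
    """Share of all rows classified correctly."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    return (tp + tn) / (tp + fp + tn + fn)


def recall(labels, scores, threshold):
    """Share of true positives that were found."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Fixed, imbalanced data: 7 negatives followed by 3 positives.
labels = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.2, 0.3, 0.35, 0.55, 0.65, 0.4, 0.75, 0.9]
thresholds = [0.3, 0.5, 0.7]

# Same data, same candidates; only the selection metric differs.
best_by_accuracy = max(thresholds, key=lambda t: accuracy(labels, scores, t))
best_by_recall = max(thresholds, key=lambda t: recall(labels, scores, t))

print("best threshold by accuracy:", best_by_accuracy)
print("best threshold by recall:", best_by_recall)
```

On this toy data, accuracy favors the strictest threshold because it avoids false positives among the many negatives, while recall favors the loosest one because it catches every positive.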

Open Chapter 7 →

Chapter 8 — Comparing learners fairly

This extension chapter compares three stronger learners fairly by holding preprocessing and folds fixed and changing only the model engine.
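
A minimal sketch of the "fixed folds" idea, in plain Python rather than the course's own tooling. The data, the two learners, and all function names are hypothetical, and there is no preprocessing step, so that part is trivially held fixed. The point is only that the folds are built once and every learner is scored on exactly the same held-out rows.

```python
# Toy illustration only: data, learners, and helper names are invented.

def make_folds(n_rows, n_folds):
    """Assign row indices to folds once, so every learner reuses the same split."""
    return [list(range(i, n_rows, n_folds)) for i in range(n_folds)]


def fit_mean(train_x, train_y):
    """Baseline learner: always predict the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


def fit_nearest_neighbor(train_x, train_y):
    """1-nearest-neighbour learner: predict the outcome of the closest training x."""
    def predict(x):
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[nearest]
    return predict


def cross_validated_mae(fit, xs, ys, folds):
    """Mean absolute error over the held-out rows of each fold."""
    errors = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors.extend(abs(model(xs[i]) - ys[i]) for i in held_out)
    return sum(errors) / len(errors)


xs = [float(i) for i in range(20)]
ys = [2.0 * x for x in xs]

# Build the folds once, then pass the same object to every learner.
folds = make_folds(len(xs), n_folds=4)

results = {
    "mean baseline": cross_validated_mae(fit_mean, xs, ys, folds),
    "nearest neighbour": cross_validated_mae(fit_nearest_neighbor, xs, ys, folds),
}
for name, mae in results.items():
    print(f"{name}: MAE = {mae:.2f}")
```

Because both learners see identical training and held-out rows, any difference in the scores comes from the learner itself and not from a luckier split.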

Open Chapter 8 →