Reading companion

These chapters follow the same story as the lecture decks, but at a slower pace. Use them before class, after class, or when you miss a session and need the ideas explained in plain language.

We keep two example tracks across the week: simulated genes (where we know the ground truth) and Palmer Penguins (real measurements that are messier and more realistic). For context on every dataset, use the data catalog.

If you want the overall narrative first, read the curriculum outline. If you want the long coding versions, jump to the student notebooks.

| Chapter | Teaching day | What you will understand |
| --- | --- | --- |
| Chapter 1 — Learning from measurements | Day 1 (Mon) | Why train/validation/test splits matter and how to avoid leakage |
| Chapter 2 — When coefficients need restraint | Day 1 (Mon) | Ridge, lasso, and elastic net in plain language |
| Chapter 3 — Rules and trees | Day 2 (Tue) | How tree models split data and why the “two cultures” view helps |
| Chapter 4 — A reproducible modeling workflow | Day 2 (Tue) | How recipe, workflow, last_fit, and tuning fit together |
| Chapter 5 — Stronger learners, same discipline | Day 4 (Thu) | How random forests, boosting, and MLPs fit into the same pipeline |
| Chapter 6 — Scores that match the question | Day 4 (Thu) | How to choose metrics under imbalance and read SHAP cautiously |
| Chapter 7 — Choosing what to optimize | Day 4 / lab | Why different tuning metrics pick different “best” settings |
| Chapter 8 — Comparing learners fairly | Day 4 / lab | How to compare model families fairly with fixed folds and a fixed recipe |

Chapter 1 — Learning from measurements

This chapter sets up the full course: prediction as a disciplined process, not just fitting a line. You will see the same logic on simulated genes and penguins.
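
As a rough illustration of the split-then-preprocess discipline this chapter describes, here is a minimal sketch in plain Python (standard library only). It is not the course's own code: the helper name `three_way_split`, the 60/20/20 proportions, and the numbers are all assumptions made up for this example. The point it shows is that centering and scaling are estimated from the training rows only and then reused for validation and test, so held-out rows never influence preprocessing.

```python
import random
import statistics

def three_way_split(rows, seed=0, frac_train=0.6, frac_val=0.2):
    """Shuffle once with a fixed seed, then cut into train / validation / test."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * frac_train)
    n_val = round(len(shuffled) * frac_val)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains after train and validation
    return train, val, test

# Hypothetical measurements (made-up numbers, not a course dataset).
values = [float(v) for v in range(1, 21)]
train, val, test = three_way_split(values, seed=42)

# Estimate the scaling parameters from the training partition only.
mu = statistics.mean(train)
sigma = statistics.stdev(train)

def standardize(xs):
    """Apply the training-set mean and standard deviation to any partition."""
    return [(x - mu) / sigma for x in xs]

train_z = standardize(train)
val_z = standardize(val)
test_z = standardize(test)
```

Computing `mu` and `sigma` from all 20 values instead would be a small example of leakage: information from the validation and test rows would shape the inputs the model is trained on.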

Open Chapter 1 →


Chapter 2 — When coefficients need restraint

This chapter explains why regularization helps when predictors are correlated or numerous, and how ridge, lasso, and elastic net differ in practice.

Open Chapter 2 →


Chapter 3 — Rules and trees

This chapter introduces decision trees as nested if-then rules and places them in the broader “two cultures” view: assumptions-first versus prediction-first.

Open Chapter 3 →


Chapter 4 — A reproducible modeling workflow

This chapter turns modeling into a repeatable workflow: one honest train/test example first, then the same tuned tree pipeline used in the Day 2 slides.

Open Chapter 4 →


Chapter 5 — Stronger learners, same discipline

This chapter explains what changes and what stays fixed when you move from single trees to random forests, boosting, and small neural nets.

Open Chapter 5 →


Chapter 6 — Scores that match the question

This chapter focuses on decision-aware evaluation: choosing metrics for the question at hand, handling class imbalance, and reading SHAP values cautiously, without treating them as causal effects.

Open Chapter 6 →


Chapter 7 — Choosing what to optimize

This extension chapter shows how metric choice changes tuning outcomes, even when you keep data, folds, and model family fixed.
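
To make the idea concrete, here is a toy sketch in plain Python rather than the course's own tooling. The scores, labels, and candidate thresholds are invented for illustration, and a decision threshold stands in for a tuning parameter; there is no resampling here. The data and the candidates stay fixed, and only the metric used to pick the winner changes.

```python
# Hypothetical scored examples: (model score, true label), where 1 is the rare class.
scored = [
    (0.90, 1), (0.40, 1),
    (0.60, 0), (0.50, 0), (0.30, 0), (0.20, 0),
    (0.20, 0), (0.10, 0), (0.10, 0), (0.05, 0),
]
candidate_thresholds = [0.35, 0.55, 0.80]

def confusion(threshold):
    """Count true/false positives and negatives when predicting 1 for score >= threshold."""
    tp = fp = fn = tn = 0
    for score, label in scored:
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def accuracy(threshold):
    tp, fp, fn, tn = confusion(threshold)
    return (tp + tn) / len(scored)

def recall(threshold):
    tp, fp, fn, tn = confusion(threshold)
    return tp / (tp + fn)

# Same data, same candidates; only the selection metric differs.
best_by_accuracy = max(candidate_thresholds, key=accuracy)
best_by_recall = max(candidate_thresholds, key=recall)
print(best_by_accuracy, best_by_recall)
```

With these made-up numbers, accuracy favors the strictest threshold (0.80), because it rewards getting the many negatives right, while recall favors the loosest one (0.35), because that is the only candidate that catches both positives.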

Open Chapter 7 →


Chapter 8 — Comparing learners fairly

This extension chapter compares three stronger learners fairly by holding preprocessing and folds fixed and changing only the model engine.
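
The "hold the folds fixed" part can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the chapter's code: `make_folds` and `cv_mae` are hypothetical helpers, the outcome values are invented, and two constant predictors (mean and median) stand in for the stronger learners the chapter actually compares.

```python
import random
import statistics

def make_folds(n_rows, k, seed):
    """Assign row indices to k folds once, so every model is scored on the same splits."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cv_mae(fit, y, folds):
    """Cross-validated mean absolute error for a model that predicts one constant.

    `fit` takes the training outcomes and returns the constant prediction.
    """
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [y[i] for i in range(len(y)) if i not in held]
        prediction = fit(train)
        errors = [abs(y[i] - prediction) for i in held_out]
        fold_errors.append(statistics.mean(errors))
    return statistics.mean(fold_errors)

# Hypothetical outcome values (made up, with one outlier).
y = [3.0, 4.0, 5.0, 4.0, 6.0, 5.0, 4.0, 30.0, 5.0, 4.0, 3.0, 6.0]

# Build the folds once and reuse the same object for every candidate model.
folds = make_folds(len(y), k=4, seed=1)
results = {
    "mean": cv_mae(statistics.mean, y, folds),
    "median": cv_mae(statistics.median, y, folds),
}
print(results)
```

Because both candidates are scored on identical held-out rows, any difference in `results` comes from the model and not from a luckier or unluckier split.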

Open Chapter 8 →