Machine Learning

Machine Learning in the real world

The notebook of an ML practitioner who learned to distrust their own numbers. Three themes run through it: hunting down data leakage and inflated scores, statistical rigor in model comparison (paired bootstrap, 5x2 cross-validation, McNemar's test), and the leap from an academic score to a business decision (cost-based thresholds, audited calibration). We show both the buggy code and the corrected code, and always quantify the gap between them.
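To give a flavor of the model-comparison theme, here is a minimal sketch of an exact McNemar test on paired predictions. It is not taken from the snippets themselves: the function name `mcnemar_exact` and the ten-example toy labels are assumptions made up for illustration, and only the Python standard library is used.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test (hypothetical helper, illustration only).

    b = cases where model A is right and model B is wrong
    c = cases where model A is wrong and model B is right
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    rows = list(zip(y_true, pred_a, pred_b))
    b = sum(1 for t, a, m in rows if a == t and m != t)
    c = sum(1 for t, a, m in rows if a != t and m == t)
    n = b + c
    if n == 0:
        # The two models never disagree on correctness: no evidence either way.
        return b, c, 1.0
    # Probability of a split at least as lopsided as the observed one, doubled.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data, invented for this sketch.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
pred_a = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]  # 6 of 10 correct
pred_b = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # 9 of 10 correct

b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(f"b={b}, c={c}, p={p_value:.3f}")
```

On this toy set, model B leads by 30 accuracy points, yet the two models disagree on only five examples, so the test cannot separate them at conventional significance levels. That is the kind of gap between a headline score and the evidence behind it that the notebook sets out to measure.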

20 featured snippets
