Machine Learning

Nested CV: estimate performance AFTER tuning

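The best_score_ reported by GridSearchCV is optimistically biased: the hyperparameters were selected to maximize the score on the very folds used to compute it. In nested CV, an outer loop scores the whole procedure, tuning included, on data the search never saw, which gives an approximately unbiased estimate of its performance.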
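The optimism comes from taking the maximum over several noisy estimates. The simulation below is only a minimal sketch of that effect, not a real tuning run: it assumes every candidate has the same true AUC, and `true_auc`, `noise_sd`, and `n_candidates` are illustrative values, not measurements. It uses NumPy, which is not listed in the prerequisites.

```python
import numpy as np

rng = np.random.default_rng(0)

true_auc = 0.80       # assumption: every candidate has the same true score
noise_sd = 0.02       # assumption: sampling noise of one CV estimate
n_candidates = 9      # e.g. a 3 x 3 hyperparameter grid
n_repeats = 10_000    # number of simulated tuning runs

# Each row is one tuning run: one noisy CV estimate per candidate.
cv_estimates = true_auc + noise_sd * rng.standard_normal((n_repeats, n_candidates))

# A grid search reports the score of the winning candidate,
# i.e. the row-wise maximum of the noisy estimates.
reported_best = cv_estimates.max(axis=1)

optimism = reported_best.mean() - true_auc
print(f"true AUC           : {true_auc:.3f}")
print(f"mean reported best : {reported_best.mean():.3f}")
print(f"optimism           : {optimism:+.3f}")
```

Even though no candidate is actually better than the others, the reported best score sits above the true value on average. An outer loop removes this effect because the winner is scored on data that played no part in choosing it.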

Prerequisites

scikit-learn

Python
from sklearn.model_selection import GridSearchCV, cross_val_score, KFold
from sklearn.ensemble import RandomForestClassifier

inner = KFold(n_splits=3, shuffle=True, random_state=1)   # tuning
outer = KFold(n_splits=5, shuffle=True, random_state=2)   # evaluation

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    {"max_depth": [4, 8, None], "min_samples_leaf": [1, 5, 20]},
    cv=inner, scoring="roc_auc", n_jobs=-1,
)

# X, y: feature matrix and binary labels, assumed to be loaded already
# The outer loop is used ONLY to evaluate the procedure (tuning included)
scores = cross_val_score(search, X, y, cv=outer, scoring="roc_auc")
print(f"Nested AUC (unbiased)  : {scores.mean():.3f} +/- {scores.std():.3f}")

# The inner score, by contrast, is typically higher (optimistic):
search.fit(X, y)
print(f"Inner tuning AUC       : {search.best_score_:.3f}")

Result

Nested AUC (unbiased)  : 0.842 +/- 0.018
Inner tuning AUC       : 0.871
>>> search.best_params_
{'max_depth': 8, 'min_samples_leaf': 5}
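To make the outer loop explicit, here is a rough sketch of what passing a GridSearchCV object to cross_val_score amounts to: on each outer fold, a fresh copy of the search is tuned on the training part only and then scored on the held-out part. The synthetic data from make_classification, the names `X_demo` and `y_demo`, and the smaller grid and forest are assumptions chosen to keep the sketch quick to run; they are not the data or settings behind the results above.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data (assumption: not the dataset used above)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

inner = KFold(n_splits=3, shuffle=True, random_state=1)   # tuning
outer = KFold(n_splits=5, shuffle=True, random_state=2)   # evaluation

search = GridSearchCV(
    RandomForestClassifier(n_estimators=25, random_state=42),
    {"max_depth": [4, None], "min_samples_leaf": [1, 20]},
    cv=inner, scoring="roc_auc",
)

outer_scores, chosen_params = [], []
for train_idx, test_idx in outer.split(X_demo, y_demo):
    # Tune on the outer training part only; the test part stays unseen
    fold_search = clone(search)
    fold_search.fit(X_demo[train_idx], y_demo[train_idx])

    # Score the refitted best model on the held-out outer fold
    proba = fold_search.predict_proba(X_demo[test_idx])[:, 1]
    outer_scores.append(roc_auc_score(y_demo[test_idx], proba))
    chosen_params.append(fold_search.best_params_)

outer_scores = np.array(outer_scores)
print(f"Nested AUC : {outer_scores.mean():.3f} +/- {outer_scores.std():.3f}")
for i, params in enumerate(chosen_params, start=1):
    print(f"outer fold {i}: {params}")
```

The selected hyperparameters may differ from one outer fold to the next. That is expected: nested CV scores the tuning procedure as a whole, not one fixed set of hyperparameters.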
