Adversarial validation: are train and test comparable?
Train a classifier to distinguish training rows from production rows. An AUC near 0.5 means it cannot tell them apart, so the two distributions are similar; above 0.7 the drift is serious, and the classifier's most important features point to its source.
Prerequisites
scikit-learn, numpy, pandas
Python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# X_train and X_prod are assumed to be DataFrames with the same numeric columns.
# Stack them and label each row by origin: 0 = train, 1 = production.
X_adv = pd.concat([X_train, X_prod], ignore_index=True)
y_adv = np.r_[np.zeros(len(X_train)), np.ones(len(X_prod))]

clf = RandomForestClassifier(n_estimators=200, max_depth=6,
                             n_jobs=-1, random_state=42)
auc = cross_val_score(clf, X_adv, y_adv, cv=5, scoring="roc_auc").mean()
print(f"Adversarial AUC: {auc:.3f}")
print("~0.50 = similar distributions | >0.70 = serious drift")

# Above 0.6, refit on all rows to see which features separate the two sets.
if auc > 0.6:
    clf.fit(X_adv, y_adv)
    imp = pd.Series(clf.feature_importances_, index=X_adv.columns)
    print("Features that give away the period/source:")
    print(imp.sort_values(ascending=False).head(5).round(3))
Result
Adversarial AUC: 0.731
~0.50 = similar distributions | >0.70 = serious drift
Features that give away the period/source:
montant        0.412
delai_jours    0.218
solde_moyen    0.097
age            0.054
nb_produits    0.041
dtype: float64
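The snippet above expects X_train and X_prod to exist already. As a rough way to see how the AUC behaves, here is a minimal self-contained sketch on synthetic data. The helper names adversarial_auc and make_frame, the two-column layout, and the size of the shift are all assumptions made for illustration; they are not part of the original snippet. It also shuffles the folds with StratifiedKFold, which is a choice made here rather than something the original does.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score


def adversarial_auc(X_a, X_b, seed=0):
    """Cross-validated AUC of a classifier separating rows of X_a from rows of X_b."""
    X = pd.concat([X_a, X_b], ignore_index=True)
    # 0 = row comes from X_a, 1 = row comes from X_b
    y = np.r_[np.zeros(len(X_a), dtype=int), np.ones(len(X_b), dtype=int)]
    clf = RandomForestClassifier(n_estimators=100, max_depth=6, random_state=seed)
    # Shuffled stratified folds, so each fold mixes rows from both sources.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return cross_val_score(clf, X, y, cv=cv, scoring="roc_auc").mean()


def make_frame(rng, n, amount_shift=0.0):
    """Hypothetical two-column dataset; only 'montant' can be shifted."""
    return pd.DataFrame({
        "montant": rng.normal(100.0 + amount_shift, 20.0, n),
        "age": rng.normal(40.0, 10.0, n),
    })


rng = np.random.default_rng(0)
X_train = make_frame(rng, 1000)
X_same = make_frame(rng, 1000)                      # same distribution as train
X_drift = make_frame(rng, 1000, amount_shift=40.0)  # 'montant' mean moved by 2 std

auc_same = adversarial_auc(X_train, X_same)
auc_drift = adversarial_auc(X_train, X_drift)
print(f"no drift:   {auc_same:.3f}")
print(f"with drift: {auc_drift:.3f}")
```

Under these assumptions the first AUC should sit close to 0.5 and the second well above 0.7, which is the contrast the thresholds are meant to capture.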
Adversarial validation, Drift, Diagnostic, Distribution