Machine Learning

Monotonicity constraints: inject domain knowledge

Forcing the model to respect known relationships (for example, more debt never lowers risk) gives you regularization at no extra cost, robustness to noise, and a model you can defend before a committee.
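A positional tuple of constraints is easy to misalign with the feature columns. As a minimal sketch, the known relationships can instead be declared by name and converted to a tuple in column order. The helper `build_constraints` and the `directions` mapping are hypothetical names introduced for illustration and are not part of xgboost; the column names and signs are the ones used in the snippet further down this page.

```python
def build_constraints(columns, directions):
    """Return a tuple of monotonicity signs aligned with `columns`.

    1 = increasing, -1 = decreasing, 0 = unconstrained (the default
    for any column missing from `directions`).
    """
    unknown = set(directions) - set(columns)
    if unknown:
        raise KeyError(f"constraints given for unknown columns: {sorted(unknown)}")
    bad = {c: s for c, s in directions.items() if s not in (-1, 0, 1)}
    if bad:
        raise ValueError(f"signs must be -1, 0 or 1, got: {bad}")
    return tuple(directions.get(col, 0) for col in columns)


colonnes = ["taux_endettement", "anciennete", "revenu", "nb_incidents"]
directions = {
    "taux_endettement": 1,   # more debt never lowers risk
    "anciennete": -1,
    "revenu": -1,
    "nb_incidents": 1,
}
contraintes = build_constraints(colonnes, directions)
print(contraintes)  # (1, -1, -1, 1)
```

Declaring the signs by name means that reordering or adding columns cannot silently shift a constraint onto the wrong feature, and a typo in a column name fails loudly instead of being ignored.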

Prerequisites

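xgboost 1.6 or later (early_stopping_rounds and eval_metric are constructor arguments from that version on), pandas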

Python
import numpy as np
import xgboost as xgb

# 1 = increasing, -1 = decreasing, 0 = unconstrained (same order as the columns of X)
colonnes = ["taux_endettement", "anciennete", "revenu", "nb_incidents"]
contraintes = (1, -1, -1, 1)

model = xgb.XGBClassifier(
    n_estimators=800, learning_rate=0.05, max_depth=4,
    monotone_constraints=contraintes,
    early_stopping_rounds=50, eval_metric="auc", random_state=42,
)
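# X_train / X_val: pandas DataFrames holding the columns above; y_train / y_val:
# binary labels (all assumed to be prepared upstream)
model.fit(X_train[colonnes], y_train,
          eval_set=[(X_val[colonnes], y_val)], verbose=False)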

# Empirical check: raising a feature constrained to +1 must never lower any row's score
base = X_val[colonnes].copy()
haut = base.copy(); haut["taux_endettement"] *= 1.5
delta = (model.predict_proba(haut)[:, 1]
         - model.predict_proba(base)[:, 1])
print(f"share of scores that decrease: {float((delta < 0).mean()):.1%} "
      f"(expected: 0%)")

Result

share of scores that decrease: 0.0% (expected: 0%)
>>> round(float(delta.mean()), 3), round(float(delta.max()), 3)
(0.041, 0.187)
>>> model.best_score   # validation AUC, with the constraints applied
0.8312
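
The check above perturbs one feature by a single factor. A slightly broader check, sketched below, sweeps one column over a grid of values and measures how often a row's score moves against the expected direction between consecutive grid points. The function `check_monotone` and its parameters are hypothetical names introduced for illustration. It accepts any scoring callable, so the toy functions here merely stand in for a fitted model; wiring it to the model above (for instance through `model.predict_proba(...)[:, 1]`) is an assumption left to the reader.

```python
import numpy as np


def check_monotone(score_fn, X, col, grid, sign, tol=1e-9):
    """Share of (grid step, row) pairs that violate the expected direction.

    score_fn: callable mapping a 2-D array to a 1-D array of scores
    X:        2-D float array of reference rows
    col:      index of the column to sweep
    grid:     increasing values substituted into that column
    sign:     1 if scores must not decrease, -1 if they must not increase
    """
    X = np.asarray(X, dtype=float)
    scores = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, col] = value  # every row gets the same swept value
        scores.append(np.asarray(score_fn(X_mod), dtype=float))
    steps = np.diff(np.vstack(scores), axis=0)  # shape: (len(grid) - 1, n_rows)
    violations = sign * steps < -tol
    return float(violations.mean())


def monotone_score(X):
    # Logistic score that increases with column 0
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - X[:, 1])))


def wiggly_score(X):
    # Deliberately non-monotone in column 0
    return np.sin(6.0 * X[:, 0])


rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, size=(200, 4))
grid = np.linspace(0.0, 1.0, 11)

share_ok = check_monotone(monotone_score, X_demo, col=0, grid=grid, sign=1)
share_bad = check_monotone(wiggly_score, X_demo, col=0, grid=grid, sign=1)
print(f"monotone score: {share_ok:.1%} violations")
print(f"wiggly score:   {share_bad:.1%} violations")
```

A sweep of this kind tests the constraint across the whole range of the feature rather than at one multiplier, and the small tolerance avoids flagging floating-point noise as a violation.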
Tags: Monotonicity, XGBoost, Constraints, Compliance
