Pareto analysis: value_counts and cumulative share
Counts, percentage shares and cumulative share in a few lines, to pinpoint how many categories account for roughly 80% of the volume.
Prerequisites
Python 3.9+, pandas
Python
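import pandas as pd

# df: existing DataFrame, one row per return, with a "motif_retour" (return reason) column
counts = df["motif_retour"].value_counts()  # sorted by descending count
pareto = pd.DataFrame({
    "nb": counts,
    "part_pct": (counts / counts.sum() * 100).round(1),
})
# Cumulative share, summed from the rounded shares
pareto["cumul_pct"] = pareto["part_pct"].cumsum().round(1)
# Categories whose cumulative share stays at or below 80%
top = pareto[pareto["cumul_pct"] <= 80]
print(pareto.head(6))
# Message: "<n> reasons account for 80% of returns"
print(f"\n{len(top)} motifs expliquent 80 % des retours")
Result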
                    nb  part_pct  cumul_pct
motif_retour
taille_incorrecte  612      38.3       38.3
article_endommage  389      24.3       62.6
ne_correspond_pas  201      12.6       75.2
change_avis        118       7.4       82.6
erreur_livraison    97       6.1       88.7
autre               83       5.2       93.9

3 motifs expliquent 80 % des retours
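Variant: the filter cumul_pct <= 80 keeps the categories that stay under the threshold, so in the result above the three selected reasons cover 75.2% and a fourth is needed to pass 80%. The sketch below is one possible way to count the categories needed to reach the threshold; it also cumulates the raw counts before rounding, which avoids drift from summing rounded shares. The helper names pareto_table and categories_to_reach and the toy data are assumptions for illustration, not part of the original snippet.

```python
import pandas as pd


def pareto_table(values: pd.Series) -> pd.DataFrame:
    """Counts, share and cumulative share per category (hypothetical helper)."""
    counts = values.value_counts()  # sorted by descending count
    total = counts.sum()
    return pd.DataFrame({
        "nb": counts,
        "part_pct": (counts / total * 100).round(1),
        # Cumulate the raw counts first, then round once
        "cumul_pct": (counts.cumsum() / total * 100).round(1),
    })


def categories_to_reach(table: pd.DataFrame, threshold: float = 80.0) -> int:
    """Number of top categories needed to reach the threshold (hypothetical helper)."""
    # Categories strictly below the threshold, plus the one that crosses it
    below = int((table["cumul_pct"] < threshold).sum())
    return min(below + 1, len(table))


# Toy data, invented for the example: 45 / 30 / 15 / 10 returns per reason
toy = pd.Series(
    ["taille_incorrecte"] * 45
    + ["article_endommage"] * 30
    + ["ne_correspond_pas"] * 15
    + ["autre"] * 10
)
table = pareto_table(toy)
n = categories_to_reach(table)
print(table)
print(f"{n} categories needed to reach 80%")
```

With this toy data the cumulative shares are 45, 75, 90 and 100, so the <= 80 filter would select two categories while three are needed to actually reach 80%.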
Pandas, value_counts, Pareto, Analysis