    The Kolmogorov–Smirnov Statistic, Explained: Measuring Model Power in Credit Risk Modeling

    By ProfitlyAI · September 22, 2025


    These days, people are taking out more loans than ever. For anyone who wants to build their own house, home loans are available, and if you own a property, you can get a property loan. There are also agriculture loans, education loans, business loans, gold loans, and many more.

    In addition to these, for purchasing items like televisions, refrigerators, furniture, and mobile phones, we also have EMI options.

    But does everyone get their loan application approved?

    Banks don't give loans to everyone who applies; there is a process they follow to approve loans.

    We know that machine learning and data science are now applied across industries, and banks make use of them as well.

    When a customer applies for a loan, banks need to know the likelihood that the customer will repay on time.

    For this, banks use predictive models, mainly based on logistic regression or other machine learning techniques.

    By applying these techniques, each applicant is assigned a probability of default.

    This is a binary classification problem: we need to separate defaulters from non-defaulters.

    Defaulters: Customers who fail to repay their loan (miss payments or stop paying altogether).

    Non-defaulters: Customers who repay their loans on time.

    We have already discussed accuracy and ROC-AUC for evaluating classification models.

    In this article, we are going to discuss the Kolmogorov–Smirnov statistic (KS statistic), which is used to evaluate classification models, especially in the banking sector.

    To understand the KS statistic, we will use the German Credit dataset.

    This dataset contains information about 1,000 loan applicants, described by 20 features such as account status, loan duration, credit amount, employment, housing, and personal status.

    The target variable indicates whether the applicant is a non-defaulter (represented by 1) or a defaulter (represented by 2).

    You can find more information about the dataset here.
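
    If you don't want to rely on a local copy, the file can also be read directly from the UCI Machine Learning Repository. A minimal sketch, assuming the standard UCI path for the Statlog German Credit data file (verify the URL before use, since repository paths can change):

    import pandas as pd

    # Commonly used UCI location of the Statlog (German Credit Data) file;
    # this path is an assumption here -- check that it still resolves.
    url = (
        "https://archive.ics.uci.edu/ml/machine-learning-databases/"
        "statlog/german/german.data"
    )

    # Space-separated, no header: 20 feature columns plus the target
    # (1 = non-defaulter, 2 = defaulter).
    data = pd.read_csv(url, sep=" ", header=None)
    print(data.shape)  # expected: (1000, 21)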

    Now we need to build a classification model to classify the applicants. Since this is a binary classification problem, we will apply logistic regression to this dataset.

    Code:

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    
    # Load dataset (space-separated, no header row)
    file_path = "C:/german.data"
    data = pd.read_csv(file_path, sep=" ", header=None)
    
    # Rename columns
    columns = [f"col_{i}" for i in range(1, 21)] + ["target"]
    data.columns = columns
    
    # Features and target
    X = pd.get_dummies(data.drop(columns=["target"]), drop_first=True)
    y = data["target"]   # keep labels as 1 and 2
    
    # Train-test split
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y
    )
    
    # Train logistic regression
    model = LogisticRegression(max_iter=10000)
    model.fit(X_train, y_train)
    
    # Predicted probabilities for class 2 (defaulter)
    y_pred_proba = model.predict_proba(X_test)[:, 1]
    
    # Results DataFrame
    results = pd.DataFrame({
        "Actual": y_test.values,
        "Pred_Prob_Class2": y_pred_proba
    })
    
    print(results.head())

    We already know that when we apply logistic regression, we get predicted probabilities.

    Image by Author

    Now, to understand how the KS statistic is calculated, let's consider a sample of 10 points from this output.

    Image by Author

    Here the highest predicted probability is 0.92, which means there is a 92% chance that this applicant will default.

    Now let's proceed with the KS statistic calculation.

    First, we sort the applicants by their predicted probabilities in descending order, so that higher-risk applicants are at the top.

    Image by Author

    We already know that '1' represents non-defaulters and '2' represents defaulters.

    In the next step, we calculate the cumulative count of non-defaulters and defaulters at each row.

    Image by Author

    Next, we convert the cumulative counts of defaulters and non-defaulters into cumulative rates.

    We divide the cumulative defaulters by the total number of defaulters, and the cumulative non-defaulters by the total number of non-defaulters.

    Image by Author

    Next, we calculate the absolute difference between the cumulative defaulter rate and the cumulative non-defaulter rate.

    Image by Author

    The maximum difference between the cumulative defaulter rate and the cumulative non-defaulter rate is 0.83, which is the KS statistic for this sample.

    Here the KS statistic is 0.83, occurring at a probability of 0.29.

    This means that, at this threshold, the model's cumulative capture rate of defaulters exceeds that of non-defaulters by 83 percentage points.
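
    To make the arithmetic concrete, here is a minimal sketch that reproduces the sample calculation end to end. The ten probabilities are the ones used in this walk-through, and the labels are reconstructed from the threshold calculations shown below, so treat this as an illustrative sample rather than real model output:

    import pandas as pd

    # Reconstructed 10-point sample (2 = defaulter, 1 = non-defaulter);
    # values follow the worked example in this article.
    sample = pd.DataFrame({
        "Pred_Prob_Class2": [0.92, 0.63, 0.51, 0.39, 0.29,
                             0.20, 0.13, 0.10, 0.05, 0.01],
        "Actual":           [2, 2, 2, 1, 2, 1, 1, 1, 1, 1],
    })

    # Sort by predicted probability, highest risk first
    sample = sample.sort_values("Pred_Prob_Class2", ascending=False)

    # Cumulative capture rate for each group
    cum_def = (sample["Actual"] == 2).cumsum() / (sample["Actual"] == 2).sum()
    cum_nondef = (sample["Actual"] == 1).cumsum() / (sample["Actual"] == 1).sum()

    ks = (cum_def - cum_nondef).abs().max()
    print(f"Sample KS = {ks:.2f}")  # 0.83, at the row with probability 0.29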


    Here, we can observe that:

    Cumulative defaulter rate = True Positive Rate (how many actual defaulters we have captured so far).

    Cumulative non-defaulter rate = False Positive Rate (how many non-defaulters have been incorrectly captured as defaulters).

    But since we haven't fixed any threshold here, how do we get true positive and false positive rates?

    Let's see how the cumulative rates are equivalent to the TPR and FPR.

    First, we consider each probability as a threshold and calculate the TPR and FPR.

    \[
    \begin{aligned}
    \textbf{At threshold 0.92:} & \\[4pt]
    TP &= 1, \quad FN = 3, \quad FP = 0, \quad TN = 6 \\[6pt]
    TPR &= \tfrac{1}{4} = 0.25 \\[6pt]
    FPR &= \tfrac{0}{6} = 0 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0,\, 0.25)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.63:} & \\[4pt]
    TP &= 2, \quad FN = 2, \quad FP = 0, \quad TN = 6 \\[6pt]
    TPR &= \tfrac{2}{4} = 0.50 \\[6pt]
    FPR &= \tfrac{0}{6} = 0 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0,\, 0.50)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.51:} & \\[4pt]
    TP &= 3, \quad FN = 1, \quad FP = 0, \quad TN = 6 \\[6pt]
    TPR &= \tfrac{3}{4} = 0.75 \\[6pt]
    FPR &= \tfrac{0}{6} = 0 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0,\, 0.75)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.39:} & \\[4pt]
    TP &= 3, \quad FN = 1, \quad FP = 1, \quad TN = 5 \\[6pt]
    TPR &= \tfrac{3}{4} = 0.75 \\[6pt]
    FPR &= \tfrac{1}{6} \approx 0.17 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0.17,\, 0.75)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.29:} & \\[4pt]
    TP &= 4, \quad FN = 0, \quad FP = 1, \quad TN = 5 \\[6pt]
    TPR &= \tfrac{4}{4} = 1.00 \\[6pt]
    FPR &= \tfrac{1}{6} \approx 0.17 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0.17,\, 1.00)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.20:} & \\[4pt]
    TP &= 4, \quad FN = 0, \quad FP = 2, \quad TN = 4 \\[6pt]
    TPR &= \tfrac{4}{4} = 1.00 \\[6pt]
    FPR &= \tfrac{2}{6} \approx 0.33 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0.33,\, 1.00)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.13:} & \\[4pt]
    TP &= 4, \quad FN = 0, \quad FP = 3, \quad TN = 3 \\[6pt]
    TPR &= \tfrac{4}{4} = 1.00 \\[6pt]
    FPR &= \tfrac{3}{6} = 0.50 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0.50,\, 1.00)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.10:} & \\[4pt]
    TP &= 4, \quad FN = 0, \quad FP = 4, \quad TN = 2 \\[6pt]
    TPR &= \tfrac{4}{4} = 1.00 \\[6pt]
    FPR &= \tfrac{4}{6} \approx 0.67 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0.67,\, 1.00)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.05:} & \\[4pt]
    TP &= 4, \quad FN = 0, \quad FP = 5, \quad TN = 1 \\[6pt]
    TPR &= \tfrac{4}{4} = 1.00 \\[6pt]
    FPR &= \tfrac{5}{6} \approx 0.83 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (0.83,\, 1.00)
    \end{aligned}
    \]

    \[
    \begin{aligned}
    \textbf{At threshold 0.01:} & \\[4pt]
    TP &= 4, \quad FN = 0, \quad FP = 6, \quad TN = 0 \\[6pt]
    TPR &= \tfrac{4}{4} = 1.00 \\[6pt]
    FPR &= \tfrac{6}{6} = 1.00 \\[6pt]
    \Rightarrow (\mathrm{FPR},\, \mathrm{TPR}) &= (1.00,\, 1.00)
    \end{aligned}
    \]

    From the above calculations, we can see that the cumulative defaulter rate corresponds to the True Positive Rate (TPR), and the cumulative non-defaulter rate corresponds to the False Positive Rate (FPR).

    When calculating the cumulative defaulter rate and cumulative non-defaulter rate, each row represents a threshold, and the rate is calculated up to that row.

    Here we can observe that KS Statistic = max(|TPR − FPR|).
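
    Because of this equivalence, the KS statistic can be read directly off the ROC curve. A minimal sketch using scikit-learn's roc_curve, assuming the results DataFrame built earlier (roc_curve sweeps every predicted probability as a threshold, exactly as we did by hand):

    from sklearn.metrics import roc_curve

    # roc_curve evaluates every predicted probability as a threshold and
    # returns the FPR and TPR at each one; pos_label=2 marks defaulters.
    fpr, tpr, thresholds = roc_curve(
        results["Actual"], results["Pred_Prob_Class2"], pos_label=2
    )

    # The KS statistic is the maximum vertical gap between the two curves.
    ks = (tpr - fpr).max()
    print(f"KS Statistic = {ks:.3f}")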


    Now let's calculate the KS statistic for the full dataset.

    Code:

    import matplotlib.pyplot as plt

    # Create DataFrame with actual labels and predicted probabilities
    results = pd.DataFrame({
        "Actual": y_test.values,
        "Pred_Prob_Class2": y_pred_proba
    })
    
    # Mark defaulters (2) and non-defaulters (1)
    results["is_defaulter"] = (results["Actual"] == 2).astype(int)
    results["is_nondefaulter"] = 1 - results["is_defaulter"]
    
    # Sort by predicted probability
    results = results.sort_values("Pred_Prob_Class2", ascending=False).reset_index(drop=True)
    
    # Totals
    total_defaulters = results["is_defaulter"].sum()
    total_nondefaulters = results["is_nondefaulter"].sum()
    
    # Cumulative counts and rates
    results["cum_defaulters"] = results["is_defaulter"].cumsum()
    results["cum_nondefaulters"] = results["is_nondefaulter"].cumsum()
    results["cum_def_rate"] = results["cum_defaulters"] / total_defaulters
    results["cum_nondef_rate"] = results["cum_nondefaulters"] / total_nondefaulters
    
    # KS statistic
    results["KS"] = (results["cum_def_rate"] - results["cum_nondef_rate"]).abs()
    ks_value = results["KS"].max()
    ks_index = results["KS"].idxmax()
    
    print(f"KS Statistic = {ks_value:.3f} at probability {results.loc[ks_index, 'Pred_Prob_Class2']:.4f}")
    
    # Plot KS curve
    plt.figure(figsize=(8, 6))
    plt.plot(results.index, results["cum_def_rate"], label="Cumulative Defaulter Rate (TPR)", color="red")
    plt.plot(results.index, results["cum_nondef_rate"], label="Cumulative Non-Defaulter Rate (FPR)", color="blue")
    
    # Highlight KS point
    plt.vlines(x=ks_index,
               ymin=results.loc[ks_index, "cum_nondef_rate"],
               ymax=results.loc[ks_index, "cum_def_rate"],
               colors="green", linestyles="--", label=f"KS = {ks_value:.3f}")
    
    plt.xlabel("Applicants (sorted by predicted probability)")
    plt.ylabel("Cumulative Rate")
    plt.title("Kolmogorov–Smirnov (KS) Curve")
    plt.legend(loc="lower right")
    plt.grid(True)
    plt.show()
    

    Plot:

    Image by Author

    The maximum gap is 0.530, at a probability of 0.2928.
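
    As a cross-check, the same number can be obtained with SciPy's two-sample KS test, which measures the maximum distance between the empirical score distributions of the two groups. This is a sketch assuming the results DataFrame from the code above:

    from scipy.stats import ks_2samp

    # Predicted probabilities for each group
    scores_def = results.loc[results["Actual"] == 2, "Pred_Prob_Class2"]
    scores_nondef = results.loc[results["Actual"] == 1, "Pred_Prob_Class2"]

    # The two-sample KS statistic is the maximum distance between the two
    # empirical cumulative distribution functions of the scores.
    stat, p_value = ks_2samp(scores_def, scores_nondef)
    print(f"KS Statistic = {stat:.3f}")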


    Now that we understand how to calculate the KS statistic, let's discuss its significance.

    Here we built a classification model and evaluated it using the KS statistic, but we also have other classification metrics like accuracy, ROC-AUC, and so on.

    We already know that accuracy is specific to a single threshold, and it changes according to that threshold.

    ROC-AUC gives us a single number that reflects the overall ranking ability of the model.

    But why is the KS statistic used in banks?

    The KS statistic gives a single number that represents the maximum gap between the cumulative distributions of defaulters and non-defaulters.

    Let's go back to our sample data.

    We obtained a KS statistic of 0.83 at a probability of 0.29.

    We already mentioned that each row acts as a threshold.

    So, what happened at 0.29?

    A threshold of 0.29 means that applicants with predicted probabilities greater than or equal to 0.29 are flagged as defaulters.

    At 0.29, the top 5 rows are flagged as defaulters. Among these 5, four are actual defaulters and one is a non-defaulter incorrectly predicted as a defaulter.

    Here, true positives = 4 and false positives = 1.

    The remaining 5 rows are predicted as non-defaulters.

    At this point, the model has captured all 4 defaulters, with one non-defaulter incorrectly flagged as a defaulter.

    Here the TPR is maxed out at 1 and the FPR is 0.17.

    So, KS Statistic = 1 − 0.17 = 0.83.

    If we go further and calculate at lower probabilities as we did earlier, the TPR stays the same while the FPR increases, which means more non-defaulters are flagged as defaulters.

    This reduces the gap between the two groups.

    So we can say that at 0.29, the model denied all defaulters and 17% of non-defaulters (according to the sample data), and approved the remaining 83% of non-defaulters.


    Do banks decide the threshold based on the KS statistic?

    While the KS statistic shows the maximum gap between the two groups, banks don't decide the threshold based on this statistic.

    The KS statistic is used to validate the model's discriminatory power, while the actual threshold is determined by considering risk, profitability, and regulatory guidelines.

    In practice, the KS statistic is often quoted as a percentage, so our full-dataset value of 0.530 corresponds to a KS of 53. As rough guidelines (captured in the small helper below):

    If KS is below 20, the model is considered weak.
    If it is between 20 and 40, it is considered acceptable.
    If KS is in the range of 50-70, it is considered a good model.
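
    A tiny helper capturing these rule-of-thumb bands (the cut-offs are the ones quoted above; the 40-50 range is not covered by the guideline, so it falls through to a catch-all):

    def interpret_ks(ks_percent: float) -> str:
        """Rule-of-thumb reading of a KS statistic quoted in percent."""
        if ks_percent < 20:
            return "weak model"
        if ks_percent <= 40:
            return "acceptable model"
        if 50 <= ks_percent <= 70:
            return "good model"
        return "outside the quoted guideline bands"

    print(interpret_ks(53))  # our full-dataset KS of 0.530 -> "good model"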


    Dataset

    The dataset used in this blog is the German Credit dataset, which is publicly available on the UCI Machine Learning Repository. It is provided under the Creative Commons Attribution 4.0 International (CC BY 4.0) License, which means it can be freely used and shared with proper attribution.


    I hope this blog post has given you a basic understanding of the Kolmogorov–Smirnov statistic. If you enjoyed reading, consider sharing it with your network, and feel free to share your thoughts.

    If you haven't read my blog on ROC-AUC yet, you can check it out here.

    Thanks for reading!


