Track: Data Scientist

Model Interpretability

with SHAP

Open the black box — understand why your model predicts what it predicts


Track: Data Scientist   Duration: ~90 min

About this masterclass

What you will learn

Learning objectives
  • Explain why interpretability matters — beyond accuracy alone
  • Use classic techniques (trees, coefficients, PCA) and know their limits
  • Apply SHAP to tree models for global and local explanations
  • Read summary, dependence, and force plots with confidence
  • Extend SHAP to a multiclass text classification setting
  • Run an independent SHAP analysis on your own dataset
Model Interpretability with SHAP · Track: Data Scientist

The problem

A 98% accurate model can still be wrong

The classic story

A wolf vs. husky classifier reaches 98% accuracy. Great model? It actually looks at the snow in the background, not the animal.

The lesson

High performance is not enough. Interpretability is the bridge between performance and trust.


Why it matters

Four reasons to open the black box

1 · Trust
Stakeholders won't deploy what they can't understand.
2 · Debugging
Spot shortcuts, leakage, spurious correlations.
3 · Fairness
Audit sensitive features like gender, age, ethnicity.
4 · Regulation
GDPR, EU AI Act, credit scoring: a right to explanation.

Two questions, always: Is the model effective? And is its reasoning acceptable?


Roadmap

Our three-part journey

Part   | Question we'll answer                             | Data             | Main tool
Part 1 | How far can classic interpretability take us?     | Census (tabular) | Tree, coefficients, PCA, SHAP
Part 2 | Does the same logic work on text and multiclass?  | Emotions (text)  | Multiclass SHAP
Part 3 | Can you run the workflow alone?                   | Your dataset     | Independent SHAP analysis


Part 1

From simple models

to the black box

Tabular data · Census income prediction


The dataset

Predicting US Census income

Target
Does this person earn > $50K per year?  → binary classification
Features
age · education · marital status · occupation · gender · capital gain/loss · hours worked · native country
Why this dataset?

Features are human-readable. When the model highlights marital-status or hours-per-week, we can immediately judge whether the pattern is reasonable — or worth auditing.


Interpretable by design · 1

Reading a decision tree

                    [ROOT NODE]
                   feature ≤ value
                  /               \
              True                False
          [left child]        [right child]
            |                   |
          [LEAF]              [LEAF]
         class: A            class: B
gini — impurity · 0 = pure leaf, 0.5 = mixed
samples — how many training rows reached this node
value — class counts → predicted majority class
In the notebook → DecisionTreeClassifier(max_depth=3, random_state=42) — depth 3 to stay readable.

The feature at the root is the strongest first separator in the data.
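The depth-3 tree can be reproduced end to end; a minimal sketch on a synthetic stand-in for the Census table (the notebook uses the real features):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the Census table (the notebook uses the real features)
X, y = make_classification(n_samples=500, n_features=6, random_state=42)

# Depth 3, as in the notebook, so the printed rules stay readable
tree_clf = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)

# export_text prints the same root-to-leaf structure sketched above
print(export_text(tree_clf, feature_names=[f"feat_{i}" for i in range(6)]))
```

Reading the printout top to bottom follows one person's path from root to leaf.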


Interpretable by design · 2

What the tree tells us (and what it can't)

What it gives us
  • Global view — main decision rules
  • Local view — one person's path
  • Full transparency — read line by line
What it can't give us
  • Rich interactions between many variables
  • Strong predictive performance
  • A method that transfers to non-tree models

Readable only because it's shallow. Grow the tree, lose the interpretability.


Interpretable by design · 3

Feature importance & coefficients

Tree importance

Total impurity reduction contributed by a feature's splits, weighted by the samples they reach (normalized to sum to 1).

Limit — tied to the fitted tree, unstable.
Linear coefficients

Positive → class 1 · negative → class 0.

Limit — assumes linearity, scale-sensitive.
⚠ A sensitive feature like gender_Female near the top = signal to audit the model.
In the notebook → tree_clf.feature_importances_ · pipeline['logistic_regression'].coef_[0] (with StandardScaler).
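Both attributes can be inspected on any numeric table; a sketch with synthetic stand-in data (the pipeline step name `logistic_regression` mirrors the notebook's):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=42)

# Tree importances: impurity reduction per feature, normalized to sum to 1
tree_clf = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)
print(tree_clf.feature_importances_)

# Coefficients are only comparable across features after scaling
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("logistic_regression", LogisticRegression(max_iter=1000)),
]).fit(X, y)
print(pipeline["logistic_regression"].coef_[0])  # sign = direction, magnitude = strength
```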

Interpretable by design · 4

PCA biplot — a global map

          PC2 ▲
              │    ● ● ○ ○ ○       ● = class 0 (≤50K)
   ↑ arrow    │  ● ● ○ ○ ○         ○ = class 1 (>50K)
   for feat A │● ●● ○ ○ ○
              │  ● ○ ○ ○ ○
              └──────────────► PC1
                  → arrow for feature B
Points — one per person
Arrow direction — where the feature is high
Arrow length — influence on the 2D projection

PCA is excellent for exploration. But projection loses information — never a final explanation.
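The biplot's two ingredients, points and arrows, come straight out of a fitted PCA; a sketch on random stand-in data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))             # stand-in for the encoded Census features
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2).fit(X_scaled)
points = pca.transform(X_scaled)          # one 2-D point per person
arrows = pca.components_.T                # one (PC1, PC2) arrow per feature

# The projection keeps only part of the variance; the rest is lost
print("variance kept:", pca.explained_variance_ratio_.sum())
```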


The tradeoff

Performance vs. interpretability

Model                   | F1 score | Interpretability
Decision tree (depth 3) | ~0.75    | Readable rule set
Logistic regression     | ~0.78    | Coefficients directly inspectable
XGBoost                 | ~0.81    | Not directly inspectable

The problem

Better performance → less readable. No single tree to look at. No coefficient table. Reasoning is distributed across hundreds of trees.

In the notebook → xgb.XGBClassifier(eval_metric='logloss', random_state=42) — 100 trees by default, max_depth=6. This is the model we'll explain with SHAP.

Part 2

SHAP

to the rescue

A unified language for any model


The core idea

SHAP in one mental picture

The question SHAP answers

For this prediction, how much did each feature contribute, relative to a baseline?

baseline prediction (expected value)
      + contribution from feature 1
      + contribution from feature 2
      + contribution from feature 3
      + ...
      = model output for this person

Think of SHAP as a prediction-decomposition tool. It splits one prediction into additive pieces you can read.
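For a linear model the decomposition is exact and easy to write by hand: the contribution of feature j is coef_j × (x_j − mean(x_j)), which is what SHAP returns for linear models with independent features. A minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w, b = np.array([0.5, -1.0, 2.0]), 0.1      # toy linear model f(x) = w·x + b

baseline = X.mean(axis=0) @ w + b           # expected value: the average prediction
x = X[0]                                    # one "person"
contributions = w * (x - X.mean(axis=0))    # one additive piece per feature

# baseline + sum of pieces recovers the model output exactly
print(np.isclose(baseline + contributions.sum(), x @ w + b))  # True
```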


The foundation

Why SHAP is trustworthy

Game theory
Built on Shapley values, a Nobel Prize-winning idea (Lloyd Shapley, economics, 2012) for fairly splitting a team's reward among its players.
Fair properties
Efficiency · Symmetry · Dummy · Additivity: these four axioms uniquely determine the Shapley values SHAP computes.
Model-agnostic
Works for trees, linear models, neural nets — with an optimized TreeExplainer for XGBoost.
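The axioms can be made concrete by computing exact Shapley values for a hypothetical 3-player game by brute force over all join orders (this toy game and the `shapley` helper are illustrations, not the SHAP library's algorithm):

```python
from itertools import permutations
from math import factorial

# Value of every coalition in a toy 3-player game (players 0, 1, 2)
v = {(): 0, (0,): 10, (1,): 10, (2,): 0,
     (0, 1): 30, (0, 2): 15, (1, 2): 15, (0, 1, 2): 36}

def shapley(player, n=3):
    # Average the player's marginal contribution over every join order
    total = 0.0
    for order in permutations(range(n)):
        before = order[:order.index(player)]
        total += v[tuple(sorted(before + (player,)))] - v[tuple(sorted(before))]
    return total / factorial(n)

phi = [shapley(p) for p in range(3)]
print(phi)
# Efficiency: the three values sum to v(grand coalition) = 36
# Symmetry: players 0 and 1 are interchangeable in v, so phi[0] == phi[1]
```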

The workflow

Four steps to apply SHAP

1
Train your model
Any model — but tree boosters (XGBoost, LightGBM) pair with a fast TreeExplainer.
2
Create the explainer
explainer = shap.TreeExplainer(xgb_model, data=X_train, model_output="probability") · data=X_train is the background set — the mean prediction over it becomes the expected value.
3
Compute SHAP values
shap_values = explainer.shap_values(X_test) → a matrix (n_samples, n_features)
4
Visualize & interpret
Bar plot · beeswarm · dependence · force — we'll read all four next.

Output structure

The SHAP values matrix

                feature 1   feature 2   feature 3   ...
person 1          +0.12       -0.03       +0.01
person 2          -0.08       +0.10        0.00
person 3          +0.02       -0.01       -0.05
Rows
One per person in the test set
Columns
One per feature
Sign
+ pushes the prediction up · − pushes it down

Near zero → the feature barely influenced this particular prediction.


The baseline

Expected value = average prediction

The expected value is simply the model's average prediction over the background data — E[f(X)] = np.mean(model.predict(X_background)).

Census example · binary
model.predict_proba(X_train)[:,1].mean()
≈ 0.24  ← expected_value
(~24% earn >50K in X_train)
How to read it — before seeing one person's features, the model would start from 0.24. SHAP values measure the distance between this average and the actual prediction for that person.
expected_value + Σ shap_values[person] = model.predict_proba(person)[1] (the predicted >50K probability)

Global view · 1

The bar plot — what matters on average

capital-gain     ██████████████████ 0.18
marital_Married  ████████████ 0.12
educational-num  ████████ 0.08
hours-per-week   ██████ 0.06
age              █████ 0.05
                 0 ─────────── mean |SHAP| →
Each bar = mean |SHAP| — influence with no direction
⚠ Ranks features — does not say whether they push toward ≤50K or >50K

Global view · 2

The beeswarm — matters how and for whom

            pushes ≤50K  ←  SHAP = 0  →  pushes >50K
features (top to bottom): capital-gain · marital_Married · age · hours-per-week · gender_Female
Each dot = one person
X position = SHAP value
Color = low (cyan) / high (orange)

High capital-gain → >50K. gender_Female slightly pushes toward ≤50K — worth an audit.


Local view · 1

The dependence plot — how one feature behaves

SHAP value
for age         • •
               • •  • •      color = value of another feature
     0  -----------------
               • •   •
               • •

                low  →  high
                value of age
Upward trend — higher feature value raises the prediction
Flat shape — feature barely matters over this range
Color pattern — reveals interaction with another feature

Local view · 2

The force plot — one specific decision

base value 0.24  ────────────────────────▶  f(x) = 0.83

age=28: −0.06 · hours=35: −0.02 · educ=14: +0.09 · married=1: +0.18 · capital-gain=8500: +0.40

0.24 + (−0.06) + (−0.02) + 0.09 + 0.18 + 0.40 = 0.83
expected_value + Σ SHAP = final prediction
Red blocks → push up
Green blocks → push down
Block width = strength of contribution

The force plot tells you why this specific person got this prediction.
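The additive identity on the force plot can be checked with the slide's own numbers:

```python
import numpy as np

expected_value = 0.24                      # baseline: average >50K probability
# Per-feature SHAP values for this person: age, hours, educ, married, capital-gain
shap_row = np.array([-0.06, -0.02, 0.09, 0.18, 0.40])

prediction = expected_value + shap_row.sum()
print(round(prediction, 2))  # 0.83, the f(x) shown on the force plot
```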


Know the limits

SHAP is powerful — not magic

Caveat              | What it means
Correlated features | Credit may split messily between near-duplicate features
Compute cost        | TreeExplainer is fast; other explainers can be slow
Not causality       | A strong SHAP value is association, not cause
Local instability   | Similar people can get visibly different explanations
Explainer choice    | Different explainers behave differently across models

Use SHAP as a structured way to understand the model — not as absolute truth.


Part 3

SHAP on text

and multiclass

Emotion classification with 6 classes


What changes

From tabular to text, from 2 classes to 6

Aspect            | Part 1 (binary)    | Part 3 (multiclass text)
Features          | Tabular columns    | Words / tokens
Task              | Binary             | 6-class (sadness · joy · fear · anger · surprise · disgust)
SHAP output space | Probabilities      | Logits (raw class scores)
Explanation unit  | One per prediction | One per class, per prediction

✓ Same workflow — TreeExplainer, shap_values, force plots. We slow down on 2 new concepts.
In the notebook → xgb.XGBClassifier(objective='multi:softprob', num_class=6)

New concept · 1

Logits vs. probabilities

raw model scores (logits)   ──softmax──▶   probabilities that sum to 1

sadness    3.1                              sadness   0.75
joy        0.7                              joy       0.07
fear       1.0                              fear      0.09
anger      0.2                              anger     0.04
surprise  -0.3                              surprise  0.03
disgust   -0.6                              disgust   0.02

In multiclass, TreeExplainer explains the raw class scores — probabilities come after softmax.
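Softmax itself is three lines of numpy (the logits are the slide's; subtracting the max is a standard numerical-stability trick):

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - np.max(logits))   # subtract max for numerical stability
    return z / z.sum()

# Raw class scores from the slide: sadness, joy, fear, anger, surprise, disgust
logits = np.array([3.1, 0.7, 1.0, 0.2, -0.3, -0.6])
probs = softmax(logits)
print(probs.round(2))  # sums to 1; sadness (index 0) dominates
```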


New concept · 2

One explanation per class

shap_values
│
├── class 0: sadness   → array (n_samples, n_features)
├── class 1: joy       → array (n_samples, n_features)
├── class 2: fear      → array (n_samples, n_features)
├── class 3: anger     → array (n_samples, n_features)
├── class 4: surprise  → array (n_samples, n_features)
└── class 5: disgust   → array (n_samples, n_features)
List format → shap_values[3][5]: the SHAP row for person 5 under the anger class (index 3)
3-D array → shap_values[5, :, 3]: the same vector in (samples, features, classes) layout
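The two layouts are interchangeable; a toy sketch with random values, shapes only (which layout you actually get depends on the shap version):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 10, 8, 6

# 3-D layout: (samples, features, classes)
shap_3d = rng.normal(size=(n_samples, n_features, n_classes))

# List layout: one (samples, features) matrix per class
shap_list = [shap_3d[:, :, c] for c in range(n_classes)]

# Person 5, anger class (index 3): both layouts give the same feature vector
print(np.array_equal(shap_list[3][5], shap_3d[5, :, 3]))  # True
```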

Reading multiclass force plots

Which words fire each emotion?

Sadness · obs #4

Words like horrible, ungrateful push the sadness score up. Other words pull gently the other way.

Anger · obs #5

A single strong term (profane) dominates. Calming words like feeling appear on the opposite side.


The point isn't just which words appear — it's which class they support in this explanation.


Final Part

Now it's your turn

Apply the full SHAP workflow to your own data


Open challenge

Your 5-step SHAP workflow

1
Load & explore
Missing values, target balance, sensitive columns.
2
Prepare & train
Encode, split, fit an XGBClassifier, print classification_report.
3
Global SHAP — bar + beeswarm
Top 3 features · direction · domain check.
4
Local SHAP — force plots
One clearly positive · one negative · one uncertain.
5
Fairness / risk check
Sensitive attributes · proxies · extra validation needed?

Summary

Key takeaways

Remember this
  • High accuracy ≠ trustworthy. Always ask: why did it predict that?
  • Classic tools (tree, coefficients, PCA) are intuitive but limited
  • SHAP gives a unified language: decomposition = baseline + contributions
  • Two views: global (bar, beeswarm) and local (force)
  • Multiclass? One explanation per class — in logit space
  • SHAP ≠ causality. It reveals learned patterns, not real-world truth

Quiz · Check your understanding

What does a positive SHAP value mean?

A — The feature is globally important across the whole dataset
B — This feature pushed this prediction above the baseline ✓
C — The feature caused the outcome in the real world
D — The prediction is definitely correct

SHAP values are local and associative — one prediction, one person, one contribution.


Further reading

Go deeper

Books
  • Interpretable ML — C. Molnar
  • Hands-On ML — A. Géron (Ch. 6 + bonus)
Papers
  • Lundberg & Lee (2017) — A Unified Approach to Interpreting Model Predictions
  • Ribeiro et al. (2016) — LIME: Why Should I Trust You?
Libraries
  • shap: the reference SHAP implementation (shap.readthedocs.io)

Masterclass complete

Thank you!

Questions? Let's discuss.

