Introduction
Shapley-based methods are among the most popular tools for explaining Machine Learning (ML) and Deep Learning (DL) models. However, for time-series data, these methods often fall short because they do not account for the temporal dependencies inherent in such datasets. In a recent article, we (Ángel Luis Perales Gómez, Lorenzo Fernández Maimó, and I) introduced ShaTS, a novel Shapley-based explainability method specifically designed for time-series models. ShaTS addresses the limitations of traditional Shapley methods by incorporating grouping strategies that enhance both computational efficiency and explainability.
Shapley values: The foundation
Shapley values originate in cooperative game theory and fairly distribute the total gain among players based on their individual contributions to a collaborative effort. The Shapley value for a player is calculated by considering all possible coalitions of players and determining the marginal contribution of that player to each coalition.
Formally, the Shapley value φi for player i is:
\[ \varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\left(|N| - |S| - 1\right)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right) \]
where:
- N is the set of all players.
- S is a coalition of players not including i.
- v(S) is the value function that assigns a value to each coalition (i.e., the total gain that coalition S can achieve).
This formula averages the marginal contributions of player i across all possible coalitions, weighted by the probability of each coalition forming.
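To make the formula concrete, here is a toy brute-force implementation for a three-player game (illustrative only; exact enumeration grows exponentially with the number of players, which is precisely the problem discussed below):

from itertools import combinations
from math import factorial

def shapley(n_players, v):
    """Exact Shapley values by enumerating every coalition."""
    phi = [0.0] * n_players
    for i in range(n_players):
        others = [p for p in range(n_players) if p != i]
        for size in range(len(others) + 1):
            for S in combinations(others, size):
                S = frozenset(S)
                # weight = |S|! (|N| - |S| - 1)! / |N|!
                weight = factorial(len(S)) * factorial(n_players - len(S) - 1) / factorial(n_players)
                phi[i] += weight * (v(S | {i}) - v(S))
    return phi

# Toy value function: each player adds 1; players 0 and 1 earn a bonus of 2 together
v = lambda S: len(S) + (2 if {0, 1} <= S else 0)
print(shapley(3, v))  # [2.0, 2.0, 1.0]: the bonus is split between players 0 and 1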
From Game Theory to xAI: Shapley values in Machine Learning
In the context of explainable AI (xAI), Shapley values attribute a model's output to its input features. This is particularly useful for understanding complex models, such as deep neural networks, where the relationship between input and output is not always clear.
Shapley-based methods can be computationally expensive, especially as the number of features increases, because the number of possible coalitions grows exponentially. However, approximation methods, particularly those implemented in the popular SHAP library, have made them feasible in practice. These methods estimate the Shapley values by sampling a subset of coalitions rather than evaluating all possible combinations, significantly reducing the computational burden.
Consider an industrial scenario with three components: a water tank, a thermometer, and an engine. Suppose we have an Anomaly Detection (AD) ML/DL model that detects malicious activity based on the readings from these components. Using SHAP, we can determine how much each component contributes to the model's prediction of whether the activity is malicious or benign.
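A minimal sketch of this tabular case (ad_model, background_data, and current_readings are hypothetical placeholders, not part of any specific API):

import shap

# One column per physical component; KernelSHAP approximates Shapley values
# by sampling coalitions instead of enumerating all of them.
feature_names = ["water_tank", "thermometer", "engine"]
explainer = shap.KernelExplainer(ad_model.predict_proba, background_data)
phi = explainer.shap_values(current_readings)  # one attribution per component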
However, in more realistic scenarios the model uses not only the current reading from each sensor but also previous readings (a temporal window) to make predictions. This approach allows the model to capture temporal patterns and trends, thereby improving its performance. Applying SHAP in this scenario to assign responsibility to each physical component becomes more challenging because there is no longer a one-to-one mapping between features and sensors. Each sensor now contributes multiple features associated with different time steps. The common approach here is to calculate the Shapley value of each feature at each time step and then aggregate these values post hoc, as sketched below.
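A minimal sketch of that workaround, assuming a hypothetical windows array of shape (n_samples, window_len, n_features) and a predict_flat wrapper that feeds flattened windows to a single-output model (all names here are illustrative):

import numpy as np
import shap

# Flatten each window so the tabular explainer sees window_len * n_features columns
X_flat = windows.reshape(len(windows), -1)
explainer = shap.KernelExplainer(predict_flat, X_flat[:50])
phi = np.asarray(explainer.shap_values(X_flat[:5], nsamples=200))

# Post-hoc aggregation: sum each sensor's attributions over the time steps
phi_per_sensor = phi.reshape(5, window_len, n_features).sum(axis=1)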

This approach has two main drawbacks:
- Computational Complexity: The computational cost increases exponentially with the number of features, making it impractical for large time-series datasets.
- Ignoring Temporal Dependencies: SHAP explainers are designed for tabular data without temporal dependencies. Post-hoc aggregation can lead to inaccurate explanations because it fails to capture temporal relationships between features.
The ShaTS Approach: Grouping Before Computing Importance
In the Shapley framework, a player's value is determined solely by comparing the performance of a coalition with and without that player. Although the method is defined at the individual level, nothing prevents applying it to groups of players rather than to single individuals. Thus, if we consider a set of players N divided into p groups G = {G1, …, Gp}, we can compute the Shapley value for each group Gi by evaluating the marginal contribution of the entire group to all possible coalitions of the remaining groups. Formally, the Shapley value for group Gi can be expressed as:
\[ \varphi(G_i) = \sum_{T \subseteq G \setminus \{G_i\}} \frac{|T|!\,\left(|G| - |T| - 1\right)!}{|G|!} \left( v(T \cup \{G_i\}) - v(T) \right) \]
where:
- G is the set of all groups.
- T is a coalition of groups not including Gi.
- v(T) is the value function that assigns a value to each coalition of groups.
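The computational payoff is easy to quantify: each player's Shapley sum runs over the 2^(n-1) coalitions of the remaining players, so shrinking the player set from individual features to groups collapses the coalition space. A quick back-of-the-envelope check (the window length here is illustrative):

# Each Shapley sum enumerates 2^(n-1) coalitions of the remaining players
n_features = 10 * 51   # e.g., a 10-instant window over 51 signals -> 510 players
n_groups = 6           # the same signals grouped into 6 process stages
print(2 ** (n_features - 1))  # ~10^153 coalitions per feature: hopeless
print(2 ** (n_groups - 1))    # 32 coalitions per group: trivially enumerable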
Building on this idea, ShaTS operates on time windows and provides three distinct levels of grouping, depending on the explanatory goal:
Temporal
Each group contains all measurements recorded at a specific instant within the time window. This strategy is useful for identifying important instants that significantly influence the model's prediction.

Feature
Each group represents the measurements of an individual feature over the time window. This strategy isolates the influence of specific features on the model's decisions.

Multi-Feature
Each group comprises the combined measurements, over the time window, of features that share a logical relationship or represent a cohesive functional unit. This approach analyzes the collective influence of interdependent features, ensuring their combined impact is captured.

Once groups are defined, Shapley values are computed exactly as in the individual case, but using group-level marginal contributions instead of per-feature contributions.
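To see how the three strategies partition a window, here is a conceptual sketch (index bookkeeping only, not the library's internals) for a window of length 3 over features X, Y, and Z:

window_len, features = 3, ["X", "Y", "Z"]

# Temporal: one group per instant -> each group holds every feature at that instant
temporal = {t: [(t, f) for f in features] for t in range(window_len)}

# Feature: one group per feature -> each group holds that feature at every instant
feature = {f: [(t, f) for t in range(window_len)] for f in features}

# Multi-feature: one group per logical unit (this unit assignment is illustrative)
units = {"tank": ["X"], "sensors": ["Y", "Z"]}
multi = {u: [(t, f) for t in range(window_len) for f in fs] for u, fs in units.items()}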

ShaTS custom visualization
ShaTS includes a visualization designed specifically for sequential data and for the three grouping strategies above. The horizontal axis shows consecutive windows. The left vertical axis lists the groups, and the right vertical axis overlays the model's anomaly score for each window. Each heatmap cell at (i, Gj) represents the importance of group Gj for window i. Warmer reds indicate a stronger positive contribution to the anomaly, cooler blues indicate a stronger negative contribution, and near-white means negligible influence. A purple dashed line traces the anomaly score across windows, and a horizontal dashed line at 0.5 marks the decision threshold between anomalous and normal windows.
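For readers who want a similar layout outside the library, here is a minimal matplotlib sketch with synthetic data (this is not the library's own plotting code):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
importances = rng.uniform(-1, 1, size=(3, 20))  # rows: groups, columns: windows
scores = rng.uniform(0, 1, size=20)             # anomaly score per window

fig, ax = plt.subplots()
im = ax.imshow(importances, aspect="auto", cmap="coolwarm", vmin=-1, vmax=1)
ax.set_yticks(range(3), labels=["X", "Y", "Z"])
ax.set_xlabel("window")
fig.colorbar(im, ax=ax, label="group importance")

ax2 = ax.twinx()                                # right axis: anomaly score
ax2.plot(scores, "--", color="purple")
ax2.axhline(0.5, linestyle="--", color="gray")  # decision threshold
ax2.set_ylim(0, 1)
plt.show()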
For example, consider a model that processes windows of length 10 built from three features, X, Y, and Z. When an operator receives an alert and wants to know which signal triggered it, they inspect the feature grouping results. In the next figure, around windows 10–11 the anomaly score rises above the threshold, while the attribution for X intensifies. This pattern indicates that the decision is being driven primarily by X.

If the next question is when, within each window, the anomaly occurs, the operator switches to the temporal grouping view. The next figure shows that the final instant of each window (t9) consistently carries the strongest positive attribution, revealing that the model has learned to rely on the last time step to classify the window as anomalous.

Experimental Results: Testing ShaTS on the SWaT Dataset
In our recent publication, we validated ShaTS on the Secure Water Treatment (SWaT) testbed, an industrial water facility with 51 sensors/actuators organized into six plant stages (P1–P6). A stacked Bi-LSTM trained on windowed signals served as the detector, and we compared ShaTS with post hoc KernelSHAP using three viewpoints: Temporal (which instant in the window matters), Sensor/Actuator (which device), and Process (which of the six stages).
During attacks, ShaTS yielded tight, interpretable bands that pinpointed the true source, down to the responsible sensor/actuator or plant stage, while post hoc SHAP tended to diffuse importance across many groups, complicating root-cause analysis. ShaTS was also faster and more scalable: grouping shrinks the player set, so the coalition space drops dramatically; run time remains nearly constant as the window length grows because the number of groups does not change; and GPU execution further accelerates the method, making near-real-time use practical.
Hands-on Example: Integrating ShaTS into Your Workflow
This walkthrough shows how to plug ShaTS into a typical Python workflow: import the library, choose a grouping strategy, initialize the explainer with your trained model and background data, compute group-wise Shapley values on a test set, and visualize the results. The example assumes a PyTorch time-series model and that your data is windowed (e.g., shape [window_len, n_features] per sample).
1. Import ShaTS and configure the Explainer
In your Python script or notebook, begin by importing the necessary components from the ShaTS library. While the repository exposes the abstract ShaTS class, you will typically instantiate one of its concrete implementations (e.g., FastShaTS).
import shats
from shats.grouping import (
    TimeGroupingStrategy,           # one group per instant in the window
    FeaturesGroupingStrategy,       # one group per feature
    MultifeaturesGroupingStrategy,  # one group per logical set of features
)
2. Initialize the Model and Data
Assume you have a pre-trained PyTorch time-series model and a background dataset, which should be a list of tensors representing typical data samples that the model has seen during training. If you want to better understand the role of the background dataset, check this blog post from Christoph Molnar.
import random

model = MyTrainedModel()  # your pre-trained PyTorch time-series model

# Draw 100 random training windows to serve as the background dataset
random_samples = random.sample(range(len(trainDataset)), 100)
background = [trainDataset[idx] for idx in random_samples]

shaTS = shats.FastShaTS(
    model,
    support_dataset=background,
    grouping_strategy=FeaturesGroupingStrategy(names=variable_names),
)
3. Compute Shapley Values
Once the explainer is initialized, compute the ShaTS values for your test dataset. The test dataset should be formatted in the same way as the background dataset.
shats_values = shaTS.compute(testDataset)
4. Visualize Results
Finally, use the built-in visualization function to plot the ShaTS values. You can specify which class (e.g., anomalous or normal) you want to explain.
shaTS.plot(shats_values, test_dataset=testDataset, class_to_explain=1)
Key Takeaways
- Focused Attribution: ShaTS provides more focused attributions than post hoc SHAP, making it easier to identify the root cause in time-series models.
- Efficiency: By reducing the number of players from individual features to groups, ShaTS drastically decreases the number of coalitions to evaluate, leading to faster computation times.
- Scalability: ShaTS maintains consistent performance even as the window size increases, thanks to its fixed group structure.
- GPU Acceleration: ShaTS can leverage GPU resources, further improving its speed and efficiency.
Try it yourself
Interactive demo
Compare ShaTS with post hoc SHAP on synthetic time series here. You can also find a tutorial in the following video.
Open source
The ShaTS module is fully documented and ready to plug into your ML/DL pipeline. Find the code on GitHub.
I hope you liked it! You are welcome to contact me if you have questions, want to share feedback, or simply feel like showcasing your own projects.
