    Neuro-Symbolic Fraud Detection: Catching Concept Drift Before F1 Drops (Label-Free)

    By ProfitlyAI · March 23, 2026 · 25 Mins Read


    Every metric looked good.
    RWSS = 1.000. Output probabilities unchanged. No labels moved.
    Everything said "all clear."

    Then the alert fired anyway.

    Window 3: severity=warning RWSS=1.000 fired=True ← FIDI Z fires here

    The model's predictions didn't know anything was wrong yet.
    But the symbolic layer did.

    That is what actually happened in the experiment, and why it matters for anyone running fraud models in production.

    Full code: https://github.com/Emmimal/neuro-symbolic-drift-detection

    TL;DR: What You Will Get From This Article

    • FIDI Z-Score detects concept drift in 5 of 5 seeds, sometimes before F1 drops, with zero labels required
    • RWSS alone missed 3 of 5 seeds. A Z-score extension of FIDI is what makes it work
    • Covariate drift is a complete blind spot. It needs a separate raw-feature monitor
    • The alert system is ~50 lines of code and the difference between a scheduled retrain and an emergency one

    Not familiar with the series? Hybrid Neuro-Symbolic Fraud Detection: Guiding Neural Networks with Domain Rules covers the architecture. How a Neural Network Learned Its Own Fraud Rules: A Neuro-Symbolic AI Experiment explains how the model discovers its own rules. This is the drift detection chapter.

    The Story So Far

    This is Part 3 of a series. New here? One paragraph is all you need.

    A HybridRuleLearner trains two parallel paths: an MLP for detection and a rule path that learns symbolic IF-THEN conditions from the same data. The rule path discovered V14 on its own across two seeds, without being told to look for it. That learned rule (IF V14 < −1.5σ → Fraud) is now the thing being monitored. This article asks what happens when V14 starts behaving differently.

    Can the rules act as a canary? Can neuro-symbolic concept drift monitoring work at inference time, without labels?

    Three Ways Fraud Can Change

    Concept drift fraud detection is harder than it sounds because only one of the three common drift types actually changes what the model's learned associations mean. The experiment simulates three types of drift on the Kaggle Credit Card Fraud dataset (284,807 transactions, 0.17% fraud rate) across 8 progressive windows each [9].

    Covariate drift. The input feature distributions shift. V14, V4, and V12 move by up to +3.0σ progressively. Fraud patterns stay the same. The world just looks a little different.

    Prior drift. The fraud rate increases from 0.17% toward 2.0%. Features are unchanged. Fraud becomes more common.

    Concept drift. The sign of V14 is progressively flipped for fraud cases across 8 windows. By the end, the transactions the model learned to flag as fraud now look like legitimate ones. The rule IF V14 < −1.5σ → Fraud is now pointing in the wrong direction.

    That third one is the one that should worry you in production. With covariate and prior drift, there are external indicators. Input distributions shift, or fraud rates visibly change. You can monitor those independently. Concept drift leaves no such footprint. The only thing that changes is what the model's learned associations mean. You will not know until F1 starts falling.

    Unless something sees it first.

    The Problem With the First Three Metrics

    The model from How a Neural Network Learned Its Own Fraud Rules: A Neuro-Symbolic AI Experiment produced three label-free monitoring signals as a by-product of the symbolic layer. The idea: if the rules are learning fraud patterns, changes in how those rules fire should reveal when fraud patterns are shifting.

    I expected the first one to be the early warning. It was not.

    The problem is specific to how this model trains. All 5 seeds converged between epochs 3 and 10 (Val PR-AUC: 0.7717, 0.6915, 0.6799, 0.7899, 0.7951), when temperature τ is still between 3.5 and 4.0. At that temperature, rule activations are soft. Every input produces a near-identical activation score regardless of its actual features. In plain terms: the rules were firing almost the same way on every transaction, clean or drifted. A similarity metric on near-constant vectors returns 1.000 almost all the time. The first signal only fired in 2 of 5 seeds for concept drift, and in both cases it was the same window as F1 or later.

    Why high temperature makes monitoring harder

    The LearnableDiscretizer uses a sigmoid gated by temperature τ: σ((x − θ) / τ). At τ = 5.0 (epoch 0), that sigmoid is nearly flat: every feature value produces an activation near 0.5 regardless of where it sits relative to the learned threshold. As τ anneals toward 0.1, the sigmoid sharpens into a near-binary step. Early stopping fires at τ ≈ 3.5–4.0, before the rules have fully crystallised. The result: activation vectors are near-constant across all inputs, so any similarity metric between them stays near 1.000 even when fraud patterns are genuinely shifting.
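
    To see the flattening concretely, here is a small numeric sketch of that gate (standalone NumPy, not the repo's LearnableDiscretizer):

```python
import numpy as np

def rule_activation(x, theta=0.0, tau=1.0):
    # Soft threshold gate: sigma((x - theta) / tau)
    return 1.0 / (1.0 + np.exp(-(x - theta) / tau))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])   # feature values around the threshold

hot  = rule_activation(x, tau=4.0)   # early-stopped checkpoint: tau ~ 3.5-4.0
cold = rule_activation(x, tau=0.1)   # fully annealed: near-binary step

# At tau=4.0 every activation sits in a narrow band around 0.5;
# at tau=0.1 the same inputs snap to ~0 or ~1 and become informative
print(hot.round(2))
print(cold.round(2))
```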

    The second signal had the opposite problem. The absolute change in any feature's contribution is tiny (values in the 0.001–0.005 range) because the rule weights themselves are small at an early-stopped checkpoint. In plain terms: the signal was real but invisible at the scale we were measuring it. A fixed absolute threshold of 0.02 never fires.

    Here's what those three original signals are:

    • RWSS (Rule Weight Stability Score): cosine similarity between the baseline mean rule activation vector and the current one. In simple terms: are the rules still firing the same way they did on clean data?
    • FIDI (Feature Importance Drift Index): how much each feature's contribution to rule activations has changed from the baseline. In simple terms: has any specific feature become more or less important to the rules?
    • RFR (Rule Firing Rate): what fraction of transactions fire each rule.
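
    As a sketch of the first signal: RWSS as described reduces to a cosine similarity between mean activation vectors (illustrative code, not the repo's implementation):

```python
import numpy as np

def rwss(baseline_acts, current_acts):
    """Cosine similarity between the baseline mean rule-activation vector
    and the current window's mean vector. Inputs: (n_samples, n_rules)."""
    b = baseline_acts.mean(axis=0)
    c = current_acts.mean(axis=0)
    return float(b @ c / (np.linalg.norm(b) * np.linalg.norm(c)))

rng = np.random.default_rng(0)
baseline = rng.uniform(0.4, 0.6, size=(1000, 8))   # soft activations near 0.5
current  = rng.uniform(0.4, 0.6, size=(1000, 8))   # a later window, same regime

# Near-constant activation vectors look identical to cosine similarity
print(round(rwss(baseline, current), 3))
```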

    That analysis led to the right question. Instead of asking "has FIDI changed by more than X?", the right question is "has FIDI changed by more than X standard deviations from its own history?"

    That question has a different answer. And the answer is V14.

    The Metrics: Building a Label-Free Drift Detection System

    Three new metrics joined the original three.

    RWSS Velocity measures the per-window rate of change: RWSS[w] − RWSS[w−1]. A sudden drop of more than 0.03 per window fires an alert even before the absolute value crosses the threshold. If RWSS is falling at −0.072 in a single step, that is a signal regardless of where it started.

    FIDI Z-Score is the one that actually worked. Rather than a brand new signal, it is a simple extension of FIDI using Z-score normalisation against the feature's own window history. Instead of asking whether the absolute change crosses a fixed threshold, it asks whether the change is anomalous relative to what that feature has been doing. Unlike traditional drift detection methods that rely on input distributions or output labels, this approach operates purely on the symbolic layer, which means it works at inference time, with no ground truth required. It builds on differentiable rule-learning work including ∂ILP [3], FINRule [4], RIFF [5], and Neuro-Symbolic Rule Lists [6], extending those representations with Z-score normalisation rather than fixed thresholds. V14's contribution to rule activations across the clean baseline windows is small and flat. Near zero, stable, predictable. When concept drift starts at window 3, it shifts. Not by much in absolute terms. But by 9.53 standard deviations relative to the history it built during stable windows. That is an enormous relative anomaly, and no threshold calibration is required to catch it.
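
    A toy calculation shows why the relative view works where the absolute one fails (the numbers are made up; only the shape of the effect matters):

```python
import numpy as np

# Illustrative history: a feature's FIDI stays flat across clean windows
history = np.array([0.0010, 0.0011, 0.0009, 0.0010])
current = 0.0040   # a tiny absolute move at drift onset

z = (current - history.mean()) / history.std()
# A ~0.003 absolute change is tens of sigmas against this flat history
print(z > 9)   # True
```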

    PSI on Rule Activations was designed to catch distributional shift in the symbolic layer before the MLP's compensation masks it at the output level. It did not work here. The soft activations from early-stopped training (τ ≈ 3.5–4.0 at the saved checkpoint) cluster near 0.5, producing near-uniform distributions that PSI cannot distinguish. PSI_rules = 0.0049 throughout the entire experiment. PSI_rules never fired. It is in the codebase for when models with fully crystallised rules (τ < 0.5) are available. In this experiment it contributed nothing.

    The intended detection order, from earliest to latest:

    RWSS Velocity → FIDI Z-Score → PSI(rules) → RWSS absolute → F1 (label-based)

    Here's what actually happened.

    Results: What Each Metric Did

    Concept Drift

    Seed F1 fires RWSS fires VEL fires FIDIZ fires PSIR fires
    42 W3 W4 (1w late) W4 (1w late) W3 (simultaneous) —
    0 W3 — — W3 (simultaneous) —
    7 W4 W4 (simult.) W4 (simult.) W3 (+1w early) —
    123 W3 — — W3 (simultaneous) —
    2024 W4 — — W3 (+1w early) —

    FIDI Z-Score fires in 5 of 5 seeds, always at window 3. F1 fires at W3 in three seeds and W4 in two. The mean FIDIZ detection lag is +0.40 windows, meaning it leads F1 on average. In seeds 7 and 2024 it fires one full window before F1 drops. In the remaining three seeds it fires simultaneously. It never fires after F1 for concept drift. Not once.
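
    The +0.40-window figure is just the mean of the per-seed leads from the table above:

```python
# FIDI Z lead over F1, per seed: +1 = fired one window before F1, 0 = same window
leads = {42: 0, 0: 0, 7: 1, 123: 0, 2024: 1}
mean_lead = sum(leads.values()) / len(leads)
print(mean_lead)   # 0.4
```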

    Across all drift types, FIDI Z-Score is the only metric that detected concept drift in every seed and never lagged behind F1. For label-free drift detection in fraud systems, that is the headline result.

    RWSS stays flat while F1 is already falling. The symbolic layer's activation pattern holds steady through W0–W3: it registers no change while F1 is already on its way down. The drop to 0.923 at W4 confirms the drift one window after F1 crosses its alert threshold. This is why RWSS alone is insufficient for early concept drift detection. Image by Author.

    RWSS fires in 2 of 5 seeds, and in both cases simultaneously with or after F1. Velocity matches RWSS exactly, same window, every time. PSI on rule activations never fires at all.

    Concept Drift vs Covariate Drift: Why Symbolic Monitoring Has Blind Spots

    Covariate drift is where the symbolic layer goes completely silent.

    Every symbolic metric: 0 of 5 seeds. Not one signal. Not one window. F1 eventually fires in 4 of 5 seeds at W6 or W7, slowly and late, and the symbolic layer had nothing to do with it. This is not a gap that better tuning will close. It is a fundamental property of what the symbolic layer measures.

    The reason is mechanical. When V14, V4, and V12 shift by +3.0σ, the shift is uniform across all samples. The learnable discretizer computes thresholds relative to the data. Each sample still lands in roughly the same threshold bin relative to its neighbours. Rules fire on roughly the same proportion of transactions. Nothing in the activation pattern changes. Cosine similarity of mean activations stays at 1.0.

    In simple terms: if every transaction shifts by the same amount, the rules still see the same relative picture. Transaction A was above the threshold before. It is still above the threshold after. The fraud-vs-legitimate ordering is preserved. RWSS measures that ordering, not the absolute values. Think of it as a tide that lifts all boats equally. The boats stay in the same order. RWSS only measures the order.
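
    The tide analogy can be checked directly. In this sketch of the mechanism, a data-relative threshold moves with the uniform shift, so the firing pattern is bit-for-bit identical:

```python
import numpy as np

rng = np.random.default_rng(0)
v14 = rng.normal(0.0, 1.0, 10_000)
shifted = v14 + 3.0                  # covariate drift: every sample moves equally

def rule_fires(x):
    # Data-relative threshold, like the discretizer's bins: theta moves with
    # the batch, so only each sample's relative position matters
    theta = x.mean() - 1.5 * x.std()
    return x < theta

same = bool((rule_fires(v14) == rule_fires(shifted)).all())
print(same)   # True: identical firing pattern, so RWSS stays at 1.0
```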

    If covariate drift is a concern in your deployment, you need a separate input-space monitor: PSI on raw features, a KS test on V14, or a data quality check. The symbolic layer cannot help you there. Symbolic layer drift monitoring has one blind spot, and covariate shift is it.
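
    A minimal input-space monitor along those lines: PSI on a single raw feature, with bin edges frozen from the baseline (a sketch, not the repo's code):

```python
import numpy as np

def psi_raw(baseline, current, n_bins=10):
    # Quantile bin edges frozen from the baseline, open-ended at both tails
    edges = np.quantile(baseline, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    b = np.histogram(baseline, bins=edges)[0] / len(baseline) + 1e-6
    c = np.histogram(current,  bins=edges)[0] / len(current)  + 1e-6
    return float(np.sum((c - b) * np.log(c / b)))

rng = np.random.default_rng(1)
v14_base = rng.normal(0.0, 1.0, 5_000)
v14_now  = v14_base + 3.0            # the uniform shift RWSS cannot see

print(psi_raw(v14_base, v14_now) > 0.25)   # True: input-space monitor fires
```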

    RWSS and F1 scores across 8 windows under covariate drift. RWSS stays at exactly 1.000 throughout all windows. F1 mean gradually declines from W2 onward, crossing the alert threshold near W6–W7.
    RWSS sees nothing. Shifting V14, V4, and V12 by up to 3σ has no effect on the symbolic layer: the green line never moves from 1.000. The rules still fire on the same relative proportion of transactions because the shift is uniform across all samples. F1 eventually drifts below threshold late in the sequence, but no symbolic metric was there to warn you. Image by Author.

    Prior Drift

    FIDIZ fires in 5 of 5 seeds, always at W3. But prior drift causes F1 to drop at W0 (seed 123) or W2 (seed 2024) in the two seeds where F1 fires at all. FIDIZ detection lag for prior drift: −2.00 windows. It fires two windows after F1.

    This is not a calibration problem. FIDIZ needs a minimum of 3 clean windows to build a history before its Z-score is meaningful. Prior drift that causes an immediate fraud rate jump is already visible in F1 before FIDIZ can even start computing. A rolling fraud rate counter will always be faster here.
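
    Such a counter is a few lines. This hypothetical version (not part of the repo) compares the recent labelled fraud rate to the training baseline and needs no warm-up windows:

```python
from collections import deque

class RollingFraudRate:
    """Label-based monitor: fires when the recent fraud rate exceeds a
    multiple of the training baseline. Responds immediately, no history."""
    def __init__(self, baseline_rate=0.0017, window=5_000, ratio=3.0):
        self.baseline_rate = baseline_rate
        self.ratio = ratio
        self.labels = deque(maxlen=window)

    def update(self, label):
        self.labels.append(label)
        rate = sum(self.labels) / len(self.labels)
        return rate > self.ratio * self.baseline_rate

monitor = RollingFraudRate()
# Simulate prior drift: fraud rate jumps from 0.17% to 2% (1 in 50)
fired = [monitor.update(1 if i % 50 == 0 else 0) for i in range(5_000)]
print(fired[-1])   # True: fires on the current window alone
```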

    RWSS and F1 scores across 8 windows under prior drift. RWSS stays at 1.000 throughout. F1 mean sits around 0.63–0.65, staying above the alert threshold for all 8 windows in the mean.
    Prior drift, raising the fraud rate from 0.17% to 2%, produces no symbolic signal at all. RWSS stays at 1.000 because the rule activation patterns do not change when only the class balance shifts. The mean F1 actually stays above the alert threshold here, though two individual seeds (visible in the faint lines reaching into the 0.70–0.85 range) do show performance degradation. This is the case where you need a label distribution monitor, not a rule activation monitor. Image by Author.

    The Alert Demo: Window 3

    Here is the moment the whole system was built for.

    DriftAlertSystem is built once from the validation set immediately after training. It stores the baseline. Then .test() is called on each new window. No labels. No retraining. This is inference-only drift detection: the system reads the symbolic layer and nothing else.

    Seed 42, concept drift, 8 windows:

    Window 0: severity=none      RWSS=0.999  fired=False
    Window 1: severity=none      RWSS=0.999  fired=False
    Window 2: severity=none      RWSS=0.999  fired=False
    Window 3: severity=warning   RWSS=1.000  fired=True   ← FIDI Z fires here
    Window 4: severity=critical  RWSS=0.928  fired=True   ← RWSS absolute confirms
    Window 5: severity=warning   RWSS=0.928  fired=True
    Window 6: severity=warning   RWSS=0.928  fired=True
    Window 7: severity=warning   RWSS=0.928  fired=True

    At window 3, RWSS is exactly 1.000. The activation pattern is perfectly identical to baseline. Output probabilities have not changed. Nothing in the standard monitoring stack has moved.

    And the alert fires at WARNING severity.

    The reason is V14. Its Z-score is −9.53. That means V14's contribution to rule activations has shifted to nearly 10 standard deviations below the baseline it established during clean windows. The model's output does not know yet. The MLP is compensating. But the rule path cannot compensate. It was trained to express a fixed symbolic relationship. It is screaming.

    One window later, the MLP stops holding. RWSS drops to 0.928. Velocity falls 0.072 in a single step. Severity escalates to CRITICAL.

    ═══════════════════════════════════════════════════════
      DRIFT ALERT  |  severity: CRITICAL
      Earliest signal: VELOCITY
    ═══════════════════════════════════════════════════════
      ── Early-Warning Layer ─────────────────────────────
      RWSS Velocity : -0.0720  [threshold -0.03]  ⚠ FIRED
      FIDI Z-Score  : ⚠ FIRED
           V14  Z = -9.53
      PSI (rules)   : 0.0049  [moderate≥0.10]  stable
      ── Confirmed Layer ─────────────────────────────────
      RWSS absolute : 0.9276  [threshold 0.97]  ⚠ FIRED
      Rules gone silent: 0  OK
      Mean RFR change  : -0.001
      Recommended action:
        → Retrain immediately. Do not deploy.
    ═══════════════════════════════════════════════════════

    The report names VELOCITY as the earliest layer. That is a priority order in the internal logic. In actual window timing, FIDI Z-Score fired one window earlier, at W3. The W3 WARNING is the earlier human-facing alert, the one that gives you time to act before the CRITICAL fires.

    5×3 grid showing alert timelines for 5 seeds across covariate, prior, and concept drift. Green circles mark RWSS alerts; orange squares mark F1 alerts. Green circles appear only in the Concept column for seeds 7 and 42, at W4, one step to the right of the F1 alert.
    Each cell in this grid is one seed-drift combination. Grey dots are silent windows. An orange square is an F1 alert. A green circle is an RWSS alert. Look at the Concept column: seeds 7 and 42 each show a green RWSS circle at W4, with the F1 square one position to the left at W3; that is the 1-window lag. Seeds 0, 123, and 2024 show no green circle at all, only an orange F1 square. Covariate and prior columns show no RWSS circles anywhere. FIDI Z-Score alerts (not shown in this figure) fired at W3 across all 5 concept-drift seeds. Image by Author.

    Why FIDI Z-Score Sees It Before F1 Does

    The model has two paths running in parallel from the same input.

    The MLP path carries 88.6% of the final output (mean α = 0.886 across seeds; α is the learned blend weight; 0.886 means the neural network does 88.6% of the prediction work and the symbolic rules do the remaining 11.4%). When concept drift progressively reverses V14's relationship to fraud labels, the MLP, trained on 284,000 transactions, partially absorbs that change. Its internal representations shift. Output probabilities stay roughly stable for at least one window. That is the MLP compensating.
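
    Assuming the blend is the usual convex combination in probability space (the article states only the weight, not the exact form), the split looks like this with made-up probabilities:

```python
import numpy as np

alpha = 0.886                            # learned blend weight quoted in the text
p_mlp   = np.array([0.91, 0.12, 0.77])   # MLP-path fraud probabilities (illustrative)
p_rules = np.array([0.95, 0.08, 0.20])   # rule-path probabilities (illustrative)

# Final score: the MLP dominates, so rule-path disagreement barely moves it
p_final = alpha * p_mlp + (1 - alpha) * p_rules
print(p_final.round(3))
```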

    The rule path carries 11.4%. It was trained to express the MLP's knowledge in symbolic form: V14 below a threshold means fraud [2]. That relationship is fixed and explicit. When V14 flips sign for fraud cases, the rule's V14 contribution does not adjust. It simply stops working. The bit activations for V14 change direction. The rule starts firing on the wrong transactions.

    The neural network adapts. The symbolic layer does not. And that is exactly why the symbolic layer detects the drift first.

    That asymmetry is what FIDI Z-Score exploits.

    The absolute change in V14's contribution is tiny (values in the 0.001 to 0.005 range) because rule weights are small at an early-stopped checkpoint. A fixed absolute threshold never catches it.

    FIDI heatmap for concept drift across 7 features (V14, V12, V4, V11, V10, V17, V3) and 8 time windows. All cells are uniform grey throughout, indicating near-zero absolute FIDI values across all features and all windows.
    This blank heatmap is the whole point. Every cell is the same neutral grey, meaning the absolute FIDI values for all seven features across all eight windows are effectively zero. The colour scale runs from −0.30 to +0.30, but nothing registers. This is why a fixed absolute threshold (0.02) never fires, and why Z-score normalisation against each feature's own history is necessary. V14's contribution is collapsing, but only relative to itself, not in absolute terms the scale can show. Image by Author.

    But V14's history through the clean windows is just as flat. When concept drift moves it at window 3, the Z-score is −9.53. Same pattern: near-zero absolute change, extreme relative shift.

    The symbolic layer compensates less than the MLP, so it shows the drift first. FIDI Z-Score makes the signal visible by comparing each feature not to a fixed threshold but to its own history.

    But this only holds for one of the three drift types. The other two are a different story entirely.

    What This System Cannot Do

    A system that claims early warning invites overstatement. Here is what the data actually says. This is label-free anomaly detection for fraud monitoring, which means the limitations are structural, not tunable.

    Covariate drift is a complete blind spot. 0 of 5 seeds. The mechanism is explained in the Results section above. Use PSI on raw features or a KS test on V14 instead.

    FIDIZ fires late on prior drift by design. When the fraud rate jumps, F1 reacts at W0 or W2. FIDIZ structurally cannot fire before W3. It needs history that does not yet exist. A rolling fraud rate monitor responds faster.

    PSI on rule activations produced nothing. PSI_rules = 0.0049 throughout every window of every seed. Soft activations from early-stopped training cluster near 0.5, and PSI on near-uniform distributions is insensitive regardless of what is actually happening. The metric is in the codebase and may work with fully annealed models (τ < 0.5). In this experiment it was silent.

    5 seeds is evidence, not proof. FIDIZ fires at W3 for concept drift across all 5 seeds. That is consistent and encouraging. It is not the same as reliable in production across datasets, fraud types, and drift severities you have not tested. 5 seeds is a starting point, not a conclusion. More seeds, more drift configurations, and real-world validation are needed before strong deployment claims.

    Results Summary

    The pattern is clearest when stated plainly first. Think of this as an early warning concept drift system with three distinct modes depending on what is changing. Covariate drift: the symbolic layer saw nothing, F1 caught it slowly. Prior drift: the symbolic layer fired after F1, not before. Concept drift: FIDI Z-Score fired in every single seed, always at or before F1, averaging +0.40 windows of lead time.

    Drift type   F1 fired   RWSS fired   FIDIZ fired   FIDIZ mean lag
    Covariate    4/5        0/5          0/5           —
    Prior        2/5        0/5          5/5           −2.00w (late)
    Concept      5/5        2/5          5/5           +0.40w (early)

    Lag = windows before F1 alert. Positive = FIDIZ fires first. Negative = F1 fires first.

    Three-panel chart comparing RWSS versus F1 across covariate, prior, and concept drift. Left panel: RWSS flat blue line, F1 slowly declining red line. Centre panel: RWSS flat amber line, F1 stable. Right panel: RWSS drops sharply at W4, F1 collapses at W3.
    Three drift types, three completely different stories. Covariate drift (left): RWSS never moves, F1 slowly degrades. Prior drift (centre): RWSS never moves, F1 stays stable on average despite high seed variance. Concept drift (right): RWSS holds until W4 then drops to ~0.91, but F1 has already fallen sharply at W3. This panel is why RWSS alone is not enough, and why FIDI Z-Score, which fires at W3 in all 5 seeds, is needed as the leading signal. Image by Author.

    Building It

    The system is designed to be used in production, not just in a notebook.

    # Once, immediately after training
    X_val_t      = torch.FloatTensor(X_val)
    alert_system = DriftAlertSystem.from_trained_model(model, X_val_t, feature_names)
    alert_system.save("results/drift_alert_baseline_seed42.pkl")
    
    # Every scoring run: weekly, daily, per-batch
    alert_system = DriftAlertSystem.load("results/drift_alert_baseline_seed42.pkl")
    alert = alert_system.test(model, X_this_week)
    
    if alert.fired:
        print(alert.report())

    No labels. No retraining. No infrastructure beyond saving a pickle file next to the model checkpoint. The .test() call computes RWSS velocity, FIDI Z-Score, PSI on activations, and RWSS absolute, in that order, using PyTorch [7] and scikit-learn [8]. Severity escalates from none to warning to critical based on how many signals fire and how far RWSS has dropped.

    The three early-warning computations are each just a few lines.

    RWSS Velocity: rate of change per window.

    from typing import List
    
    def compute_rwss_velocity(rwss_history: List[float]) -> float:
        if len(rwss_history) < 2:
            return 0.0
        return float(rwss_history[-1] - rwss_history[-2])
    
    # Alert fires when the drop exceeds 0.03 per window
    vel_fired = rwss_velocity < -0.03

    FIDI Z-Score: normalise each feature's contribution anomaly against its own history.

    import numpy as np
    
    def compute_fidi_zscore(fidi_history, current_fidi, min_history=3):
        if len(fidi_history) < min_history:
            return {k: 0.0 for k in current_fidi}
        z_scores = {}
        for feat_idx, current_val in current_fidi.items():
            history_vals = [h.get(feat_idx, 0.0) for h in fidi_history]
            mean_h = np.mean(history_vals)
            std_h  = np.std(history_vals)
            z_scores[feat_idx] = (current_val - mean_h) / std_h if std_h > 1e-8 else 0.0
        return z_scores
    
    # Alert fires when any feature's |Z| > 2.5
    fidi_z_fired = any(abs(z) > 2.5 for z in z_scores.values())

    PSI on Rule Activations: distributional shift in the symbolic layer (included for completeness).

    def compute_psi_rules(baseline_acts, current_acts, n_bins=10):
        bins = np.linspace(0, 1, n_bins + 1)
        psi_per_rule = []
        for r in range(baseline_acts.shape[1]):
            b = np.histogram(baseline_acts[:, r], bins=bins)[0] + 1e-6
            c = np.histogram(current_acts[:,  r], bins=bins)[0] + 1e-6
            b /= b.sum(); c /= c.sum()
            psi_per_rule.append(float(np.sum((c - b) * np.log(c / b))))
        return float(np.mean(psi_per_rule))
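
    A synthetic check makes the failure mode visible: soft activations clustered near 0.5 are indistinguishable to PSI, while near-binary activations with a changed firing rate would register immediately (self-contained illustration, single rule):

```python
import numpy as np

def psi(b_acts, c_acts, n_bins=10):
    bins = np.linspace(0, 1, n_bins + 1)
    b = np.histogram(b_acts, bins=bins)[0] + 1e-6
    c = np.histogram(c_acts, bins=bins)[0] + 1e-6
    b, c = b / b.sum(), c / c.sum()
    return float(np.sum((c - b) * np.log(c / b)))

rng = np.random.default_rng(42)
soft_a = rng.normal(0.5, 0.02, 5_000).clip(0, 1)    # tau ~ 4: everything near 0.5
soft_b = rng.normal(0.5, 0.02, 5_000).clip(0, 1)    # "drifted" window looks identical

sharp_a = (rng.random(5_000) > 0.3).astype(float)   # tau < 0.5: near-binary, fires ~70%
sharp_b = (rng.random(5_000) > 0.6).astype(float)   # firing rate changed to ~40%

print(psi(soft_a, soft_b) < 0.10)     # True: soft activations hide the shift
print(psi(sharp_a, sharp_b) > 0.10)   # True: crystallised rules would expose it
```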

    V14: Three Articles, One Feature

    This is the part I did not plan. But V14's concept drift behaviour turns out to be the thread that ties all three articles together.

    Guiding Neural Networks with Domain Rules: I wrote rules about large transaction amounts and anomalous PCA norms. Reasonable intuitions. Nothing to do with V14.

    How a Neural Network Learned Its Own Fraud Rules: The model found V14 anyway. Given 30 anonymised features and no guidance, the gradient landed on the single feature with the highest absolute correlation to fraud. Twice, across two independent seeds.

    This article: I deliberately made V14 break. I flipped its sign for fraud cases, progressively, across 8 windows. And FIDI Z-Score registered the collapse at −9.53 standard deviations while RWSS was still 1.000 and F1 had not moved.

    V14 FIDI absolute values across 8 time windows under concept drift. Mean line (dark green) runs flat near zero throughout. FIDI alert threshold at 0.20 is shown as a dotted line far above the data.
    V14's FIDI absolute value stays within 0.002 of zero across all 8 windows and all 5 seeds. The alert threshold of 0.20 sits completely out of reach at the top of the chart; the gap between the data and the threshold is the reason the original FIDI monitor never fires. What the Z-score catches instead is that even this near-zero line has a stable history, and when V14's contribution shifts by a tiny absolute amount at W3, that shift is −9.53 standard deviations from the baseline. The anomaly is in the relative change, not the absolute value. Image by Author.

    The same feature, three different roles: ignored, discovered, then monitored as the first thing to fail. That coherence was not engineered. It is what reproducible multi-seed research on a consistent dataset keeps producing.

    What to Do With This

    Use FIDI Z-Score for concept drift detection without labels. It fires in 5 of 5 seeds, requires only 3 windows of history, never fires after F1, and needs no labels. Keep the Z-score threshold at 2.5 and the minimum history at 3 windows.

    Add a separate input-space monitor for covariate drift. PSI on raw features or a KS test on important features like V14. The symbolic layer is blind to distributional shifts that preserve relative activation order.

    Use a rolling fraud rate counter for prior drift. FIDIZ structurally cannot fire before W3. A label-based rate counter fires at W0.

    Build the alert baseline immediately after training. Not after drift is suspected. Do it after training. If you wait, you have already lost your clean reference point. Save it alongside the checkpoint file.

    One window of early warning is real. Whether that is one week or one day depends on your scoring cadence. For most production fraud teams, the difference between a scheduled retrain and an emergency one is measured in exactly those units.

    Three Things That Will Catch You Using This Concept Drift Early Warning System

    The three-window blind period. FIDIZ has no history to work with for the first 3 windows after deployment. You are monitoring with RWSS and RFR only during that time. Plan for it explicitly.

    Soft activations will silence PSI_rules. If your best checkpoint arrives when τ ≥ 1.0 (which happens whenever early stopping fires before training is complete), rule activations cluster near 0.5 and PSI_rules returns noise. Check τ at your saved checkpoint. In this experiment τ was still 3.5–4.0 at convergence. That is why PSI_rules was silent throughout.

    Retrain means re-audit. This system is a fraud model retraining trigger, not a retrain substitute. After retraining, the rules change. V14 may no longer dominate, or new features may have entered. The compliance sign-off from the previous model does not carry forward. Build the audit into the retrain process, not as a step after, but as the step that closes the loop.

    Closing

    Three articles. One characteristic stored showing.

    Guiding Neural Networks with Domain Rules: I ignored it. How a Neural Network Learned Its Own Fraud Rules: The gradient discovered it. Article 3: When it broke, the symbolic layer observed earlier than the output layer did.

The experiment has a specific, honest scope: FIDI Z-Score detects concept drift in 5 of 5 seeds, often one window before F1, never after it, entirely without labels. For covariate drift it is blind. For prior drift it is late. These are not caveats added at the end to soften the claim. They are findings that tell you exactly where to use this and where not to.

A neuro-symbolic model gives you two channels. The MLP is better at prediction. The symbolic layer is better at knowing when prediction is about to go wrong. They are not redundant. They are watching different parts of the same problem.

The MLP compensates. The symbolic layer cannot. That is its weakness. In this experiment, it turned out to also be its earliest warning.


    Disclosure

This article is based on independent experiments using publicly available data (Kaggle Credit Card Fraud dataset, CC-0 Public Domain) and open-source tools (PyTorch, scikit-learn). No proprietary datasets, company resources, or confidential information were used. The results and code are fully reproducible as described. The views and conclusions expressed here are my own and do not represent any employer or organisation.

    References

[1] Dal Pozzolo, A. et al. (2015). Calibrating Probability with Undersampling for Unbalanced Classification. IEEE SSCI. Dataset: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud (CC-0)

[2] Alexander, E. P. (2026). Hybrid Neuro-Symbolic Fraud Detection: Guiding Neural Networks with Domain Rules. Towards Data Science. https://towardsdatascience.com/hybrid-neuro-symbolic-fraud-detection-guiding-neural-networks-with-domain-rules/

[3] Evans, R., & Grefenstette, E. (2018). Learning Explanatory Rules from Noisy Data. JAIR, 61, 1–64. https://arxiv.org/abs/1711.04574

[4] Wolfson, B., & Acar, E. (2024). Differentiable Inductive Logic Programming for Fraud Detection. arXiv:2410.21928. https://arxiv.org/abs/2410.21928

[5] Martins, J. L., Bravo, J., Gomes, A. S., Soares, C., & Bizarro, P. (2024). RIFF: Inducing Rules for Fraud Detection from Decision Trees. In RuleML+RR 2024. arXiv:2408.12989. https://arxiv.org/abs/2408.12989

[6] Xu, S., Walter, N. P., & Vreeken, J. (2024). Neuro-Symbolic Rule Lists. arXiv:2411.06428. https://arxiv.org/abs/2411.06428

[7] Paszke, A. et al. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. NeurIPS 32. https://pytorch.org

[8] Pedregosa, F. et al. (2011). Scikit-learn: Machine Learning in Python. JMLR, 12, 2825–2830. https://scikit-learn.org

[9] Gama, J. et al. (2014). A Survey on Concept Drift Adaptation. ACM Computing Surveys, 46(4). https://dl.acm.org/doi/10.1145/2523813

    Code: https://github.com/Emmimal/neuro-symbolic-drift-detection


If you work with production models: what drift type worries you most? Concept drift where the patterns quietly change, covariate shift in your input features, or something else? I'm curious what monitoring gaps people are actually running into in real deployments.


