Interaction Analysis & ANOVA

Overview

MARS naturally identifies interaction terms (products of univariate hinges) and provides ANOVA decomposition to understand which variables interact and their magnitudes.

Interaction Concepts

What is an Interaction?

An interaction occurs when the effect of one variable depends on the value of another.

Example:

The true model: .. math:

y = x_0 + x_1 + x_0 \cdot x_1

has a 2-way interaction between \(x_0\) and \(x_1\):

Main effect of \(x_0\): \(\frac{\partial y}{\partial x_0} = 1 + x_1\)
Effect depends on \(x_1\): interaction!

Visually:

No interaction:            With interaction:

y │                        y │      ╱╱
  │    ╱      ╱               │    ╱  ╱
  │   ╱      ╱                │   ╱   ╱
  │  ╱      ╱                 │  ╱   ╱
  └─────────────► x0          └──────────► x0
       x1=0                    (lines converge!)
        (parallel)

MARS Basis Functions as Interactions

MARS represents interactions as products of univariate hinges:

\[B_m(x) = h(x_i, t_1, d_1) \times h(x_j, t_2, d_2)\]

Example MARS basis:

B_5 = h(x0, +, 0.5) * h(x1, +, 1.0)

Meaning: This basis "turns on" when:
- x0 > 0.5  AND
- x1 > 1.0

The product of hinges creates a localized interaction in that region.

Interaction Degree

Degree	Form	Interpretation
0	Constant	Baseline
1	\(h(x_i, t, d)\)	Main effect of \(x_i\)
2	\(h(x_i, t_1, d_1) \times h(x_j, t_2, d_2)\)	2-way interaction
3	Product of 3 hinges	3-way interaction
k	Product of k hinges	k-way interaction

Parameter: max_degree (default=1)

max_degree=1: No interactions (additive model)
max_degree=2: Up to 2-way interactions
max_degree=3: Up to 3-way interactions

Fitting Models with Interactions

Allowing Interactions

from pymars import MARS

# Additive (no interactions)
model_additive = MARS(max_degree=1)

# With 2-way interactions
model_2way = MARS(max_degree=2)

# With 3-way interactions
model_3way = MARS(max_degree=3)

Typically: - Start with max_degree=1 (simpler) - If R² is low, try max_degree=2 - Rarely need max_degree > 2

Example: Detecting Interactions

import numpy as np
from pymars import MARS

# Generate data with TRUE interaction
np.random.seed(42)
X = np.random.uniform(-1, 1, (200, 3))

# y = x0 + x1 + x0*x1 (interaction between x0, x1)
y = X[:, 0] + X[:, 1] + X[:, 0] * X[:, 1] + np.random.randn(200)*0.1

# Fit WITHOUT interactions
model_no_int = MARS(max_degree=1)
model_no_int.fit(X, y)
r2_no_int = model_no_int.score(X, y)

# Fit WITH interactions
model_with_int = MARS(max_degree=2)
model_with_int.fit(X, y)
r2_with_int = model_with_int.score(X, y)

print(f"Without interactions: R² = {r2_no_int:.4f}")
print(f"With interactions:    R² = {r2_with_int:.4f}")
print(f"Improvement: {(r2_with_int - r2_no_int):.4f}")

Expected Output:

Without interactions: R² = 0.7234
With interactions:    R² = 0.9876
Improvement: 0.2642

The significant improvement indicates the interaction was detected!

ANOVA Decomposition

ANOVA = Analysis of Variance, extended here to MARS basis functions.

Goal: Decompose the fitted function into:

\[\hat{f}(x) = c_0 + \sum_i f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \ldots\]

where: - \(c_0\) = constant term - \(f_i(x_i)\) = main effect of variable \(i\) - \(f_{ij}(x_i, x_j)\) = 2-way interaction between \(i\) and \(j\)

Extracting ANOVA Decomposition

# Fit model with interactions
model = MARS(max_degree=2)
model.fit(X, y)

# Get ANOVA decomposition
anova = model.get_anova_decomposition()

# anova is a dict: {order: [basis_functions]}
# order=0: constant
# order=1: main effects
# order=2: 2-way interactions
# order=3: 3-way interactions

Inspecting Components

anova = model.get_anova_decomposition()

print("ANOVA Decomposition:\n")

# Constant term
if 0 in anova:
    c0 = model.coefficients_[0]
    print(f"Constant: {c0:.6f}\n")

# Main effects
if 1 in anova:
    print("Main Effects (Degree 1):")
    for bf in anova[1]:
        idx = model.basis_functions_.index(bf)
        coef = model.coefficients_[idx]
        print(f"  {bf} : coef = {coef:.6f}")
    print()

# 2-way interactions
if 2 in anova:
    print("2-way Interactions (Degree 2):")
    for bf in anova[2]:
        idx = model.basis_functions_.index(bf)
        coef = model.coefficients_[idx]
        print(f"  {bf} : coef = {coef:.6f}")

Contribution to Prediction

Compute each term’s contribution to predictions:

def anova_contributions(model, X):
    """
    Compute contribution of each ANOVA term to predictions.

    Returns:
        dict: {order: contributions}
    """
    anova = model.get_anova_decomposition()
    contributions = {0: np.full(len(X), model.coefficients_[0])}

    for order, basis_list in anova.items():
        if order == 0:
            continue

        contrib = np.zeros(len(X))

        for bf in basis_list:
            idx = model.basis_functions_.index(bf)
            coef = model.coefficients_[idx]
            bf_eval = bf.evaluate(X)  # Evaluate basis
            contrib += coef * bf_eval

        contributions[order] = contrib

    return contributions

Visualizing Interactions

Partial Effects Plot

For a 2-way interaction, plot the effect of varying both variables:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Grid of x0, x1 values
x0_range = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
x1_range = np.linspace(X[:, 1].min(), X[:, 1].max(), 50)
X0_grid, X1_grid = np.meshgrid(x0_range, x1_range)

# Predictions on grid (fix x2 at median)
x2_median = np.median(X[:, 2])
X_grid = np.column_stack([
    X0_grid.ravel(),
    X1_grid.ravel(),
    np.full(X0_grid.size, x2_median)
])

y_grid = model.predict(X_grid).reshape(X0_grid.shape)

# 3D surface plot
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')

surf = ax.plot_surface(X0_grid, X1_grid, y_grid, cmap='viridis', alpha=0.8)
ax.scatter(X[:, 0], X[:, 1], y, alpha=0.3, s=10)

ax.set_xlabel('x0'), ax.set_ylabel('x1'), ax.set_zlabel('y')
ax.set_title('Interaction Effect: y vs (x0, x1)')

fig.colorbar(surf, ax=ax, label='Predicted y')
plt.show()

Interaction Heatmap

Visualize which pairs of variables interact:

import matplotlib.pyplot as plt

# Count interactions per variable pair
d = X.shape[1]
interaction_matrix = np.zeros((d, d))

anova = model.get_anova_decomposition()
for order, basis_list in anova.items():
    if order == 2:
        for bf in basis_list:
            # Extract variables from basis function
            vars_in_bf = bf.get_variables()  # [i, j]
            if len(vars_in_bf) == 2:
                i, j = vars_in_bf
                interaction_matrix[i, j] += 1
                interaction_matrix[j, i] += 1

# Heatmap
plt.figure(figsize=(8, 6))
plt.imshow(interaction_matrix, cmap='YlOrRd', interpolation='nearest')
plt.colorbar(label='Interaction Count')
plt.xlabel('Variable')
plt.ylabel('Variable')
plt.title('Variable Interactions in MARS Model')
plt.xticks(range(d)), plt.yticks(range(d))
plt.tight_layout()
plt.show()

Quantifying Interaction Strength

Relative Contribution

# Total sum of absolute coefficients by order
contributions = {}

anova = model.get_anova_decomposition()
for order, basis_list in anova.items():
    total_coef = sum(
        abs(model.coefficients_[model.basis_functions_.index(bf)])
        for bf in basis_list
    )
    contributions[order] = total_coef

# Normalize
total = sum(contributions.values())
relative = {k: v/total for k, v in contributions.items()}

print("Relative Contribution by Interaction Order:")
for order in sorted(relative.keys()):
    print(f"  Order {order}: {relative[order]:.1%}")

Example Output:

Relative Contribution by Interaction Order:
  Order 0 (constant): 5.2%
  Order 1 (main):    78.3%
  Order 2 (2-way):   16.5%

Interpretation

This model is: - Dominated by main effects (78%) - Minor interactions (17%) - Mostly additive structure

Compare Models

for max_degree in [1, 2, 3]:
    model = MARS(max_degree=max_degree)
    model.fit(X, y)

    anova = model.get_anova_decomposition()

    print(f"max_degree={max_degree}:")

    for order in sorted(anova.keys()):
        n_basis = len(anova[order])
        print(f"  Order {order}: {n_basis} basis functions")

Interaction Constraints

PyMARS respects interaction constraints during fitting:

max_degree constraint: Cannot create basis of degree > max_degree
Parent constraint: A basis can only be parent if not already degree max_degree
Forward pass: Only adds child bases respecting constraints

Example:

# Can't create 3-way without allowing it
model = MARS(max_degree=2)  # max degree 2

# Forward pass will never create a basis like:
# B = h(x0, t0, d0) * h(x1, t1, d1) * h(x2, t2, d2)

# But with max_degree=3, it can

Testing for Interactions

Statistical Test

from scipy.stats import f_oneway

# Additive model (no interactions)
model_1 = MARS(max_degree=1)
model_1.fit(X, y)
residuals_1 = y - model_1.predict(X)
ss_1 = np.sum(residuals_1**2)

# Interaction model
model_2 = MARS(max_degree=2)
model_2.fit(X, y)
residuals_2 = y - model_2.predict(X)
ss_2 = np.sum(residuals_2**2)

# F-test: Does adding interactions significantly improve fit?
n = len(X)
p1 = len(model_1.basis_functions_)
p2 = len(model_2.basis_functions_)

f_stat = ((ss_1 - ss_2) / (p2 - p1)) / (ss_2 / (n - p2))

from scipy.stats import f
p_value = 1 - f.cdf(f_stat, p2-p1, n-p2)

print(f"F-statistic: {f_stat:.4f}")
print(f"p-value: {p_value:.6f}")

if p_value < 0.05:
    print("Interactions are significant! ✓")
else:
    print("No significant interactions detected.")

Interpreting Results

Strong Main Effects, Weak Interactions

Model structure: .. math:

\hat{f} \approx f(x_0) + f(x_1) + f(x_2)

Indicates: - Variables act independently - Simple, interpretable model - Good for prediction and explanation

Weak Main Effects, Strong Interactions

Model structure: .. math:

\hat{f} \approx f(x_0, x_1) + f(x_1, x_2) + \ldots

Indicates: - Variables only matter in combination - Complex interdependencies - Harder to interpret but captures key insights

Mixed (Both Strong)

Model combines main + interaction effects: .. math:

\hat{f} \approx f(x_0) + f(x_1) + f(x_0, x_1) + \ldots

Indicates: - Variables have individual effects - Plus synergistic effects - Requires explaining both components

Best Practices

Start with max_degree=1 – ensures interpretability
Check if interactions help – compare R² and GCV
Visualize interactions – use surface plots
Quantify strength – relative contributions
Test significance – F-test or cross-validation
Interpret carefully – interactions are harder to explain

Summary

Key Points:

✅ MARS naturally finds interactions as basis products
✅ max_degree controls max interaction order
✅ ANOVA decomposes model by degree
✅ 2-way interactions usually sufficient
✅ Visualize with surface and heatmap plots
✅ Test significance before concluding interactions matter

When to Use Interactions:

✅ Domain knowledge suggests interactions
✅ R² improves significantly
✅ GCV score decreases with interactions
✅ You care about understanding relationships

When to Avoid:

❌ Need highest interpretability
❌ Sample size is very small
❌ Computational speed is critical
❌ Explanation to non-technical audience required