Interaction Analysis & ANOVA
Overview
MARS naturally identifies interaction terms (products of univariate hinges) and provides ANOVA decomposition to understand which variables interact and their magnitudes.
Interaction Concepts
What is an Interaction?
An interaction occurs when the effect of one variable depends on the value of another.
Example:
The true model: .. math:
y = x_0 + x_1 + x_0 \cdot x_1
has a 2-way interaction between \(x_0\) and \(x_1\):
Main effect of \(x_0\): \(\frac{\partial y}{\partial x_0} = 1 + x_1\)
Effect depends on \(x_1\): interaction!
Visually:
No interaction: With interaction:
y │ y │ ╱╱
│ ╱ ╱ │ ╱ ╱
│ ╱ ╱ │ ╱ ╱
│ ╱ ╱ │ ╱ ╱
└─────────────► x0 └──────────► x0
x1=0 (lines converge!)
(parallel)
MARS Basis Functions as Interactions
MARS represents interactions as products of univariate hinges:
Example MARS basis:
B_5 = h(x0, +, 0.5) * h(x1, +, 1.0)
Meaning: This basis "turns on" when:
- x0 > 0.5 AND
- x1 > 1.0
The product of hinges creates a localized interaction in that region.
Interaction Degree
Degree |
Form |
Interpretation |
|---|---|---|
0 |
Constant |
Baseline |
1 |
\(h(x_i, t, d)\) |
Main effect of \(x_i\) |
2 |
\(h(x_i, t_1, d_1) \times h(x_j, t_2, d_2)\) |
2-way interaction |
3 |
Product of 3 hinges |
3-way interaction |
k |
Product of k hinges |
k-way interaction |
Parameter: max_degree (default=1)
max_degree=1: No interactions (additive model)
max_degree=2: Up to 2-way interactions
max_degree=3: Up to 3-way interactions
Fitting Models with Interactions
Allowing Interactions
from pymars import MARS
# Additive (no interactions)
model_additive = MARS(max_degree=1)
# With 2-way interactions
model_2way = MARS(max_degree=2)
# With 3-way interactions
model_3way = MARS(max_degree=3)
Typically: - Start with max_degree=1 (simpler) - If R² is low, try max_degree=2 - Rarely need max_degree > 2
Example: Detecting Interactions
import numpy as np
from pymars import MARS
# Generate data with TRUE interaction
np.random.seed(42)
X = np.random.uniform(-1, 1, (200, 3))
# y = x0 + x1 + x0*x1 (interaction between x0, x1)
y = X[:, 0] + X[:, 1] + X[:, 0] * X[:, 1] + np.random.randn(200)*0.1
# Fit WITHOUT interactions
model_no_int = MARS(max_degree=1)
model_no_int.fit(X, y)
r2_no_int = model_no_int.score(X, y)
# Fit WITH interactions
model_with_int = MARS(max_degree=2)
model_with_int.fit(X, y)
r2_with_int = model_with_int.score(X, y)
print(f"Without interactions: R² = {r2_no_int:.4f}")
print(f"With interactions: R² = {r2_with_int:.4f}")
print(f"Improvement: {(r2_with_int - r2_no_int):.4f}")
Expected Output:
Without interactions: R² = 0.7234
With interactions: R² = 0.9876
Improvement: 0.2642
The significant improvement indicates the interaction was detected!
ANOVA Decomposition
ANOVA = Analysis of Variance, extended here to MARS basis functions.
Goal: Decompose the fitted function into:
where: - \(c_0\) = constant term - \(f_i(x_i)\) = main effect of variable \(i\) - \(f_{ij}(x_i, x_j)\) = 2-way interaction between \(i\) and \(j\)
Extracting ANOVA Decomposition
# Fit model with interactions
model = MARS(max_degree=2)
model.fit(X, y)
# Get ANOVA decomposition
anova = model.get_anova_decomposition()
# anova is a dict: {order: [basis_functions]}
# order=0: constant
# order=1: main effects
# order=2: 2-way interactions
# order=3: 3-way interactions
Inspecting Components
anova = model.get_anova_decomposition()
print("ANOVA Decomposition:\n")
# Constant term
if 0 in anova:
c0 = model.coefficients_[0]
print(f"Constant: {c0:.6f}\n")
# Main effects
if 1 in anova:
print("Main Effects (Degree 1):")
for bf in anova[1]:
idx = model.basis_functions_.index(bf)
coef = model.coefficients_[idx]
print(f" {bf} : coef = {coef:.6f}")
print()
# 2-way interactions
if 2 in anova:
print("2-way Interactions (Degree 2):")
for bf in anova[2]:
idx = model.basis_functions_.index(bf)
coef = model.coefficients_[idx]
print(f" {bf} : coef = {coef:.6f}")
Contribution to Prediction
Compute each term’s contribution to predictions:
def anova_contributions(model, X):
"""
Compute contribution of each ANOVA term to predictions.
Returns:
dict: {order: contributions}
"""
anova = model.get_anova_decomposition()
contributions = {0: np.full(len(X), model.coefficients_[0])}
for order, basis_list in anova.items():
if order == 0:
continue
contrib = np.zeros(len(X))
for bf in basis_list:
idx = model.basis_functions_.index(bf)
coef = model.coefficients_[idx]
bf_eval = bf.evaluate(X) # Evaluate basis
contrib += coef * bf_eval
contributions[order] = contrib
return contributions
Visualizing Interactions
Partial Effects Plot
For a 2-way interaction, plot the effect of varying both variables:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Grid of x0, x1 values
x0_range = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
x1_range = np.linspace(X[:, 1].min(), X[:, 1].max(), 50)
X0_grid, X1_grid = np.meshgrid(x0_range, x1_range)
# Predictions on grid (fix x2 at median)
x2_median = np.median(X[:, 2])
X_grid = np.column_stack([
X0_grid.ravel(),
X1_grid.ravel(),
np.full(X0_grid.size, x2_median)
])
y_grid = model.predict(X_grid).reshape(X0_grid.shape)
# 3D surface plot
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
surf = ax.plot_surface(X0_grid, X1_grid, y_grid, cmap='viridis', alpha=0.8)
ax.scatter(X[:, 0], X[:, 1], y, alpha=0.3, s=10)
ax.set_xlabel('x0'), ax.set_ylabel('x1'), ax.set_zlabel('y')
ax.set_title('Interaction Effect: y vs (x0, x1)')
fig.colorbar(surf, ax=ax, label='Predicted y')
plt.show()
Interaction Heatmap
Visualize which pairs of variables interact:
import matplotlib.pyplot as plt
# Count interactions per variable pair
d = X.shape[1]
interaction_matrix = np.zeros((d, d))
anova = model.get_anova_decomposition()
for order, basis_list in anova.items():
if order == 2:
for bf in basis_list:
# Extract variables from basis function
vars_in_bf = bf.get_variables() # [i, j]
if len(vars_in_bf) == 2:
i, j = vars_in_bf
interaction_matrix[i, j] += 1
interaction_matrix[j, i] += 1
# Heatmap
plt.figure(figsize=(8, 6))
plt.imshow(interaction_matrix, cmap='YlOrRd', interpolation='nearest')
plt.colorbar(label='Interaction Count')
plt.xlabel('Variable')
plt.ylabel('Variable')
plt.title('Variable Interactions in MARS Model')
plt.xticks(range(d)), plt.yticks(range(d))
plt.tight_layout()
plt.show()
Quantifying Interaction Strength
Relative Contribution
# Total sum of absolute coefficients by order
contributions = {}
anova = model.get_anova_decomposition()
for order, basis_list in anova.items():
total_coef = sum(
abs(model.coefficients_[model.basis_functions_.index(bf)])
for bf in basis_list
)
contributions[order] = total_coef
# Normalize
total = sum(contributions.values())
relative = {k: v/total for k, v in contributions.items()}
print("Relative Contribution by Interaction Order:")
for order in sorted(relative.keys()):
print(f" Order {order}: {relative[order]:.1%}")
Example Output:
Relative Contribution by Interaction Order:
Order 0 (constant): 5.2%
Order 1 (main): 78.3%
Order 2 (2-way): 16.5%
Interpretation
This model is: - Dominated by main effects (78%) - Minor interactions (17%) - Mostly additive structure
Compare Models
for max_degree in [1, 2, 3]:
model = MARS(max_degree=max_degree)
model.fit(X, y)
anova = model.get_anova_decomposition()
print(f"max_degree={max_degree}:")
for order in sorted(anova.keys()):
n_basis = len(anova[order])
print(f" Order {order}: {n_basis} basis functions")
Interaction Constraints
PyMARS respects interaction constraints during fitting:
max_degree constraint: Cannot create basis of degree > max_degree
Parent constraint: A basis can only be parent if not already degree max_degree
Forward pass: Only adds child bases respecting constraints
Example:
# Can't create 3-way without allowing it
model = MARS(max_degree=2) # max degree 2
# Forward pass will never create a basis like:
# B = h(x0, t0, d0) * h(x1, t1, d1) * h(x2, t2, d2)
# But with max_degree=3, it can
Testing for Interactions
Statistical Test
from scipy.stats import f_oneway
# Additive model (no interactions)
model_1 = MARS(max_degree=1)
model_1.fit(X, y)
residuals_1 = y - model_1.predict(X)
ss_1 = np.sum(residuals_1**2)
# Interaction model
model_2 = MARS(max_degree=2)
model_2.fit(X, y)
residuals_2 = y - model_2.predict(X)
ss_2 = np.sum(residuals_2**2)
# F-test: Does adding interactions significantly improve fit?
n = len(X)
p1 = len(model_1.basis_functions_)
p2 = len(model_2.basis_functions_)
f_stat = ((ss_1 - ss_2) / (p2 - p1)) / (ss_2 / (n - p2))
from scipy.stats import f
p_value = 1 - f.cdf(f_stat, p2-p1, n-p2)
print(f"F-statistic: {f_stat:.4f}")
print(f"p-value: {p_value:.6f}")
if p_value < 0.05:
print("Interactions are significant! ✓")
else:
print("No significant interactions detected.")
Interpreting Results
Strong Main Effects, Weak Interactions
Model structure: .. math:
\hat{f} \approx f(x_0) + f(x_1) + f(x_2)
Indicates: - Variables act independently - Simple, interpretable model - Good for prediction and explanation
Weak Main Effects, Strong Interactions
Model structure: .. math:
\hat{f} \approx f(x_0, x_1) + f(x_1, x_2) + \ldots
Indicates: - Variables only matter in combination - Complex interdependencies - Harder to interpret but captures key insights
Mixed (Both Strong)
Model combines main + interaction effects: .. math:
\hat{f} \approx f(x_0) + f(x_1) + f(x_0, x_1) + \ldots
Indicates: - Variables have individual effects - Plus synergistic effects - Requires explaining both components
Best Practices
Start with max_degree=1 – ensures interpretability
Check if interactions help – compare R² and GCV
Visualize interactions – use surface plots
Quantify strength – relative contributions
Test significance – F-test or cross-validation
Interpret carefully – interactions are harder to explain
Summary
Key Points:
✅ MARS naturally finds interactions as basis products
✅ max_degree controls max interaction order
✅ ANOVA decomposes model by degree
✅ 2-way interactions usually sufficient
✅ Visualize with surface and heatmap plots
✅ Test significance before concluding interactions matter
When to Use Interactions:
✅ Domain knowledge suggests interactions
✅ R² improves significantly
✅ GCV score decreases with interactions
✅ You care about understanding relationships
When to Avoid:
❌ Need highest interpretability
❌ Sample size is very small
❌ Computational speed is critical
❌ Explanation to non-technical audience required