PyMARS: MARS Implementation (Friedman 1991)
PyMARS is an educational implementation of Multivariate Adaptive Regression Splines (MARS) based on Jerome Friedman’s seminal 1991 paper, extended with cubic spline support and full interaction-term analysis.
What is MARS?
MARS is a nonparametric regression technique that:
✅ Automatically identifies relevant variables from high-dimensional data
✅ Detects two-way (or higher) interaction effects
✅ Produces continuous piecewise-linear models from hinge basis functions, with an optional cubic variant that smooths the kinks at the knots
✅ Provides interpretable decomposition via ANOVA
✅ Targets moderate problem sizes: roughly 3–20 predictors and 50–1000 observations, the setting Friedman (1991) has in mind
✅ Applies data-driven knot selection with GCV-based model selection
MARS combines the flexibility of recursive partitioning (tree-based methods) with the smoothness of piecewise regression and cubic extensions.
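To make the hinge basis concrete, here is a minimal, dependency-free sketch of how a single basis function can be evaluated. It is not PyMARS code: the helper names `hinge` and `basis_value` and the `(variable_index, knot, direction)` tuple layout are assumptions chosen for illustration only.

```python
def hinge(x, knot, direction):
    """Return (direction * (x - knot))_+ for a scalar x; direction is +1 or -1."""
    return max(0.0, direction * (x - knot))


def basis_value(row, factors):
    """Evaluate one basis function: a product of hinge factors.

    Each factor is a (variable_index, knot, direction) tuple (hypothetical layout).
    One factor gives a main effect; two factors give a two-way interaction.
    """
    value = 1.0
    for var, knot, direction in factors:
        value *= hinge(row[var], knot, direction)
    return value


# A reflected pair on variable 0 with knot t = 1.0, as added in one forward step.
right = [(0, 1.0, +1)]   # (x0 - 1)_+
left = [(0, 1.0, -1)]    # (1 - x0)_+

# A degree-2 interaction between variables 1 and 2.
interaction = [(1, 0.0, +1), (2, 0.5, -1)]   # (x1 - 0)_+ * (0.5 - x2)_+

row = [2.5, 1.0, -0.5]
values = [basis_value(row, f) for f in (right, left, interaction)]
print(values)
```

Only one member of a reflected pair is nonzero at any given point, which is what lets the fitted model change slope on either side of a knot.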
Key Features
Core Capabilities
| Feature | Description | Status |
|---|---|---|
| Forward Pass | Greedy basis pair addition | ✅ Complete |
| Backward Pruning | GCV-optimized reduction | ✅ Complete |
| Basis Functions | Hinge products: \(h(x,t,d) = (d(x-t))_+\) | ✅ Complete |
| Knot Selection | Minspan/Endspan from Friedman (1991) | ✅ Complete |
| Interactions | Max-degree detection & visualization | ✅ Complete |
| Cubic Splines | Continuous 2nd derivative | ✅ Complete |
| ANOVA Decomposition | Univariate & interaction effects | ✅ Complete |
| GCV Model Selection | Generalized cross-validation with penalty | ✅ Complete |
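The GCV criterion used for backward pruning can be sketched in a few lines. This follows the general form in Friedman (1991), where the mean squared residual is inflated by an effective parameter count of the number of basis functions plus a penalty `d` per non-constant term. The function name `gcv`, its signature, the default `penalty=3.0`, and the RSS values are illustrative assumptions, not the PyMARS API or real results.

```python
def gcv(rss, n_samples, n_terms, penalty=3.0):
    """Generalized cross-validation score for a model with n_terms basis functions.

    n_terms counts the constant term and is assumed to equal the rank of the
    basis matrix. penalty is the per-knot cost d (assumed default of 3).
    """
    n_nonconstant = n_terms - 1
    effective_params = n_terms + penalty * n_nonconstant
    if effective_params >= n_samples:
        # The correction factor breaks down; treat the model as inadmissible.
        return float("inf")
    return (rss / n_samples) / (1.0 - effective_params / n_samples) ** 2


# Hypothetical pruning sequence: number of terms -> residual sum of squares.
candidates = {3: 52.0, 7: 30.0, 15: 27.5}
scores = {m: gcv(rss, n_samples=200, n_terms=m) for m, rss in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores[best])
```

In this made-up sequence the 15-term model has the lowest RSS, yet the 7-term model wins on GCV because the extra terms do not reduce the error enough to pay for their penalty.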
Theory & References
This implementation fully adheres to:
Friedman, J.H. (1991) – “Multivariate Adaptive Regression Splines.” The Annals of Statistics, 19(1), 1–67. [PDF](https://projecteuclid.org/euclid.aos/1176347963)
Friedman, J.H. & Silverman, B.W. (1989) – “Flexible Parsimonious Smoothing and Additive Modeling.” Technometrics, 31(1), 3–21.
Every major algorithm, equation, and formula from the original paper is reproduced and verified.
Quick Start
Installation
git clone https://github.com/abder111/pymars.git
cd pymars
pip install -e .
Basic Usage
from pymars import MARS
import numpy as np
# Generate data
X = np.random.randn(200, 5)
y = X[:, 0]**2 + 2*X[:, 1]*X[:, 2] + np.random.randn(200)*0.1
# Fit MARS model
model = MARS(max_terms=20, max_degree=2)
model.fit(X, y)
# Predictions
y_pred = model.predict(X)
# Model summary
model.summary()
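The snippet above predicts on the same data it was fitted on, which tends to overstate accuracy. A minimal hold-out sketch is shown below; it uses only the standard library, and the helpers `holdout_indices` and `r2_score` are hypothetical names introduced here, not part of PyMARS. The commented lines assume only the `fit` and `predict` methods shown above.

```python
import random


def holdout_indices(n_samples, test_fraction=0.25, seed=0):
    """Shuffle row indices and split them into (train, test) lists."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


train_idx, test_idx = holdout_indices(200)

# With the X, y, and model from the example above, the split would be used as:
#   model.fit(X[train_idx], y[train_idx])
#   print(r2_score(list(y[test_idx]), list(model.predict(X[test_idx]))))
```

A fixed `seed` keeps the split reproducible between runs, which makes comparisons across `max_terms` or `max_degree` settings easier to interpret.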
Documentation
Getting Started
- Installation
- User Guide
- Tutorial
- Complete Step-by-Step Example
- Step 1: Data Generation
- Step 2: Create and Fit MARS Model
- Step 3: Model Evaluation
- Step 4: Model Summary
- Step 5: Feature Importance
- Step 6: Visualize Predictions
- Step 7: Basis Functions
- Step 8: ANOVA Decomposition
- Step 9: Compare Linear vs Cubic
- Step 10: Partial Effects (1D Slices)
- Step 11: Cross-Validation
- Step 12: Parameter Tuning
- Step 13: Final Model with Optimal Parameters
- Complete Code
- Key Takeaways
- Next Steps
Theory & Algorithms
API Reference
Project Status
✅ Production Ready – Version 0.1.0
✅ All 11 bugs fixed & verified
✅ 55+ comprehensive tests (100% passing)
✅ Friedman 1991 compliance confirmed
✅ Cubic implementation tested
✅ Full documentation generated
Development Team
ES-SAFI ABDERRAHMAN – Lead Developer
LAMGHARI YASSINE – Core Developer
CHAIBOU SAIDOU ABDOUYE – Core Developer
Repository
License: MIT (2025)
License
MIT License
Copyright (c) 2025 ES-SAFI ABDERRAHMAN, LAMGHARI YASSINE, CHAIBOU SAIDOU ABDOUYE
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software...
See LICENSE file for full text.