Novel Method

The Complexity Navigator: Finding the Simplest Model That Actually Works

By Moonlit Social Labs · April 3, 2026 · 7 min read

Every researcher who has ever built a regression model has faced the same question: how complex should this be? Add too few predictors and you miss real relationships. Add too many and your model memorizes noise, producing results that look great on your data but fail to replicate.

Textbooks tell you to use AIC, BIC, or cross-validation. But these tools give you numbers for individual models — they don't show you the full landscape. The Complexity Navigator does.

The Bias-Variance Tradeoff, Visualized

The fundamental tension in modeling is between bias (systematic error from oversimplification) and variance (sensitivity to the specific sample you happened to collect). A model that's too simple has high bias — it misses real patterns. A model that's too complex has high variance — it captures patterns that are just noise.
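For squared-error loss, this tension has a standard textbook form. Assuming the data are generated as y = f(x) + ε with zero-mean noise of variance σ² (notation introduced here for illustration, not taken from the tool), the expected prediction error of a fitted model f̂ at a point x splits into three parts:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Simple models make the first term large, complex models make the second term large, and no model can reduce the third.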

In theory, there's a sweet spot. In practice, finding it requires trying multiple models and comparing them on meaningful criteria. That's tedious work, and most researchers either skip it entirely (defaulting to the model they planned to run) or do it informally (adding terms until the R-squared "looks good enough").

How the Complexity Navigator Works

Plain language

The Complexity Navigator fits models of increasing complexity to your data, starting from "just predict the average" and progressing through linear, quadratic, cubic, and higher-order polynomial models. At each step, it computes four quality metrics and plots them on a single path.

The result is a "complexity path" — a visual trajectory from simple to complex. You can see exactly where:

Adding complexity helps (prediction error drops, explained variance rises)
Returns diminish (each new term buys almost nothing)
Overfitting begins (training error keeps falling but test error starts rising)
The recommended "natural resting point" sits — the simplest model that captures what's real
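To make the path concrete, here is a minimal Python sketch of the idea. It is not the tool's actual implementation: it assumes a single predictor, uses one train/test split where the tool uses cross-validation, and runs on synthetic data. The function name complexity_path and every parameter value are illustrative.

```python
import numpy as np

def complexity_path(x, y, max_degree=6, test_fraction=0.3, seed=0):
    """Fit polynomials of degree 0..max_degree.

    Returns a list of (degree, train_rmse, test_rmse) tuples,
    ordered from simplest to most complex.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_test = int(len(x) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]

    def rmse(coefs, idx):
        residuals = y[idx] - np.polyval(coefs, x[idx])
        return float(np.sqrt(np.mean(residuals ** 2)))

    path = []
    for degree in range(max_degree + 1):
        # Degree 0 is the "just predict the average" model.
        coefs = np.polyfit(x[train_idx], y[train_idx], degree)
        path.append((degree, rmse(coefs, train_idx), rmse(coefs, test_idx)))
    return path

# Synthetic data (an assumption for this sketch): a curved signal plus noise.
data_rng = np.random.default_rng(42)
x = data_rng.uniform(0, 10, 120)
y = 40 + 9 * x - 0.5 * x**2 + data_rng.normal(0, 5, 120)

path = complexity_path(x, y)
for degree, train_rmse, test_rmse in path:
    print(f"degree {degree}: train RMSE {train_rmse:.2f}, test RMSE {test_rmse:.2f}")
```

Training RMSE can only fall as the degree grows, because each model contains the previous one as a special case. The held-out RMSE is the column to watch for the point where extra terms stop helping.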

Technical details

The tool fits a complexity ladder: intercept-only, linear, polynomial(2), ..., polynomial(k). For multivariate predictors, interaction terms are included at each step. At each rung, it computes:

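AIC (Akaike, 1974): adds a penalty of 2k to the fit term, where k here is the number of estimated parameters. Tends to favor slightly more complex models than BIC.
BIC (Schwarz, 1978): penalizes by k ln(n), where n is the sample size. This exceeds AIC's 2k penalty whenever n ≥ 8 and keeps growing with n, so BIC tends to favor simpler models.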
Adjusted R²: variance explained, adjusted for the number of parameters. Prevents the illusion that more terms always improve fit.
k-fold CV RMSE: out-of-sample prediction error from cross-validation. Of the four metrics, it is the most direct estimate of how well the model generalizes to new data.
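The four metrics can be sketched for a single rung of the ladder as follows. This is an illustration under stated assumptions, not the tool's code: it assumes one predictor, Gaussian errors (so the likelihood term reduces to n ln(RSS/n) with constants dropped), and counts only the polynomial coefficients in k. The function name fit_metrics and the fold count are hypothetical choices.

```python
import numpy as np

def fit_metrics(x, y, degree, n_folds=5, seed=0):
    """AIC, BIC, adjusted R^2 and k-fold CV RMSE for one polynomial degree."""
    n = len(y)
    k = degree + 1  # fitted coefficients, intercept included (assumption)

    coefs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coefs, x)) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))

    # Gaussian-error forms, additive constants dropped.
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    adj_r2 = 1 - (rss / (n - k)) / (tss / (n - 1))

    # k-fold cross-validation: refit on each training split, score the held-out fold.
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    squared_errors = []
    for fold in folds:
        train = np.setdiff1d(np.arange(n), fold)
        fold_coefs = np.polyfit(x[train], y[train], degree)
        squared_errors.append((y[fold] - np.polyval(fold_coefs, x[fold])) ** 2)
    cv_rmse = float(np.sqrt(np.mean(np.concatenate(squared_errors))))

    return {"aic": float(aic), "bic": float(bic),
            "adj_r2": float(adj_r2), "cv_rmse": cv_rmse}

# Synthetic data (an assumption for this sketch).
data_rng = np.random.default_rng(42)
x = data_rng.uniform(0, 10, 120)
y = 40 + 9 * x - 0.5 * x**2 + data_rng.normal(0, 5, 120)

for degree in range(5):
    print(degree, fit_metrics(x, y, degree))
```

Only differences in AIC or BIC between models fitted to the same data are meaningful, which is why dropping the additive constants does no harm here.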

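The natural resting point is the model with the lowest BIC. The tool also reports the simplest model within ΔBIC < 2 of the best, since models within 2 BIC units are considered roughly equivalent. The threshold mirrors Burnham & Anderson's rule of thumb, which was stated for AIC differences; Kass & Raftery's (1995) evidence scale likewise treats a BIC difference below 2 as weak.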
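The selection rule is short enough to sketch directly. The function name natural_resting_point and the BIC values below are hypothetical, chosen so that the two reported models differ; they are not output from the tool.

```python
def natural_resting_point(bic_by_model, threshold=2.0):
    """Return (lowest-BIC model, simplest model within `threshold` of it).

    bic_by_model: list of (name, bic) pairs ordered from simplest to most complex.
    """
    best_name, best_bic = min(bic_by_model, key=lambda item: item[1])
    # The list is ordered by complexity, so the first match is the simplest one.
    simplest_name = next(
        name for name, bic in bic_by_model if bic - best_bic < threshold
    )
    return best_name, simplest_name

# Illustrative values only.
ladder = [
    ("intercept", 700.0),
    ("linear", 612.0),
    ("quadratic", 589.0),
    ("cubic", 588.2),
]
print(natural_resting_point(ladder))
```

Here the cubic has the lowest BIC, but the quadratic sits only 0.8 units above it, so the quadratic is the simplest model inside the equivalence zone.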

Additional diagnostics:

Overfitting flag: triggered when CV RMSE rises while training RMSE continues to fall.
Diminishing returns flag: triggered when the marginal gain in adjusted R² drops below 1%.
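Both flags reduce to comparisons between neighboring rungs. The sketch below is one possible reading, not the tool's implementation: it treats "below 1%" as an absolute gain of less than 0.01 in adjusted R², which is an assumption, and the function names and sample values are made up for illustration.

```python
def overfitting_steps(train_rmse, cv_rmse):
    """Indices where CV RMSE rose from the previous rung while training RMSE fell."""
    return [
        i for i in range(1, len(cv_rmse))
        if cv_rmse[i] > cv_rmse[i - 1] and train_rmse[i] < train_rmse[i - 1]
    ]

def diminishing_returns_steps(adj_r2, min_gain=0.01):
    """Indices where the gain in adjusted R^2 over the previous rung is below min_gain."""
    return [
        i for i in range(1, len(adj_r2))
        if adj_r2[i] - adj_r2[i - 1] < min_gain
    ]

# Illustrative values for rungs 0..4 (intercept-only through quartic).
train_rmse = [12.0, 8.7, 7.4, 7.3, 7.1]
cv_rmse = [12.1, 8.9, 7.7, 7.8, 8.0]
adj_r2 = [0.00, 0.47, 0.61, 0.615, 0.616]

print(overfitting_steps(train_rmse, cv_rmse))
print(diminishing_returns_steps(adj_r2))
```

With these numbers both flags first fire at index 3, the cubic rung, which is the pattern you would expect just past the resting point.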

Honest Assessment

The Complexity Navigator packages well-established model selection theory (AIC, BIC, cross-validation) into a single navigable interface. The math is not new. The value is in the unified view — seeing the full complexity path at once rather than manually comparing models. The "Natural Resting Point" metaphor is made concrete through BIC minimization and the ΔBIC < 2 equivalence zone.

When to Use It

The Complexity Navigator is useful whenever you're:

Building a regression model and unsure whether to include quadratic or interaction terms
Exploring a new dataset and want to understand the functional form of a relationship
Responding to a reviewer who asks "did you check for non-linear effects?"
Teaching students about the bias-variance tradeoff with a concrete, visual tool

Example

You have data on study hours (x) and exam scores (y) for 120 students. You suspect the relationship might be non-linear — maybe there are diminishing returns at high study hours.

The Complexity Navigator fits models from intercept-only through polynomial(6) and reports:

Linear model: Adj. R² = 0.47, BIC = 612
Quadratic model: Adj. R² = 0.61, BIC = 589 ← Natural resting point
Cubic model: Adj. R² = 0.62, BIC = 591 (ΔBIC ≈ 2 from the quadratic, at the edge of the equivalence zone, and no meaningful gain)
Polynomial(4+): CV RMSE starts rising. Overfitting flag triggered.

Conclusion: the relationship between study hours and exam scores is curved (quadratic), with diminishing returns. There is no evidence for more complex functional forms.

Try It

The Complexity Navigator is available in the Novel Methods module. Enter your X and Y values and see the full complexity path in seconds.

