Statistical Inputs That Power Fantasy Projections
Every fantasy projection starts with raw data — specific, measurable inputs that get fed into a model before any output number appears. The quality of those inputs determines the quality of the projection, which is why understanding what goes into the machine matters as much as knowing how to read what comes out. This page covers the core statistical categories that projection systems draw on, how those inputs interact, and where the signal breaks down.
Definition and scope
A statistical input, in the context of fantasy projections, is any quantified data point used to estimate a player's future performance output. These inputs range from raw counting stats (carries, targets, innings pitched) to derived efficiency metrics (yards per route run, strikeout rate, true shooting percentage) to external contextual variables (Vegas implied team totals, injury reports, weather forecasts).
The scope is deliberately broad. A projection model might pull from 40 or more distinct input variables for a single player, weighting each one differently depending on position, sport, and scoring format. The data sources used in projections span official league tracking systems, sports reference databases, and third-party analytics providers. All of them feed models that are refreshed on the timelines documented at projection update schedules.
Two broad categories define the input landscape:
- Historical performance data — what the player has actually done, across multiple seasons and sample sizes
- Contextual or situational data — what conditions the player faces going forward, including opponent quality, role changes, and game environment
Neither category alone produces reliable projections. Historical data tells the story of who a player was; contextual data shapes the forecast for who they're likely to be.
How it works
Statistical inputs don't simply get averaged together. A projection system applies weights — either manually assigned by analysts or learned through machine learning — that reflect how predictive each input actually is across large populations of players. Machine learning in fantasy projections explores the algorithmic side of that weighting process in detail.
The basic pipeline works in three stages:
- Data ingestion — Raw stats are pulled from source databases (NFL Next Gen Stats, Statcast, NBA Second Spectrum, etc.) and cleaned for completeness and accuracy.
- Feature engineering — Raw counts get transformed into rate-based or efficiency metrics. A wide receiver's 80 targets become a 23.5% target share when divided by his team's total targets (for example, 80 of 340), which is a far more portable and stable signal than the raw number alone. Snap count and target share data breaks down why opportunity rates outperform raw volume as inputs.
- Weighted aggregation — Each engineered feature gets assigned a predictive weight derived from historical correlation with future output. Regression to the mean in fantasy is partly a story about what happens when analysts over-weight recent small-sample data and under-weight longer-term baselines.
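The three stages above can be sketched in a few lines of Python. This is a minimal illustration and not any particular projection system's method. The function names, the raw totals (including the 340 team targets and the route and yardage figures), and the weights are all hypothetical, chosen only so that the target share matches the 23.5% figure in the list.

```python
def target_share(player_targets: int, team_targets: int) -> float:
    """Feature engineering: convert a raw target count into a share of team targets."""
    if team_targets <= 0:
        raise ValueError("team_targets must be positive")
    return player_targets / team_targets


def weighted_projection(features: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted aggregation: sum each engineered feature times its weight."""
    return sum(weights[name] * value for name, value in features.items())


# Stage 1 (ingestion): hypothetical, already-cleaned season totals for one receiver.
raw = {"targets": 80, "team_targets": 340, "receiving_yards": 1020, "routes_run": 480}

# Stage 2 (feature engineering): rates instead of raw counts.
features = {
    "target_share": target_share(raw["targets"], raw["team_targets"]),
    "yards_per_route_run": raw["receiving_yards"] / raw["routes_run"],
}

# Stage 3 (weighted aggregation): illustrative weights, not fitted to real data.
weights = {"target_share": 40.0, "yards_per_route_run": 3.0}
projection = weighted_projection(features, weights)

print(f"target share: {features['target_share']:.1%}")
print(f"projection score: {projection:.2f}")
```

In a real system the weights would come from analysts or from a fitted model rather than being typed in by hand; the sketch only shows where they enter the calculation.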
A key distinction worth holding onto is the one between inputs that describe past performance and inputs that predict future performance. A quarterback's passer rating is easy to calculate from last season's data, but completion percentage over expected (CPOE), which accounts for the difficulty of each attempt, is a stronger predictor of future performance according to research published by ESPN's Stats & Information Group and later formalized through NFL Next Gen Stats public documentation.
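As a rough illustration of what "accounts for the difficulty of each attempt" means, CPOE can be read as actual completion percentage minus the completion percentage a model expected on those same throws. The sketch below assumes each attempt already carries an expected completion probability; the function name and the four sample attempts are hypothetical, and the model that would produce those probabilities is outside its scope.

```python
def cpoe(attempts: list[tuple[bool, float]]) -> float:
    """Completion percentage over expected, in percentage points.

    Each attempt is (completed, expected_completion_probability).
    """
    if not attempts:
        raise ValueError("need at least one attempt")
    actual_rate = sum(1 for completed, _ in attempts if completed) / len(attempts)
    expected_rate = sum(prob for _, prob in attempts) / len(attempts)
    return 100 * (actual_rate - expected_rate)


# Hypothetical attempts: an easy completion, two harder completions, one miss.
sample = [(True, 0.9), (True, 0.5), (False, 0.7), (True, 0.3)]
print(f"CPOE: {cpoe(sample):+.1f} percentage points")
```

Here the passer completed 75% of throws that were expected to be completed about 60% of the time, so the result is roughly 15 percentage points above expectation.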
Common scenarios
Three situations illustrate how input selection changes based on context.
Preseason projections lean heavily on historical baselines, adjusted for role changes. If a running back changes teams and inherits a 20-carry workload from an injured starter, the projection system needs to weigh his historical yards-per-carry efficiency against the new offensive line's blocking grades and the opponent schedule. Preseason versus in-season projections covers how input emphasis shifts once real-season data begins accumulating.
DFS single-game slates compress the time horizon and amplify contextual inputs. Vegas implied totals, spread, and game total become dominant variables because they encode the collective market's estimate of scoring environment. A game with an over/under of 51.5 points is expected to produce more fantasy-relevant volume than one at 38.5. Vegas lines and fantasy projections traces how that translation works mathematically.
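One common way to turn a game total and a spread into per-team implied totals is to split the total in half and shift each side by half the spread. The sketch below applies that arithmetic to the two over/unders mentioned above. The 3-point spread is a hypothetical value added for illustration, and the function name is not taken from any library.

```python
def implied_team_totals(game_total: float, favorite_spread: float) -> tuple[float, float]:
    """Return (favorite_total, underdog_total) implied by the betting line.

    favorite_spread is the favorite's line, conventionally negative (e.g. -3.0).
    """
    margin = abs(favorite_spread)
    favorite_total = (game_total + margin) / 2
    underdog_total = (game_total - margin) / 2
    return favorite_total, underdog_total


# Over/unders from the text, paired with a hypothetical 3-point spread.
for total in (51.5, 38.5):
    favorite, underdog = implied_team_totals(total, -3.0)
    print(f"O/U {total}: favorite {favorite}, underdog {underdog}")
```

With the same spread, the 51.5-point game implies 27.25 and 24.25 points for the two teams, against 20.75 and 17.75 in the 38.5-point game, which is the gap in scoring environment that a DFS model would weight heavily.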
Injury-adjusted projections require a different input set entirely. When a player is returning from a hamstring strain, snap count limits, practice participation reports, and historical recovery timelines for that injury type all enter the model as weighted inputs alongside traditional performance data. Injury adjustments in projections covers the specific methodologies applied here.
Decision boundaries
Not all statistical inputs improve a projection. Some add noise rather than signal, particularly when the underlying sample size is too small to stabilize. A starting pitcher with 3 appearances in a new ballpark doesn't have enough data for ballpark factor to function as a reliable input — forcing that variable into the model can actually reduce accuracy compared to leaving it out.
Sample size and projection reliability documents the specific thresholds at which common inputs become trustworthy: target share stabilizes around 50 targets for NFL receivers, and ERA stabilizes near 300 innings for MLB starters according to research published by FanGraphs. Because even a workhorse starter rarely pitches much more than 200 innings in a season, that figure makes single-season ERA too noisy to carry much weight on its own for projection purposes.
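A common way to act on a stabilization threshold is to shrink the observed rate toward a baseline, giving the observed data more weight as the sample grows. The sketch below is one such approach and not necessarily what any given projection system does. It assumes the stabilization point is the sample size at which observed data and the baseline get equal weight; the function names and the example rates are hypothetical.

```python
def observed_weight(sample_size: float, stabilization_point: float) -> float:
    """Weight given to the observed rate; equals 0.5 when sample_size == stabilization_point."""
    if sample_size < 0 or stabilization_point <= 0:
        raise ValueError("sample_size must be >= 0 and stabilization_point > 0")
    return sample_size / (sample_size + stabilization_point)


def shrunk_estimate(observed: float, baseline: float,
                    sample_size: float, stabilization_point: float) -> float:
    """Blend an observed rate with a baseline (regression toward the mean)."""
    w = observed_weight(sample_size, stabilization_point)
    return w * observed + (1 - w) * baseline


# Hypothetical rates: observed 0.30 against a baseline of 0.20, threshold of 50.
for n in (10, 50, 150):
    estimate = shrunk_estimate(0.30, 0.20, n, 50)
    print(f"n={n:>3}: weight on observed {observed_weight(n, 50):.2f}, estimate {estimate:.3f}")
```

With a small sample the estimate stays close to the baseline, which is one way to let a thin input enter the model without letting it dominate.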
The decision boundary question also applies to scoring format impact on projections. A reception, in PPR formats, is worth 1 point — which means a running back's target share becomes a first-order input, not a secondary one. The same player, the same raw stats, and a different scoring format can produce meaningfully different projection outputs purely because the input weights shift.
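The effect of scoring format can be shown with a single stat line scored two ways. The sketch below assumes a commonly used rule set (1 point per 10 rushing or receiving yards, 6 points per touchdown) with the per-reception value as the only thing that changes. The stat line and the function name are hypothetical, and real leagues vary in their settings.

```python
def fantasy_points(rush_yards: float, rec_yards: float, receptions: int,
                   touchdowns: int, points_per_reception: float) -> float:
    """Score a running back's stat line under an assumed, simplified rule set."""
    yardage_points = (rush_yards + rec_yards) / 10  # assumed: 1 point per 10 yards
    touchdown_points = 6 * touchdowns               # assumed: 6 points per touchdown
    reception_points = points_per_reception * receptions
    return yardage_points + touchdown_points + reception_points


# Same hypothetical stat line, two scoring formats.
line = dict(rush_yards=60, rec_yards=40, receptions=5, touchdowns=1)
standard = fantasy_points(**line, points_per_reception=0.0)
ppr = fantasy_points(**line, points_per_reception=1.0)
print(f"standard: {standard}, PPR: {ppr}")
```

The five receptions add 5 points under PPR and nothing under standard scoring, so any input that predicts receptions carries more weight in a PPR projection.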
The full projection ecosystem — how these inputs combine, how they're validated, and how they translate into actionable numbers — is documented across fantasyprojectionlab.com.