Methodology
Golf Data Viz estimates strokes gained from round-level scorecard stats. This page explains the formulas, data sources, confidence levels, and known limitations. New to strokes gained? Start with our beginner's guide.
SG Formulas (v3.3.0)
These formulas reflect the current production methodology version. Coefficients may change in future releases as the model is recalibrated.
Off the Tee
Weights: 6.0 (FIR), 0.8 (penalties)
(FIR% − peerFIR%) × 6.0 + (peerPenalties − penalties) × 0.8
Approach
Weight: 8.0
(GIR/18 − peerGIR%) × 8.0
Around the Green
Weight: 5.0
(scrambleRate − peerScramble%) × 5.0
ATG-fallback uses a reduced calibration weight from the table below so missing short-game inputs do not over-attribute misses to chipping.
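The three formulas above (Off the Tee, Approach, Around the Green) can be sketched as follows. This is a minimal illustration, not the production code: the function and parameter names are hypothetical, and it assumes every percentage is passed as a fraction between 0 and 1 (so that GIR/18 and peerGIR% share a scale).

```python
def off_the_tee_sg(fir_pct, peer_fir_pct, penalties, peer_penalties,
                   w_fir=6.0, w_pen=0.8):
    """OTT estimate: fairway-rate delta plus penalty delta, each weighted."""
    return (fir_pct - peer_fir_pct) * w_fir + (peer_penalties - penalties) * w_pen


def approach_sg(gir, peer_gir_pct, weight=8.0):
    """Approach estimate: greens hit out of 18 versus the peer GIR rate."""
    return (gir / 18 - peer_gir_pct) * weight


def around_the_green_sg(scramble_rate, peer_scramble_pct, weight=5.0):
    """ATG estimate: scramble-rate delta; pass a reduced weight for ATG-fallback."""
    return (scramble_rate - peer_scramble_pct) * weight
```

For example, `around_the_green_sg(0.3, 0.4, weight=2.5)` would apply the reduced ATG-fallback weight from the calibration table instead of the default 5.0.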
Putting
Weight: 4.0
(expectedPutts − actualPutts) / 18 × 4.0
expectedPutts = GIR × puttsPerGIR + (18 − GIR) × puttsPerNonGIR. Falls back to (peerPutts/18 − playerPutts/18) when GIR is unavailable. Optional three-putt input adds a bounded adjustment of up to ±0.50 strokes.
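A minimal sketch of the putting calculation, under stated assumptions: the names are hypothetical, the fallback per-hole delta is assumed to receive the same 4.0 weight as the GIR-adjusted path, and the three-putt adjustment is taken as an already-computed number because its derivation is not described here. Only its ±0.50 bound is applied.

```python
from typing import Optional


def putting_sg(actual_putts: float,
               peer_putts: float,
               gir: Optional[int] = None,
               putts_per_gir: Optional[float] = None,
               putts_per_non_gir: Optional[float] = None,
               weight: float = 4.0,
               three_putt_adjustment: float = 0.0) -> float:
    """Putting estimate with GIR-adjusted expectation and a peer-putts fallback."""
    if gir is not None and putts_per_gir is not None and putts_per_non_gir is not None:
        # GIR-adjusted expectation: different putt rates on hit vs. missed greens.
        expected = gir * putts_per_gir + (18 - gir) * putts_per_non_gir
    else:
        # Fallback when GIR is unavailable: compare against peer putts per round.
        expected = peer_putts
    base = (expected - actual_putts) / 18 * weight
    # The optional three-putt adjustment is bounded to +/-0.50 strokes.
    bounded = max(-0.5, min(0.5, three_putt_adjustment))
    return base + bounded
```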
Total Anchor
Total SG is anchored to a peer expectation so that category values sum to a coherent total. Two modes exist:
Course-Adjusted (preferred)
When course rating and slope are available, total SG is anchored to a course-adjusted peer expectation:
peerExpectation = courseRating + (handicapIndex × slopeRating / 113)
totalSG = peerExpectation − actualScore
This is based on the standard USGA expected-score formula. A positive value means the player scored better than expected.
Course-Neutral (fallback)
When course metadata is missing or invalid, the tool falls back to a course-neutral estimate and labels it accordingly:
totalSG = benchmark.averageScore − actualScore
Course-neutral mode activates when course rating is 0 or slope is outside the valid 55–155 range.
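The two anchor modes and the switch between them can be sketched like this. The function name and the returned mode labels are hypothetical; treating a missing rating or slope (`None`) the same as an invalid one is an assumption.

```python
def total_sg(actual_score, handicap_index, benchmark_average_score,
             course_rating=None, slope_rating=None):
    """Return (total SG, mode label) using the course-adjusted anchor when possible."""
    course_data_valid = (
        course_rating is not None and course_rating > 0
        and slope_rating is not None and 55 <= slope_rating <= 155
    )
    if course_data_valid:
        # Course-adjusted peer expectation.
        peer_expectation = course_rating + handicap_index * slope_rating / 113
        return peer_expectation - actual_score, "course-adjusted"
    # Course-neutral fallback: compare against the bracket's average score.
    return benchmark_average_score - actual_score, "course-neutral"
```

A 10-handicap shooting 85 on a course rated 72.0 with slope 113 has a peer expectation of 82, so `total_sg(85, 10.0, 84.0, course_rating=72.0, slope_rating=113)` gives −3.0 in course-adjusted mode. With the course data missing, the same round gives −1.0 against a benchmark average of 84.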
Calibration
Raw stat deltas (e.g., FIR% difference, GIR difference) are multiplied by calibrated coefficients to produce category SG values. Coefficients are versioned separately (seed-1.1.0) and may be updated independently of the methodology version.
Input paths
Coefficients vary by the data available for each round:
- Full — GIR provided and up-and-down data available (or no missed greens)
- GIR-estimated — GIR not provided by the user, estimated from scoring distribution
- ATG-fallback — GIR provided but no up-and-down data with missed greens
| Profile | OTT FIR | OTT Pen | Approach | ATG | Putting |
|---|---|---|---|---|---|
| Full | 6 | 0.8 | 8 | 5 | 4 |
| GIR-estimated | 6 | 0.8 | 6.5 | 4 | 4 |
| ATG-fallback | 6 | 0.8 | 8 | 2.5 | 4 |
Coefficients are model artifacts, not hardcoded truths. The current seed coefficients (seed-1.1.0) are derived from heuristic analysis and will be empirically recalibrated as production data accumulates.
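One way to picture the input paths and the table above is a lookup keyed by path. The dictionary keys and the `select_input_path` helper are hypothetical names for illustration; the coefficient values are copied from the table.

```python
# Seed coefficients (seed-1.1.0) from the table above, keyed by input path.
COEFFICIENTS = {
    "full":          {"ott_fir": 6.0, "ott_pen": 0.8, "approach": 8.0, "atg": 5.0, "putting": 4.0},
    "gir_estimated": {"ott_fir": 6.0, "ott_pen": 0.8, "approach": 6.5, "atg": 4.0, "putting": 4.0},
    "atg_fallback":  {"ott_fir": 6.0, "ott_pen": 0.8, "approach": 8.0, "atg": 2.5, "putting": 4.0},
}


def select_input_path(gir_provided, up_and_down_available, missed_greens):
    """Pick the coefficient profile from the data available for a round."""
    if not gir_provided:
        return "gir_estimated"
    if up_and_down_available or missed_greens == 0:
        return "full"
    return "atg_fallback"
```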
Reconciliation
After calibration, per-category values may not sum exactly to the total anchor. Reconciliation scales categories so they sum to the anchor value.
Confidence-weighted scaling: Lower-confidence categories absorb more of the adjustment. A category rated “Low” confidence will shift more than one rated “High.”
Scale factor: The maximum proportional change applied to any category. A scale factor near 0 means strong alignment between the calibrated sum and the anchor. A factor above 0.5 triggers an “excessive scaling” flag.
Sign-flip prevention: If a reconciliation adjustment would reverse a category's sign (e.g., turning a positive into a negative), the category is clamped to zero instead. The excess amount is tracked as “Other (not captured by scorecard stats)” and shown on the results page when non-zero.
Signal breakdown: The info icon next to each category shows the raw signal value and reconciliation adjustment separately, so you can see how much of the final value came from your scorecard data vs. the reconciliation step.
Skipped categories (value = 0) are excluded from reconciliation. Final categories sum to the total anchor within ±0.1.
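The reconciliation rules can be illustrated with a simplified sketch. The exact weighting scheme is not specified on this page, so the confidence shares below (High 1, Med 2, Low 3) are invented for illustration, as are the function and key names. The sketch also assumes the "Other" bucket counts toward the anchor total.

```python
# Hypothetical shares: lower confidence absorbs more of the adjustment.
CONFIDENCE_SHARE = {"High": 1.0, "Med": 2.0, "Low": 3.0}


def reconcile(values, confidence, anchor):
    """Spread the gap to the anchor across non-zero categories by confidence share."""
    gap = anchor - sum(values.values())
    active = [name for name, value in values.items() if value != 0]  # skipped stay at 0
    final = dict(values)
    other = 0.0
    scale_factor = 0.0
    if not active:
        other = gap  # nothing to scale, so the whole gap is unattributed
    else:
        total_share = sum(CONFIDENCE_SHARE[confidence[name]] for name in active)
        for name in active:
            raw = values[name]
            adjusted = raw + gap * CONFIDENCE_SHARE[confidence[name]] / total_share
            if adjusted * raw < 0:
                # Sign-flip prevention: clamp to zero and track the excess as "Other".
                other += adjusted
                adjusted = 0.0
            final[name] = adjusted
            scale_factor = max(scale_factor, abs(adjusted - raw) / abs(raw))
    return {
        "categories": final,
        "other": other,
        "scale_factor": scale_factor,
        "excessive_scaling": scale_factor > 0.5,
    }
```

In this sketch the categories plus "Other" always sum to the anchor, because every stroke removed by a clamp is moved into "Other" rather than dropped.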
Attribution Correction (ac-1.0.0)
Between calibration and reconciliation, an attribution correction layer detects divergence patterns between Off the Tee and Approach and redistributes a bounded amount of strokes between them.
Why
In scorecard-based models, some tee-shot value leaks into Approach when longer or more playable drives create shorter approach shots. A player with low FIR% but high GIR% may be under-credited on OTT and over-credited on Approach (or vice versa).
How
The correction measures the divergence between a player's FIR% delta and GIR% delta versus their handicap peers. When the divergence exceeds a deadzone (±0.05), a bounded zero-sum adjustment shifts strokes between OTT and Approach.
divergence = (playerGIR% − peerGIR%) − (playerFIR% − peerFIR%)
correction = clamp(divergence × 0.6 × pathMultiplier × confidenceGate × shrinkage, −0.5, +0.5)
OTT += correction, Approach −= correction
Invariants
- Zero-sum — OTT adjustment + Approach adjustment = 0. Putting and Around the Green are never modified.
- Bounded — hard-capped at ±0.5 strokes maximum shift per round.
- Confidence-gated — skipped entirely when FIR data is missing (OTT confidence “Low”). Reduced by 50% when GIR is estimated (Approach confidence “Med”).
- Path-dependent — enabled for Full and GIR-estimated input paths. Disabled for ATG-fallback.
Divergence can be caused by driving distance, par mix, course setup, or round noise — not exclusively distance. This correction partially mitigates the attribution limitation but does not replace shot-level distance data.
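The correction and its invariants can be sketched as below. Several details are assumptions rather than documented behavior: percentages are fractions between 0 and 1, the path multiplier is modeled as simply on (1.0) or off (0.0), the deadzone gates the correction rather than being subtracted from the divergence, and `shrinkage` defaults to 1.0 because its value is not given here. All names are hypothetical.

```python
def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def attribution_correction(player_gir_pct, peer_gir_pct,
                           player_fir_pct, peer_fir_pct,
                           input_path, fir_available=True, shrinkage=1.0):
    """Strokes to move from Approach to OTT (negative moves them the other way)."""
    # Skipped when FIR data is missing, and disabled for the ATG-fallback path.
    if not fir_available or input_path == "atg_fallback":
        return 0.0
    divergence = (player_gir_pct - peer_gir_pct) - (player_fir_pct - peer_fir_pct)
    if abs(divergence) <= 0.05:  # inside the deadzone: no correction
        return 0.0
    # Reduced by 50% when GIR is estimated.
    confidence_gate = 0.5 if input_path == "gir_estimated" else 1.0
    return clamp(divergence * 0.6 * confidence_gate * shrinkage, -0.5, 0.5)


def apply_correction(ott, approach, correction):
    """Zero-sum shift: whatever OTT gains, Approach loses."""
    return ott + correction, approach - correction
```

For a player 20 points above peers on GIR% and 20 points below on FIR%, the divergence is 0.4 and the Full-path correction is about 0.24 strokes moved from Approach to OTT.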
Data Sources & Citations
All benchmark metrics used in the SG calculation are source-locked and versioned. Six of the seven tracked metrics have published-source coverage for at least some brackets; one remains unsourced.
Confidence Levels
Confidence is assigned separately for each category based on the calculation path used for that category, not as an overall rating of the full round. Each category displays a confidence badge reflecting the quality of data available for that estimate.
| Level | Meaning | Category Examples |
|---|---|---|
| High | Direct data provided by user | Putting (total putts), Approach (GIR provided), ATG (up-and-down data) |
| Med | Derived from related inputs | OTT (FIR-only — no distance/miss quality), Approach (GIR estimated), ATG (from GIR + scoring) |
| Low | Limited data available | OTT (penalties only), ATG (estimated from estimated GIR) |
Plus Handicaps
Plus-handicap rounds are supported using extrapolated peer benchmarks below scratch. For category comparisons, we project the 0–5 handicap trend below 0 where the source data behaves sensibly. FIR% stays fixed at the scratch benchmark because the underlying benchmark data is non-monotonic, and GIR% is capped at 80% to avoid unrealistic elite values. Total strokes gained still uses the player's actual plus handicap in the course-adjusted calculation. This replaces the previous scratch-clamped approach, which could flatten category results and make it harder for plus-handicap players to see what to work on.
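As a rough sketch of the extrapolation idea, assuming a straight-line projection through benchmark values at handicap 0 and handicap 5 (the actual anchor positions and interpolation method may differ), with the function name and parameters being hypothetical:

```python
def extrapolate_below_scratch(handicap_index, value_at_0, value_at_5,
                              floor=None, cap=None, frozen=False):
    """Project a benchmark metric below scratch; plus handicaps are negative here."""
    if handicap_index >= 0:
        raise ValueError("extrapolation only applies to plus handicaps (index < 0)")
    if frozen:
        # For example FIR%, which stays fixed at the scratch benchmark.
        return value_at_0
    slope = (value_at_5 - value_at_0) / 5.0  # change per handicap stroke, 0 to 5
    projected = value_at_0 + slope * handicap_index
    if cap is not None:
        projected = min(projected, cap)  # for example the 80% GIR cap
    if floor is not None:
        projected = max(projected, floor)
    return projected
```

With made-up benchmark values of 60% GIR at scratch and 50% at a 5 handicap, a +2 player (`handicap_index=-2.0`) would project to 64%, and the cap would hold any projection at 80%.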
Off the Tee Limitations
Off the Tee is primarily driven by fairway hit rate, without shot-level distance context, so it cannot distinguish a short straight drive from a long playable miss. It also does not capture miss direction, playable-vs-dead misses, or per-hole recovery context. Because of that, OTT is a medium-confidence category: a missed fairway can mean anything from “still a great drive” to “hole effectively over.” Our current attribution correction partially reduces obvious OTT/Approach cross-contamination, but higher-confidence OTT will require richer inputs such as distance and miss quality in a future phase.
Scorecard SG vs Shot-Level SG
True strokes gained (as used by the PGA Tour) measures each shot's start and end position against expected strokes from that location. This requires GPS or shot-tracking hardware.
Scorecard SG estimates category-level performance from aggregate round stats (fairways hit, GIR, putts, penalties, scoring distribution). It answers “where am I gaining/losing relative to my handicap peers?” without shot-level data.
Scorecard SG is directionally useful for practice prioritization but cannot capture within-category nuance (e.g., putt starting distances, approach miss direction, driving distance).
Assumptions & Limitations
- Scorecard-based model — not true SG Putting (requires putt starting distances)
- Composite averages — weighted composites from public reports, not a single sampled dataset
- Weights — seed coefficients (seed-1.1.0) derived from heuristic analysis, subject to empirical recalibration as production data accumulates
- OTT→Approach attribution — in scorecard-based models, some tee-shot value can appear in Approach when longer or more playable drives create shorter approach shots. The attribution correction layer (ac-1.0.0) partially mitigates this using FIR/GIR divergence patterns, but full correction requires shot-level distance and lie data.
- Scoring distribution — scoring-derived logic is used only in specific fallback paths where direct inputs are unavailable
Directional Fixture Check
These examples validate that the model behaves sensibly. They are not proof of perfect calibration and should not be interpreted as reference benchmarks for all rounds.
| Scenario | HCP | Score | Expected Direction | Actual SG Total |
|---|---|---|---|---|
| Scratch good round | 2 | 73 | Positive (better than peers) | +1.30 |
| 10-HCP average | 10 | 84 | ~0 (near peer average) | -0.50 |
| 15-HCP bad round | 15 | 98 | Negative (worse than peers) | -8.74 |
| 20-HCP typical | 22 | 97 | ~0 (near peer average) | +0.31 |
| 30+ HCP round | 35 | 115 | Negative (worse than peers) | -2.73 |
Changelog
- v1.1.0 (2026-03-14) — Added puttsPerGIR and puttsPerNonGIR to all anchors and brackets for GIR-adjusted putting delta in V3 pipeline.
- v1.0.0+extrap (2026-03-09) — Plus-handicap benchmark extrapolation: project 0-5 gradient below scratch with per-metric safeguards (FIR% frozen, GIR% cap 80%, putts floor 27, penalties floor 0.05, score floor 60). Replaces scratch clamping.
- v1.0.0 (2026-03-06) — All SG-consumed metrics source-locked from Shot Scope. Anchor-based interpolation replaces bracket snapping. puttsPerRound and penaltiesPerRound now sourced.
- v0.2.0 (2026-03-01) — Shot Scope data for core metrics (brackets 0-5 through 25-30). Average score derived from par-hole averages. FIR% confirmed non-monotonic across handicaps.
- v0.1.0 (2026-02-28) — Initial seed data. Provisional.