Pillar guide

Expected Goals (xG) Explained

How shot-quality modelling replaced scoreline analysis in modern football

Expected Goals (xG) is the most important single statistic in football analytics - and also the most misunderstood. In its simplest form, xG answers the question: given the exact situation of this shot (distance, angle, body part, whether it came from a cross, whether a defender was closing in), how often does an average player score? Add up the xG for every shot a team takes in a match and you get their expected goal total - a much more honest estimate of how well they played than the actual goal total, which is subject to woodwork, lucky deflections and goalkeeping heroics.

Where xG comes from

The xG concept emerged in the early 2010s, popularised by analysts like Sam Green at Opta and Michael Caley at Cartilage Free Captain. The basic insight is that shots are not equally valuable: a tap-in from two yards has a ~90% scoring probability; a speculative 30-yard effort has a ~3% scoring probability. Treating every shot as a single 'shots on target' unit washes out the real signal.

A modern xG model is a logistic regression (or gradient-boosted tree) trained on millions of historical shots. For each shot it ingests distance to goal, angle to goal, body part used, pass type that preceded it, number of defenders between shooter and goal, and sometimes game state (winning / losing, minute of the match). The output is a probability between 0 and 1 - that's the xG value of the shot.
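To make the logistic shape concrete, here is a toy sketch with only three features. The coefficients are invented for illustration, not fitted to real shot data; a production model learns them from millions of historical shots and uses many more features.

```python
import math

def xg_estimate(distance_m: float, angle_deg: float, header: bool) -> float:
    """Toy logistic xG model. Coefficients are illustrative only --
    a real model fits them on millions of historical shots."""
    # Linear score: closer shots and wider shooting angles raise the
    # probability; headers are penalised relative to footed shots.
    z = 0.5 - 0.13 * distance_m + 0.025 * angle_deg - 0.8 * (1 if header else 0)
    return 1.0 / (1.0 + math.exp(-z))  # logistic link -> probability in (0, 1)

tap_in = xg_estimate(distance_m=2, angle_deg=80, header=False)    # high xG
long_shot = xg_estimate(distance_m=30, angle_deg=15, header=False)  # low xG
```

Even this crude version reproduces the spread described above: the tap-in comes out around 0.9 and the 30-yard effort under 0.1.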

A team with 1.8 xG in a match is said to have generated chances worth 1.8 goals on average. If they won the match 3-0 despite only 1.2 xG, they over-performed their process; if they lost 0-1 with 2.4 xG, they under-performed. Over long samples, actual goals converge toward xG because the deviations are mostly random (keeper saves, post hits, VAR calls).

Why pundits hate it (and why they're wrong)

The most common complaint against xG is that it 'ignores context' - it treats every shot the same regardless of the pressure, the goalkeeper, the scoreline. The modern generation of models actually accounts for most of those factors, but the broader critique misses the point: xG is not trying to describe what happened; it's trying to describe what would happen on average if the same chances were replayed repeatedly.

The other critique is that xG undervalues clinical finishing. Jamie Vardy is the standard counter-example - a player who scores more than his chances suggest. But even Vardy regresses: over his best five seasons, his actual goals are 15% above his xG, not 50%. On a single-match basis the gap can look dramatic, but it shrinks fast over longer samples, which is exactly what you want from a stable modelling input.

The real danger of xG is over-application. Using xG to predict next weekend's scoreline is fine; using it to predict which player will be top scorer next month requires additional shot-volume projections, penalty assignments, and rotation risk - things that the raw xG number doesn't encode. BetsPlug uses xG only for what it's designed for: estimating the Poisson lambda for each team in an upcoming fixture.

From xG to match prediction

Inside our ensemble, we don't use xG as a standalone signal. We feed each team's rolling xG numbers (attacking output, defensive concessions, home/away split) into a Poisson goal model that produces a probability distribution over every possible scoreline. From there you can derive the 1X2 probabilities, Over/Under totals, BTTS probabilities and Asian handicap lines - all from the same xG-driven Poisson surface.
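A minimal sketch of that derivation, assuming independent Poisson goal counts for each side and hypothetical xG-driven lambdas of 1.8 (home) and 1.2 (away). The real pipeline is more involved, but every derived market falls out of the same scoreline matrix:

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    return lam ** k * math.exp(-lam) / math.factorial(k)

def scoreline_matrix(lam_home: float, lam_away: float, max_goals: int = 10):
    """Joint probability of every scoreline up to max_goals per side,
    assuming independent Poisson-distributed goal counts."""
    return [[poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
             for a in range(max_goals + 1)]
            for h in range(max_goals + 1)]

m = scoreline_matrix(1.8, 1.2)  # hypothetical rolling-xG lambdas

# All the derived markets come from the same surface:
home_win = sum(m[h][a] for h in range(11) for a in range(11) if h > a)
draw     = sum(m[h][h] for h in range(11))
away_win = sum(m[h][a] for h in range(11) for a in range(11) if h < a)
over_25  = sum(m[h][a] for h in range(11) for a in range(11) if h + a > 2.5)
btts     = sum(m[h][a] for h in range(1, 11) for a in range(1, 11))
```

Truncating at 10 goals per side loses a vanishingly small amount of probability mass at these lambdas, so the 1X2 probabilities sum to one for practical purposes.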

The tricky part is deciding how many matches of xG history to weight. Too few and you overreact to small samples (a 6-shot burst from Bruno Fernandes against ten men doesn't mean United's attack is suddenly elite). Too many and you miss real form shifts (Arsenal's attacking output changed meaningfully after Ødegaard returned from injury). Our pipeline uses a rolling window that blends the last 8 matches with a long-run season-level prior, weighted by the confidence interval around the current estimate.
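A simplified version of that blend can be written as shrinkage toward the season prior, with the prior's weight expressed as pseudo-matches. The actual pipeline weights by the confidence interval around the current estimate; this fixed-weight form is an illustrative stand-in:

```python
def blended_xg(recent: list[float], season_avg: float,
               prior_weight: float = 6.0) -> float:
    """Shrink a rolling-window xG mean toward a season-level prior.
    prior_weight acts as a number of pseudo-matches: the higher it is,
    the more the short-run window is pulled toward the season average."""
    n = len(recent)
    recent_mean = sum(recent) / n
    return (n * recent_mean + prior_weight * season_avg) / (n + prior_weight)

# Last 8 matches average 1.8 xG, season-long average is 1.4:
estimate = blended_xg([2.1, 0.4, 1.9, 2.3, 1.8, 2.0, 1.7, 2.2], season_avg=1.4)
```

The blended estimate lands between the hot recent window and the cooler season prior, which is the behaviour you want: reactive to form, but not jerked around by one lopsided match.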

The xG pipeline is also where we catch data problems fastest. Every week, we cross-check the xG totals from our primary data vendor against a secondary source. Matches with divergences above 0.4 xG get flagged for manual review before any downstream model consumes them. This sounds boring but it's the kind of plumbing that separates a hobbyist model from a production system.
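The cross-check itself reduces to a per-match comparison. A sketch, with the two vendor feeds represented as hypothetical plain dicts keyed by match id:

```python
def flag_divergent(primary: dict[str, float], secondary: dict[str, float],
                   threshold: float = 0.4) -> list[str]:
    """Return match ids whose total xG differs by more than `threshold`
    between two vendor feeds, so they can be held for manual review."""
    flagged = []
    for match_id, xg_primary in primary.items():
        xg_secondary = secondary.get(match_id)
        # Only compare matches present in both feeds.
        if xg_secondary is not None and abs(xg_primary - xg_secondary) > threshold:
            flagged.append(match_id)
    return flagged

held = flag_divergent({"m1": 1.8, "m2": 2.1}, {"m1": 1.7, "m2": 2.7})
```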

Common xG mistakes

Mistake one: treating xG as a guaranteed result. 'Arsenal had 2.5 xG so they should have won' is a misreading - they had a performance consistent with 2.5 average goals, but the actual distribution is wide. A team with 2.5 xG still scores zero ~8% of the time.
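That ~8% figure falls straight out of the Poisson distribution: with a goal rate of 2.5, the probability of scoring exactly zero is e^-2.5.

```python
import math

# Poisson P(0 goals) with lambda = 2.5: lambda^0 * e^-lambda / 0! = e^-lambda
p_zero = math.exp(-2.5)  # roughly 0.082, i.e. ~8% of replays end goalless
```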

Mistake two: comparing xG across very different data providers. StatsBomb xG, Opta xG and Understat xG all use different training sets, different feature engineering and different shot metadata, so a 1.8 xG from one provider doesn't equal a 1.8 xG from another. Always compare like with like.

Mistake three: mistaking xG for a skill rating. A player with 0.5 xG per 90 minutes is not a better finisher than a player with 0.4 xG per 90 - they just get into better positions. Finishing skill shows up in the gap between expected and actual goals, which is noisy and only stabilises over thousands of shots.
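A quick simulation makes that noise concrete. Give a perfectly average finisher Bernoulli draws at each shot's xG, and the goals-minus-xG gap still wanders by several goals over a season's worth of shots; only over thousands of shots does the per-shot gap settle down. (The 0.11 xG-per-shot figure is an illustrative assumption.)

```python
import random

random.seed(7)

def goals_minus_xg(n_shots: int, xg_per_shot: float = 0.11) -> float:
    """Simulate a finisher of exactly average skill: each shot scores
    with probability equal to its xG, so any gap between actual and
    expected goals is pure luck."""
    goals = sum(1 for _ in range(n_shots) if random.random() < xg_per_shot)
    return goals - n_shots * xg_per_shot

season = goals_minus_xg(100)   # one season of shots: gap can span several goals
career = goals_minus_xg(5000)  # thousands of shots: gap per shot shrinks
```

The season-sized gap divided by 100 shots is far noisier than the career-sized gap divided by 5,000, which is why single-season 'clinical finisher' labels are mostly luck.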


Expected Goals (xG) Explained - FAQ

Common questions on this topic, answered without the marketing fluff.

How is xG different from 'shots on target'?
Shots on target counts each attempt equally - a tap-in and a hopeful 30-yard effort both register as 1. xG weights each shot by its probability of becoming a goal, so a tap-in might add 0.9 xG while the long shot adds 0.03 xG.
Can xG predict which team wins?
Not on its own, but feed two teams' rolling xG numbers into a Poisson goal model and you get a probability distribution over every scoreline - including 1X2 probabilities. That's exactly what BetsPlug's Poisson head does.
Which xG provider does BetsPlug use?
We source from the football-data.org feed, cross-checked against OpenLigaDB for leagues where both are available. Matches with large divergences get flagged for manual review.
Why do some matches have 'wrong' xG results?
Short-run variance. A team can have 2.5 xG and lose 0-1 because finishing is probabilistic. Over a full season (~38 matches) actual goals track xG closely, but any individual match can diverge substantially.
Does xG account for penalties?
Most providers do - a standard penalty is assigned 0.76 xG (the historical conversion rate). BetsPlug splits penalty xG from open-play xG internally so post-match xG totals reflect the real shot-quality picture.